Re: [linux-pm] [PATCH v2] Add suspend/resume for HPET

2007-04-02 Thread Thomas Gleixner
On Mon, 2007-04-02 at 16:04 -0400, Alan Stern wrote:
> > It's not that simple though, especially with HPET.  The BIOS may expect
> > the PIT to work, but Linux currently (and problematically!) uses HPET in
> > "legacy replacement mode".  And ISTR the problems are coming up when the
> > system is already in a low-functionality state:  IRQs off everywhere,
> > even timer ticks have stopped.
> 
> I know nothing about the workings of the HPET and other clock code.  My 
> point was this: Suspend passes through various intermediate stages in 
> which some devices are available and others aren't.  So long as those 
> stages are exact duplicates (in reverse order) of the stages that occurred 
> during startup, it should be possible to make them all work.

Unfortunately it is not a fully linear problem. Devices are initialized
late and put the system into a more complex state (i.e. dynticks,
highres) which needs to be suspended and resumed. If we want to do this
completely linearly, we need to do a full reverse rollback of the system
states, which moves even more complexity into such systems.

Also, the linear approach does not work with other devices, as one can
see with the still unresolved "IRQ#X nobody cared" issues at resume,
which break my laptop. It works nicely at system startup, but
breaks on resume.

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Could the k8temp driver be interfering with ACPI?

2007-04-02 Thread Jean Delvare
Hi Dave,

On Mon, 2 Apr 2007 15:22:09 -0400, Dave Jones wrote:
> On Mon, Apr 02, 2007 at 05:48:59PM +0200, Jean Delvare wrote:
>  > +  u8  val;
>  > +#ifdef CONFIG_ACPI
>  > +  acpi_ut_acquire_mutex(ACPI_MTX_INTERPRETER);
>  > +#endif
>  >outb(reg, data->addr + ADDR_REG_OFFSET);
>  > -  return inb(data->addr + DATA_REG_OFFSET);
>  > +  val = inb(data->addr + DATA_REG_OFFSET);
>  > +#ifdef CONFIG_ACPI
>  > +  acpi_ut_release_mutex(ACPI_MTX_INTERPRETER);
>  > +#endif
>  > +  return val;
>  > ... deletia, more of the same.
> 
> it'd probably end up a lot cleaner to #define them to empty macros
> in the !ACPI case in acpi/acpi.h and just #include it unconditionally.

Sure, the implementation details can be refined later. I'm only trying
to see what can be done for now.
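
The empty-macro idea Dave describes can be sketched in plain C. This is only an illustration under stated assumptions: sensor_acpi_lock() and sensor_acpi_unlock() are hypothetical wrapper names, the lock is a simple flag standing in for the ACPI interpreter mutex, and fake_regs replaces the real outb()/inb() port access so that the example builds in userspace.

```c
#include <assert.h>

/*
 * Hypothetical wrappers, not part of any kernel header.  With CONFIG_ACPI
 * defined they take a (mock) lock; without it they expand to nothing, so
 * callers need no #ifdef of their own.
 */
#ifdef CONFIG_ACPI
static int mock_interpreter_lock;	/* stand-in for the ACPI interpreter mutex */
#define sensor_acpi_lock() \
	do { assert(!mock_interpreter_lock); mock_interpreter_lock = 1; } while (0)
#define sensor_acpi_unlock() \
	do { mock_interpreter_lock = 0; } while (0)
#else
#define sensor_acpi_lock()	do { } while (0)
#define sensor_acpi_unlock()	do { } while (0)
#endif

static unsigned char fake_regs[256];	/* stand-in for the chip's registers */

/* Read one register; the locking calls vanish when CONFIG_ACPI is unset. */
static unsigned char read_value(unsigned char reg)
{
	unsigned char val;

	sensor_acpi_lock();
	val = fake_regs[reg];
	sensor_acpi_unlock();
	return val;
}
```

The caller looks the same in both configurations, which is the cleanliness argument made above.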

-- 
Jean Delvare


Re: [PATCH] vt: Expose system-wide UTF-8 default setting via sysfs

2007-04-02 Thread Antonino A. Daplas
On Tue, 2007-04-03 at 10:06 +0600, Alexander E. Patrakov wrote:
> Antonino A. Daplas wrote:
> > Create a variable, default_utf8, that defines the system-wide default UTF-8
> > setting.  This variable can be altered via sysfs. If the variable is
> > properly set, this should minimize breakage of UTF-8 encoded consoles when
> > doing a reset or echo -e '\033c' and of newly opened/allocated consoles.
> > 
> > This is based on patches by Jan Engelhardt and Paul LeoNerd Evans.
> > 
> > Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
> > ---
> >> I think you're missing the whole point of console reset.  Its purpose is 
> >> to force the console into a known-good state.  The fewer pieces of state 
> >> it leaves unset, the better.  To some degree it's less important what 
> >> that state actually is.
> > 
> > Okay, you convinced me. Hopefully this is acceptable to all parties.
> > 
> > Andrew,
> > 
> > If everybody agrees, can you drop the previous patch I sent to you, and use
> > this instead?
> > 
> > Tony
> > +static int default_utf8;
> > +module_param(default_utf8, int, S_IRUGO | S_IWUSR);
> 
> Module parameter without description and documentation? Yes, I understand 
> that it is impossible to make vt a module. How about adding a line to 
> Documentation/kernel-parameters.txt?

I'll do that (and I'll also include Jan's palette patch) once I'm sure
there's no violent objection against the change.

Tony 




Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-02 Thread Tejun Heo
[resending.  my mail service was down for more than a week and this
message didn't get delivered.]

[EMAIL PROTECTED] wrote:
> > Anyway, what's annoying is that I can't figure out how to bring the
> > drive back on line without resetting the box.  It's in a hot-swap
> > enclosure, but power cycling the drive doesn't seem to help.  I thought
> > libata hotplug was working?  (SiI3132 card, using the sil24 driver.)

Yeah, it's working, but failing resets are considered highly dangerous
(in that the controller status is unknown and may cause something
dangerous like screaming interrupts), so the port is muted after that.  The
plan is to handle this with polling hotplug such that libata tries to
revive the port if a PHY status change is detected by polling.  Patches
are available but they need other things to be resolved to get integrated.
I think it'll happen before the summer.

Anyways, you can tell libata to retry the port by manually telling it to
rescan the port (echo - - - > /sys/class/scsi_host/hostX/scan).

> > (H'm... after rebooting, reallocated sectors jumped from 26 to 39.
> > Something is up with that drive.)

Yeap, seems like a broken drive to me.

Thanks.

-- 
tejun


Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 21:57 -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > Does that mean that to function correctly every user needs some internal
> > cursor so it doesn't end up scanning the first N entries over and over?
> > 
> 
> If it wants to be well-behaved, and to behave as the VM expects, yes. 
> 
> There's an expectation that the callback will be performing some scan-based
> aging operation and of course to do LRU (or whatever) aging, the callback
> will need to remember where it was up to last time it was called.
> 
> But it's just a guideline - callbacks could do something different but
> in-the-spirit, I guess.

Hmm, actually the callers I looked at (nfs, dcache, mbcache) seem to use
an LRU list and just walk the first "nr_to_scan" entries, and nr_to_scan
is always 128.

Someone who keeps a cursor will be disadvantaged: the other shrinkers
could well get less effective on repeated calls, but we won't.  Someone
who picks entries at random might have the same issue.

I think it is clearest to describe how we expect everyone to work, and
let whoever is getting creative worry about it themselves.
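
A minimal userspace sketch of the behaviour described above, where a callback walks only the first nr_to_scan entries of an LRU list. The names lru_cache, lru_remove and lru_shrink are hypothetical, the fixed-size array stands in for a real linked list, and the "referenced" second-chance rule is an assumption added to show why the number scanned can exceed the number freed.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_MAX 16

struct entry {
	int id;
	int referenced;		/* recently used: gets a second chance */
};

struct lru_cache {
	struct entry slots[CACHE_MAX];	/* slots[0] is the oldest entry */
	size_t count;
};

/* Remove and return slots[idx], shifting younger entries down by one. */
static struct entry lru_remove(struct lru_cache *c, size_t idx)
{
	struct entry e = c->slots[idx];

	assert(idx < c->count);
	for (size_t i = idx + 1; i < c->count; i++)
		c->slots[i - 1] = c->slots[i];
	c->count--;
	return e;
}

/*
 * Scan up to nr_to_scan entries from the old end.  Unreferenced entries
 * are dropped (freed); referenced ones are aged and moved to the young
 * end.  Returns the number of entries left in the cache.
 */
static size_t lru_shrink(struct lru_cache *c, size_t nr_to_scan)
{
	size_t todo = nr_to_scan < c->count ? nr_to_scan : c->count;

	while (todo--) {
		struct entry e = lru_remove(c, 0);

		if (e.referenced) {
			e.referenced = 0;
			c->slots[c->count++] = e;
		}
	}
	return c->count;
}
```

Because scanned entries either leave the list or rotate to the young end, the next call starts with fresh entries at the front, so this style needs no separate cursor.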

How's this:
==
Cleanup and kernelify shrinker registration.

I can never remember what the function to register to receive VM pressure
is called.  I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

New version:
1) Don't hide struct shrinker.  It contains no magic.
2) Don't allocate "struct shrinker".  It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.
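
The interface in points 1) to 4) can be mocked up in userspace roughly as follows. This is a sketch, not the kernel code: the one-argument shrink signature is simplified, the singly linked list and shrink_all() are hypothetical stand-ins for the VM's own bookkeeping, the value 2 for DEFAULT_SEEKS is an assumption, and demo_shrink is an invented callback.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_SEEKS 2		/* assumed value, for illustration only */

/* The caller owns this structure; nothing is allocated on registration. */
struct shrinker {
	int (*shrink)(int nr_to_scan);	/* simplified signature */
	int seeks;
	struct shrinker *next;		/* stand-in for the kernel's list linkage */
};

static struct shrinker *shrinker_list;

static void register_shrinker(struct shrinker *s)
{
	s->next = shrinker_list;
	shrinker_list = s;
}

static void unregister_shrinker(struct shrinker *s)
{
	struct shrinker **pp;

	for (pp = &shrinker_list; *pp; pp = &(*pp)->next) {
		if (*pp == s) {
			*pp = s->next;
			s->next = NULL;
			return;
		}
	}
}

/* Hypothetical driver: ask every registered shrinker to scan nr objects. */
static int shrink_all(int nr_to_scan)
{
	int remaining = 0;

	for (struct shrinker *s = shrinker_list; s; s = s->next)
		remaining += s->shrink(nr_to_scan);
	return remaining;
}

/* Invented example cache: 300 objects, every scanned object is freed. */
static int demo_objects = 300;

static int demo_shrink(int nr_to_scan)
{
	if (nr_to_scan > demo_objects)
		nr_to_scan = demo_objects;
	demo_objects -= nr_to_scan;
	return demo_objects;
}

static struct shrinker demo_shrinker = {
	.shrink = demo_shrink,
	.seeks = DEFAULT_SEEKS,
};
```

A static structure plus register_shrinker(&demo_shrinker) mirrors the pattern the diff below applies to the dcache, dquot and inode caches.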

Comments:
1) The comment in reiserfs4 makes me a little queasy.
2) The wrapper code in xfs might no longer be needed.
3) The placing in the x86-64 "hot function list" seems a little
   unlikely.  Clearly, Andi was testing if anyone was paying attention.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 0b43dab739aa arch/x86_64/kernel/functionlist
--- a/arch/x86_64/kernel/functionlist   Tue Apr 03 15:37:49 2007 +1000
+++ b/arch/x86_64/kernel/functionlist   Tue Apr 03 15:37:53 2007 +1000
@@ -1118,7 +1118,6 @@
 *(.text.simple_strtoll)
 *(.text.set_termios)
 *(.text.set_task_comm)
-*(.text.set_shrinker)
 *(.text.set_normalized_timespec)
 *(.text.set_brk)
 *(.text.serial_in)
diff -r 0b43dab739aa fs/dcache.c
--- a/fs/dcache.c   Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/dcache.c   Tue Apr 03 15:37:53 2007 +1000
@@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr, 
}
return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dcache_shrinker = {
+   .shrink = shrink_dcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /**
  * d_alloc -   allocate a dcache entry
@@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned 
 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
 SLAB_MEM_SPREAD),
 NULL, NULL);
-   
-   set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
+
+   register_shrinker(&dcache_shrinker);
 
/* Hash may have been set up in dcache_init_early */
if (!hashdist)
diff -r 0b43dab739aa fs/dquot.c
--- a/fs/dquot.c   Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/dquot.c   Tue Apr 03 15:37:53 2007 +1000
@@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr,
}
return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dqcache_shrinker = {
+   .shrink = shrink_dqcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /*
  * Put reference to dquot
@@ -1871,7 +1876,7 @@ static int __init dquot_init(void)
printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n",
nr_hash, order, (PAGE_SIZE << order));
 
-   set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory);
+   register_shrinker(&dqcache_shrinker);
 
return 0;
 }
diff -r 0b43dab739aa fs/inode.c
--- a/fs/inode.c   Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/inode.c   Tue Apr 03 15:37:53 2007 +1000
@@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr, 
return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
 
+static struct shrinker icache_shrinker = {
+   .shrink = shrink_icache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
+
 static void __wait_on_freeing_inode(struct inode *inode);
 /*
  * Called with the inode lock held.
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem
 SLAB_MEM_SPREAD),
 init_once,
 NULL);
-   set_shrinker(DEFAULT_SEEKS, shrink_icache_memory);
+   register_shrinker(&icache_shrinker);
 
/* Hash may have been set up in 

Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-02 Thread Christian Kujau

On Tue, 3 Apr 2007, Len Brown wrote:
> Which increased stability, disabling ACPI, or disabling the IOAPIC?

To be honest, we're not sure. See below.

> Your box has MPS, so you should be able to use the IOAPIC in either mode.

MPS - Multiprocessor Specification? SMP? Yes, it'd be good to use the
IOAPIC again.

> Note that you can do these both independently at boot-time with "acpi=off"
> and "noapic", respectively.
> eg. 4 combos
> 1. 
> 2. noapic
> 3. acpi=off
> 4. acpi=off noapic
> you started with #1, and are running hard-coded #4 now, but skipped #2
> and #3

Indeed, we skipped quite a few options. As mentioned before, the boxes
are in production already so we don't have much time to play around and
we were just happy when they survived a few hours :(

But yes, we'll try booting with "acpi=off" and enabled IOAPIC again.

@Malte: when will we be able to do so?

Len et al., do you even recommend using ACPI on a server system at all? I
myself always thought of ACPI as evil and something to avoid when possible
(thus switching it off completely on a server system).


Since these NETDEV WATCHDOG issues seem to be a "known issue" (kinda,
judging by the many postings on the lists in the past), is there something
else we should look into? Would more debug .config options help to find
out why they lock up?


Thanks for your comments,
Christian.
--
BOFH excuse #340:

We'll fix that in the next (upgrade, update, patch release, service pack).


2.6.21-rc5-mm4

2007-04-02 Thread Andrew Morton

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm4/

- The oops in git-net.patch has been fixed, so that tree has been restored. 
  It is huge.

- Added the device-mapper development tree to the -mm lineup (Alasdair
  Kergon).  It is a quilt tree, living at
  ftp://ftp.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/.

- Added davidel's signalfd stuff.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

echo "subscribe mm-commits" | mail [EMAIL PROTECTED]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.
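
  The "around ten rebuilds" figure is what halving gives for a series of
  roughly a thousand patches, since 2^10 = 1024.  The sketch below shows the
  search itself; it is not the procedure from the linked text.  The series
  length, the is_bad callback and the demo_is_bad test function are all
  hypothetical, and each call to is_bad stands for one rebuild and reboot.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the tree with the first `applied` patches misbehaves. */
typedef int (*is_bad_fn)(size_t applied, void *ctx);

/*
 * Find the 0-based index of the first patch that introduces the bug.
 * Assumes the tree with no patches applied is good and the tree with all
 * n patches applied is bad.  *steps counts how many trees were tested.
 */
static size_t bisect_first_bad(size_t n, is_bad_fn is_bad, void *ctx,
			       unsigned int *steps)
{
	size_t good = 0;	/* applying this many patches is known good */
	size_t bad = n;		/* applying this many patches is known bad */

	assert(n > 0);
	*steps = 0;
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		(*steps)++;
		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return bad - 1;
}

/* Invented test oracle: the bug appears once patch *ctx is applied. */
static int demo_is_bad(size_t applied, void *ctx)
{
	size_t culprit = *(size_t *)ctx;

	return applied > culprit;
}
```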

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.




Changes since 2.6.21-rc5-mm3:


 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-arm.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-drm.patch
 git-dvb.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ia64.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-input.patch
 git-kbuild.patch
 git-kvm.patch
 git-leds.patch
 git-libata-all.patch
 git-md-accel.patch
 git-md-accel-fix.patch
 git-mips.patch
 git-mmc.patch
 git-mtd.patch
 git-ubi.patch
 git-netdev-all.patch
 git-e1000.patch
 git-net.patch
 git-ioat.patch
 git-ocfs2.patch
 git-parisc.patch
 git-r8169.patch
 git-selinux.patch
 git-pciseg.patch
 git-s390.patch
 git-scsi-misc.patch
 git-block.patch
 git-unionfs.patch
 git-watchdog.patch
 git-wireless.patch
 git-ipwireless_cs.patch
 git-cryptodev.patch
 git-gccbug.patch

 git trees.

-proc-fix-linkage-with-config_sysctl=y-config_proc_sysctl=n.patch
-uml-fix-unreasonably-long-udelay.patch
-fix-firmware-sample-code.patch
-jdelvare-i2c-i2c-algo-bit-document-udelay.patch
-pcmcia-allow-pcmcia-scsi-drivers-to-be-built-into-the.patch
-gregkh-pci-pci-set-pci-bfsort-for-poweredge-r900.patch
-fix-gregkh-pci-pci-piggy-bus.patch
-drivers-scsi-dpt_i2oc-remove-dead-code.patch
-scsi-whitespace-cleanup-in-the-dpt-driver.patch
-drivers-scsi-aic7xxx-make-functions-static.patch
-remove-some-unused-scsi-related-kernel-config-variables.patch
-drivers-scsi-aacraid-cleanups.patch
-make-mptspi_target_destroy-static.patch
-qla2xxx-remove-duplicate-pci_disable_device-call.patch
-gregkh-usb-usb-gtcoc-fix-a-use-before-check.patch
-gregkh-usb-usb-ati_remote2-add-channel-support.patch
-usb-serial-whiteheat-convert-to-generic-boolean.patch
-x86_64-mm-dont-probe-for-ddc-on-vbe1_2.patch
-x86_64-mm-remove-hardcoding-of-hard_smp_processor_id-on-up-systems.patch
-drivers-mfd-sm501c-fix-an-off-by-one.patch

 Merged into mainline or a subsystem tree.

+md-avoid-a-deadlock-when-removing-a-device-from-an-md-array-via-sysfs.patch
+md-avoid-a-deadlock-when-removing-a-device-from-an-md-array-via-sysfs-fix.patch
+revert-driver-core-do-not-wait-unnecessarily-in-driver_unregister.patch

 2.6.21 queue.

+vmi-paravirt-ops-bugfix-for-2621.patch

 Might be 2.6.21 queue.

+drivers-acpi-kconfig-formulation-fixpatch.patch

 ACPI fixlet

+pata_platform-for-arm-riscpc.patch

 ARM/pata fix

+cifs-use-mutexdiff.patch

 CIFS cleanup

+agk-dm-dm-merge-max_hw_sector.patch
+agk-dm-dm-raid1-one-kmirrord-per-mirror.patch
+agk-dm-dm-crypt-disable-barriers.patch
+agk-dm-dm-crypt-add-null-iv.patch
+agk-dm-dm-mpath-log-device-name.patch
+agk-dm-dm-allow-offline-devices.patch
+agk-dm-dm-log-fault-detection.patch
+agk-dm-dm-log-report-fault-status.patch
+agk-dm-dm-raid1-add-handle_errors-feature-flag.patch
+agk-dm-dm-io-delay-dec_count.patch
+agk-dm-dm-io-prepare-for-new-interface.patch
+agk-dm-dm-io-new-interface.patch
+agk-dm-dm-kcopyd-update-dm-io-interface.patch
+agk-dm-dm-exception-store-update-dm-io-interface.patch
+agk-dm-dm-log-update-dm-io-interface.patch
+agk-dm-dm-raid1-update-dm-io-interface.patch
+agk-dm-dm-io-remove-old-interface.patch

 Device mapper development tree.


Re: [KJ][PATCH]ROUND_UP macro cleanup in drivers/net/ixgb

2007-04-02 Thread Kok, Auke

Milind Arun Choudhary wrote:
> IXGB_ROUNDUP macro cleanup, use ALIGN

cool beans!

Same reply as to the ALIGN patch you sent for e1000 -> We'll take it for a spin
and I'll push your patch upstream as part of the regular updates!


Thanks,

Auke




Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---

 ixgb.h |3 ---
 ixgb_ethtool.c |4 ++--
 ixgb_main.c|4 ++--
 ixgb_param.c   |4 ++--
 4 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h
index cf30a10..c8e9086 100644
--- a/drivers/net/ixgb/ixgb.h
+++ b/drivers/net/ixgb/ixgb.h
@@ -111,9 +111,6 @@ struct ixgb_adapter;
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IXGB_RX_BUFFER_WRITE   8   /* Must be power of 2 */
 
-/* only works for sizes that are powers of 2 */
-#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1)))
-
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer */
 struct ixgb_buffer {
diff --git a/drivers/net/ixgb/ixgb_ethtool.c b/drivers/net/ixgb/ixgb_ethtool.c
index d6628bd..cdefaff 100644
--- a/drivers/net/ixgb/ixgb_ethtool.c
+++ b/drivers/net/ixgb/ixgb_ethtool.c
@@ -577,11 +577,11 @@ ixgb_set_ringparam(struct net_device *netdev,
 
 	rxdr->count = max(ring->rx_pending,(uint32_t)MIN_RXD);
 	rxdr->count = min(rxdr->count,(uint32_t)MAX_RXD);
-	IXGB_ROUNDUP(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); 
+	rxdr->count = ALIGN(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); 
 
 	txdr->count = max(ring->tx_pending,(uint32_t)MIN_TXD);
 	txdr->count = min(txdr->count,(uint32_t)MAX_TXD);
-	IXGB_ROUNDUP(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); 
+	txdr->count = ALIGN(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); 
 
 	if(netif_running(adapter->netdev)) {
 		/* Try to get new resources before deleting old */
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
index afc2ec7..158c71e 100644
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -685,7 +685,7 @@ ixgb_setup_tx_resources(struct ixgb_adapter *adapter)
 	/* round up to nearest 4K */
 
 	txdr->size = txdr->count * sizeof(struct ixgb_tx_desc);
-	IXGB_ROUNDUP(txdr->size, 4096);
+	txdr->size = ALIGN(txdr->size, 4096);
 
 	txdr->desc = pci_alloc_consistent(pdev, txdr->size, &txdr->dma);
 	if(!txdr->desc) {
@@ -774,7 +774,7 @@ ixgb_setup_rx_resources(struct ixgb_adapter *adapter)
 	/* Round up to nearest 4K */
 
 	rxdr->size = rxdr->count * sizeof(struct ixgb_rx_desc);
-	IXGB_ROUNDUP(rxdr->size, 4096);
+	rxdr->size = ALIGN(rxdr->size, 4096);
 
 	rxdr->desc = pci_alloc_consistent(pdev, rxdr->size, &rxdr->dma);
 
diff --git a/drivers/net/ixgb/ixgb_param.c b/drivers/net/ixgb/ixgb_param.c
index b27442a..ee8cc67 100644
--- a/drivers/net/ixgb/ixgb_param.c
+++ b/drivers/net/ixgb/ixgb_param.c
@@ -284,7 +284,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
} else {
tx_ring->count = opt.def;
}
-   IXGB_ROUNDUP(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
+   tx_ring->count = ALIGN(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
}
{ /* Receive Descriptor Count */
struct ixgb_option opt = {
@@ -303,7 +303,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
} else {
rx_ring->count = opt.def;
}
-   IXGB_ROUNDUP(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
+   rx_ring->count = ALIGN(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
}
{ /* Receive Checksum Offload Enable */
struct ixgb_option opt = {
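
The difference between the two styles in the patch above can be checked in userspace. IXGB_ROUNDUP is copied from the removed lines; DEMO_ALIGN is a simplified stand-in for the kernel's ALIGN (assumed here to have the same add-then-mask form), and align_up() and is_power_of_2() are hypothetical helpers added so the power-of-two requirement is checked rather than only stated in a comment.

```c
#include <assert.h>

/* From the removed lines: assigns to its first argument as a side effect. */
#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1)))

/* Simplified stand-in for ALIGN: a pure expression, the caller assigns. */
#define DEMO_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

static int is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Round x up to the next multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(is_power_of_2(a));
	return DEMO_ALIGN(x, a);
}
```

Both forms compute the same value; the cleanup mainly makes the assignment visible at the call site instead of hiding it inside the macro.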






Re: [PATCH] Kprobes: Print details of kretprobe on assertion failure

2007-04-02 Thread Ananth N Mavinakayanahalli
On Mon, Apr 02, 2007 at 02:17:32PM -0700, Andrew Morton wrote:
> On Mon, 2 Apr 2007 14:56:36 +0530
> Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
> 
> > From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
> > 
> > In certain cases like when the real return address can't be found or
> > when the number of tracked calls to a kretprobed function is less than
> > the number of returns, we may not be able to find the correct return
> > address after processing a kretprobe. Currently we just do a BUG_ON, but
> > no information is provided about the actual failing kretprobe.
> > 
> > Print out details of the kretprobe before calling BUG().
> > 
> > Signed-off-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
> > 
> > ---
> >  arch/i386/kernel/kprobes.c|6 +-
> >  arch/ia64/kernel/kprobes.c|7 ++-
> >  arch/powerpc/kernel/kprobes.c |7 ++-
> >  arch/x86_64/kernel/kprobes.c  |7 ++-
> >  4 files changed, 23 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6.21-rc5/arch/i386/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/i386/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/i386/kernel/kprobes.c
> > @@ -440,7 +440,11 @@ fastcall void *__kprobes trampoline_hand
> > break;
> > }
> >  
> > -   BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +   if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +   printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +   ri->rp, ri->rp->kp.addr);
> > +   BUG();
> > +   }
> >  
> > spin_unlock_irqrestore(&kretprobe_lock, flags);
> >  
> > Index: linux-2.6.21-rc5/arch/ia64/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/ia64/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/ia64/kernel/kprobes.c
> > @@ -444,7 +444,12 @@ int __kprobes trampoline_probe_handler(s
> > break;
> > }
> >  
> > -   BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +   if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +   printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +   ri->rp, ri->rp->kp.addr);
> > +   BUG();
> > +   }
> > +
> > regs->cr_iip = orig_ret_address;
> >  
> > reset_current_kprobe();
> > Index: linux-2.6.21-rc5/arch/powerpc/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/powerpc/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/powerpc/kernel/kprobes.c
> > @@ -293,7 +293,12 @@ int __kprobes trampoline_probe_handler(s
> > break;
> > }
> >  
> > -   BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +   if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +   printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +   ri->rp, ri->rp->kp.addr);
> > +   BUG();
> > +   }
> > +
> > regs->nip = orig_ret_address;
> >  
> > reset_current_kprobe();
> > Index: linux-2.6.21-rc5/arch/x86_64/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/x86_64/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/x86_64/kernel/kprobes.c
> > @@ -438,7 +438,12 @@ int __kprobes trampoline_probe_handler(s
> > break;
> > }
> >  
> > -   BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +   if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +   printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +   ri->rp, ri->rp->kp.addr);
> > +   BUG();
> > +   }
> > +
> > regs->rip = orig_ret_address;
> >  
> 
> A lot of copying-and-pasting there.  Would it be better if this assertion
> was performed in a library function in kernel/kprobes.c?

Indeed. Here is the updated patch...
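
As a rough illustration of the refactoring Andrew asks for, the repeated check can live in one shared helper that each architecture calls. This is a userspace sketch only: demo_kretprobe and demo_kretprobe_check are hypothetical names, and where the kernel code would call BUG() the sketch returns 0 so that the behaviour can be tested.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the probe being processed. */
struct demo_kretprobe {
	void *addr;		/* address of the probed function */
};

/*
 * Shared check for all architectures.  Returns 1 if the recovered return
 * address is usable.  Otherwise prints the probe details and returns 0,
 * which is the point where kernel code would call BUG().
 */
static int demo_kretprobe_check(struct demo_kretprobe *rp,
				unsigned long orig_ret_address,
				unsigned long trampoline_address)
{
	if (!orig_ret_address || orig_ret_address == trampoline_address) {
		fprintf(stderr,
			"kretprobe BUG!: Processing kretprobe %p @ %p\n",
			(void *)rp, rp->addr);
		return 0;
	}
	return 1;
}
```

With a helper of this shape, each per-architecture handler shrinks to a single call, which fits the small per-file line counts in the diffstat below.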

From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

In certain cases like when the real return address can't be found or
when the number of tracked calls to a kretprobed function is less than
the number of returns, we may not be able to find the correct return
address after processing a kretprobe. Currently we just do a BUG_ON, but
no information is provided about the actual failing kretprobe.

Print out details of the kretprobe before calling BUG().

Signed-off-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

---
 arch/i386/kernel/kprobes.c|3 +--
 arch/ia64/kernel/kprobes.c|3 ++-
 arch/powerpc/kernel/kprobes.c |2 +-
 arch/x86_64/kernel/kprobes.c  |2 +-
 include/linux/kprobes.h   |   10 ++
 5 files changed, 15 insertions(+), 5 deletions(-)

Index: linux-2.6.21-rc5/arch/i386/kernel/kprobes.c

Re: [xfs-masters] Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread David Chinner
On Mon, Apr 02, 2007 at 09:57:02PM -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> > > On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > 
> > > > 
> > > > I can never remember what the function to register to receive VM pressure
> > > > is called.  I have to trace down from __alloc_pages() to find it.
> > > > 
> > > > It's called "set_shrinker()", and it needs Your Help.
> > > > 
> > > > New version:
> > > > 1) Don't hide struct shrinker.  It contains no magic.
> > > > 2) Don't allocate "struct shrinker".  It's not helpful.
> > > > 3) Call them "register_shrinker" and "unregister_shrinker".
> > > > 4) Call the function "shrink" not "shrinker".
> > > > 5) Rename "nr_to_scan" argument to "nr_to_free".
> > > 
> > > No, it is actually the number to scan.  This is >= the number of freed
> > > objects.
> > > 
> > > This is because, for better or for worse, the VM tries to balance the
> > > scanning rate of the various caches, not the reclaiming rate.
> > 
> > Err, ok, I completely missed that distinction.
> > 
> > Does that mean that to function correctly every user needs some internal
> > cursor so it doesn't end up scanning the first N entries over and over?
> > 
> 
> If it wants to be well-behaved, and to behave as the VM expects, yes. 
> 
> There's an expectation that the callback will be performing some scan-based
> aging operation and of course to do LRU (or whatever) aging, the callback
> will need to remember where it was up to last time it was called.
> 
> But it's just a guideline - callbacks could do something different but
> in-the-spirit, I guess.

In XFS, one of the shrinkers that gets registered causes all
the xfsbufd's in the system to run and write back delayed write
metadata - this can't be freed up until it is clean, and this is the
only hook we have that can be used to trigger writeback on memory
pressure. We need this because we can potentially have hundreds of
megabytes of dirty metadata per XFS filesystem.

IOW, the way the VM expects the shrinkers to work can be far, far
away from what subsystems need the shrinker callbacks for.
Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-02 Thread Christian Kujau

On Mon, 2 Apr 2007, Chuck Ebbert wrote:
> Where is the info from before you changed to "noapic"? Or were the
> machines always using XT-PIC for all the interrupts???

XT-PIC has only been used since we switched to noapic; before that there
was IO-APIC-fasteoi on both ethernet cards and interrupts were balanced
well.


Thanks,
Christian.
--
BOFH excuse #340:

We'll fix that in the next (upgrade, update, patch release, service pack).


Re: [PATCH] sched: staircase deadline misc fixes

2007-04-02 Thread Mike Galbraith
On Tue, 2007-04-03 at 12:37 +1000, Con Kolivas wrote:
> On Thursday 29 March 2007 15:50, Mike Galbraith wrote:
> > On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> > + * This contains a bitmap for each dynamic priority level with empty slots
> > + * for the valid priorities each different nice level can have. It allows
> > + * us to stagger the slots where differing priorities run in a way that
> > + * keeps latency differences between different nice levels at a minimum.
> > + * ie, where 0 means a slot for that priority, priority running from left to
> > + * right:
> > + * nice -20 
> > + * nice -10 1001000100100010001001000100010010001000
> > + * nice   0 0101010101010101010101010101010101010101
> > + * nice   5 1101011010110101101011010110101101011011
> > + * nice  10 0110111011011101110110111011101101110111
> > + * nice  15 0101101101011011
> > + * nice  19 1110
> 
> Try two instances of chew.c at _differing_ nice levels on one cpu on mainline,
> and then SD. This is why you can't renice X on mainline.

How about something more challenging instead :)

The numbers below are from my scheduler tree with massive_intr running
at nice 0, and chew at nice 5.  Below these numbers are 100 lines from
the exact center of chew's output.

(interactivity remains intact with this rather heavy load)

[EMAIL PROTECTED]: ./massive_intr 30 180
005671  1506
005657  1506
005651  1491
005647  1466
005661  1484
005660  1475
005645  1514
005668  1384
005673  1516
005656  1449
005664  1512
005659  1507
005667  1513
005663  1521
005670  1440
005649  1522
005652  1487
005648  1405
005665  1472
005669  1418
005662  1489
005674  1523
005650  1480
005655  1476
005672  1530
005653  1463
005654  1427
005646  1499
005658  1510
005666  1476

100 sequential lines from the middle of chew's logged output.

pid 5642, prio   5, out for    2 ms, ran for    1 ms, load  34%
pid 5642, prio   5, out for 1268 ms, ran for   63 ms, load   4%
pid 5642, prio   5, out for   52 ms, ran for    0 ms, load   0%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    4 ms, ran for    1 ms, load  22%
pid 5642, prio   5, out for 1395 ms, ran for   50 ms, load   3%
pid 5642, prio   5, out for   26 ms, ran for    0 ms, load   3%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  13%
pid 5642, prio   5, out for    7 ms, ran for    0 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  20%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  13%
pid 5642, prio   5, out for 1400 ms, ran for   53 ms, load   3%
pid 5642, prio   5, out for   22 ms, ran for    1 ms, load   6%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    2 ms, ran for    1 ms, load  49%
pid 5642, prio   5, out for 1281 ms, ran for   50 ms, load   3%
pid 5642, prio   5, out for   50 ms, ran for    0 ms, load   1%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  16%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  13%
pid 5642, prio   

Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"

2007-04-02 Thread Eric W. Biederman
Vivek Goyal <[EMAIL PROTECTED]> writes:

>> I guess at this point the easy case is that we modify /sbin/kexec to support
>> it.  And the other bootloaders can come be upgraded if the feature is
>> interesting enough.
>> 
>> > On i386, somebody already found an interesting usage of CONFIG_PHYSICAL_START
>> > where he was running his kernel above 16MB so that he can maximize on
>> > DMA ZONE. Can't think of any usage for x86_64 at the moment but I think
>> > down the line people might come up with such usages.
>> 
>> Agreed.  We do have CONFIG_PHYSICAL_ALIGN that can handle that case,
>> although I admit that is a bit of a hack.
>> 
>
> Yes, but x86_64 will not have any of those options and the only way to run 
> the kernel will be to either use kexec or modify your boot-loader so that
> it can handle relocatable images.

True.

>> > To me, retaining CONFIG_PHYSICAL_START gives added flexibility to the user,
>> > at the expense of reduced simplicity. We should definitely change the type
>> > of vmlinux to ET_DYN but at the same time it might still be worth to retain
>> > CONFIG_PHYSICAL_START option.
>> 
>> I think something like CONFIG_PHYSICAL_START currently gives us very
>> little gain, and is hard to use correctly, and there are alternative
>> solutions.  So if we can get rid of it, by only inconveniencing users
>> who want to load their kernels at a weird address it is worth it.
>> 
>> >> I think I can switch the vmlinux header type in about 100 lines or so
>> >> of code.  Assuming I can ever get 30 minutes with the appropriate
>> >> kernel.
>> >> 
>> >
>> > That would be awesome. Then vmlinux will be relocatable too. (Officially).
>> 
>> Yes.  For x86_64 I can do this.  i386 is more difficult.  (Although with
>> a little cleverness we can move the code that processes relocations into
>> vmlinux).  
>> 
>
> Performing relocations in vmlinux will be interesting. That way i386 vmlinux
> too will become relocatable and only piece of puzzle to solve will be to
> make vmlinux of type ET_DYN.

Actually making vmlinux have type ET_DYN is the easier piece.  Basically
the quick way to do this is to have an arch specific: "cmd_vmlinux__"
like uml does so we can edit things after the make.

Changing an integer in an ELF header is simple.

Inserting the code to perform the relocations feels a bit trickier but
we can probably just dump it in head.S like we do on x86_64.  We still need
to insert the actual relocations to process though.  Which requires all of the
post processing we currently do just called at a slightly different location.
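
To illustrate the "changing an integer in an ELF header" step, here is a
minimal user-space sketch. It is not the kernel build hook: the helper name
elf64_mark_dyn is hypothetical, and it assumes a 64-bit ELF image already
loaded into memory with the same byte order as the host.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Flip e_type of an in-memory 64-bit ELF image from ET_EXEC to ET_DYN.
 * Returns 0 on success, -1 if the buffer does not look like a 64-bit
 * ELF executable.  Hypothetical helper for illustration only.
 */
static int elf64_mark_dyn(unsigned char *image, size_t len)
{
	Elf64_Ehdr ehdr;

	assert(image != NULL);
	if (len < sizeof(ehdr))
		return -1;

	/* Copy the header out so we never do an unaligned access. */
	memcpy(&ehdr, image, sizeof(ehdr));

	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
	    ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
	    ehdr.e_type != ET_EXEC)
		return -1;

	ehdr.e_type = ET_DYN;
	memcpy(image, &ehdr, sizeof(ehdr));
	return 0;
}
```

Only the type field changes here; the relocation entries that make the image
actually movable still have to be generated and processed separately, as noted
above.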

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL

2007-04-02 Thread Mike Galbraith
On Tue, 2007-04-03 at 12:34 +1000, Con Kolivas wrote:
> On Saturday 31 March 2007 19:28, Xenofon Antidides wrote:
> > For long time now I use windows to work 
> > problems. I cannot play wine games with audio, I
> > cannot sample video, I cannot use skype, I cannot play
> > midi. And even linux only things I try do I cannot
> > share my X, I cannot use more than one vmware. All
> > those is fix for me with SD.
> 
> Any semblance of cpu bandwidth and latency guarantees are easily shot on 
> mainline by a single process going wild (eg open tab in firefox).

I've written a patch which I believe fixes that.

-Mike



Re: 2.6.20.4: NETDEV WATCHDOG and lockups

2007-04-02 Thread Len Brown
On Monday 02 April 2007 15:41, Christian Kujau wrote:
> 
> Hi there,
> 
> we have serious problems with 2 of our servers: both shiny new amd64 
> dual core, with both 2GB RAM, 32bit kernel+userland (Debian/testing).
> Both servers have 2 NICs, RTL8139 (eth0, irq10) and RTL8169s
> (eth1, irq11).
> 
> Both boxes are running fine but after "a while" they lock up and 
> eventually restart all of a sudden. The last messages in the logfile 
> are:
> 
> 14:15:11 db2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
> 14:15:14 db2 kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
> 
> Then the box reboots, nothing else in the log.
> 
> As the servers have been set up recently, we only know that it happened 
> with Debian's 2.6.17-? kernel. When we upgraded the installation, we 
> went to 2.6.18-4-k7 and the problem persisted. We're now using vanilla 
> 2.6.20.4 and while the problem persists, it takes longer to lockup (~20h 
> as opposed to 4-5h). While this is a good thing for us, it's now harder
> to reproduce (we have to wait longer).
> 
> Searching the archives turned up quite a few results but no real fix and 
> lots of old postings too. We then disabled ACPI completely and booted 
> with 'noapic'. Now both boxes are running for > 20h and we're curious 
> how long they make it. However, booting with 'noapic' slowed down both 
> servers *a lot*.

Which increased stability, disabling ACPI, or disabling the IOAPIC?
Your box has MPS, so you should be able to use the IOAPIC in either mode.
Note that you can do these both independently at boot-time with "acpi=off"
and "noapic", respectively.
That gives four combinations:
1. (neither option)
2. noapic
3. acpi=off
4. acpi=off noapic

You started with #1 and are running hard-coded #4 now, but skipped #2 and #3.

cheers,
-Len

> From /proc/interrupts we can see that only CPU0 (core 0) is handling 
> interrupts while CPU1 does not. We compiled with CONFIG_IRQBALANCE=n so 
> that irqbalance(1) would work - but to no avail.
> 
> Please see http://nerdbynature.de/bits/2.6.20.4/ for details for both 
> hosts and feel free to ask for more details. Although both boxes are in 
> production we'll be happy to test more boot options/patches and the like.
> 
> TIA,
> Christian.


Re: [uml-devel] [RFC] UML kernel & rootfs bundle with every kernel release ?

2007-04-02 Thread Jason Lunz
On Mon, Apr 02, 2007 at 05:44:34PM -0400, Jeff Dike wrote:
> There are sites (http://uml.nagafix.co.uk/ being the best one I know
> of) where, with two downloads, two uncompressions, and one command
> line later, you have a booted UML.
> 
> The only way I know of to improve on this, aside from improvements in
> the booted distro, is to package the filesystem as a rootfs within the
> UML kernel binary.  I've considered this, but haven't done anything
> with it.

I've done the converse: package the uml kernel within the rootfs image,
and use a script that plays the part of bootloader. With ext2 at least,
it's fairly easy to use the debugfs 'cat' command for this.

That way, you simply distribute the fs image with a companion script
that can boot any number of such images.

Jason


[PATCH] Driver core: add suspend() and resume() to struct device_type

2007-04-02 Thread Dmitry Torokhov
Hi Greg,

Here is another patch extending struct device_type. I need it to
implement generic suspend/resume routines for input devices.
As you may remember, input core devices and interface devices are
mixed in the same class, and because suspend/resume applies only
to core devices I can't define these methods at the class level.

-- 
Dmitry

Driver core: add suspend() and resume() to struct device_type

In cases when there are devices of different types in the same class
we can't use class's implementation of suspend and resume methods and
we need to add them to struct device_type instead.

Also fix error handling in resume code (we should not try to call
class's resume method if bus's resume method for the device failed).

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
---

 drivers/base/power/resume.c  |   13 -
 drivers/base/power/suspend.c |   12 
 include/linux/device.h   |2 ++
 3 files changed, 26 insertions(+), 1 deletion(-)

Index: work/drivers/base/power/resume.c
===
--- work.orig/drivers/base/power/resume.c
+++ work/drivers/base/power/resume.c
@@ -26,7 +26,9 @@ int resume_device(struct device * dev)
 
TRACE_DEVICE(dev);
TRACE_RESUME(0);
+
down(&dev->sem);
+
if (dev->power.pm_parent
&& dev->power.pm_parent->power.power_state.event) {
dev_err(dev, "PM: resume from %d, parent %s still %d\n",
@@ -34,15 +36,24 @@ int resume_device(struct device * dev)
dev->power.pm_parent->bus_id,
dev->power.pm_parent->power.power_state.event);
}
+
if (dev->bus && dev->bus->resume) {
dev_dbg(dev,"resuming\n");
error = dev->bus->resume(dev);
}
-   if (dev->class && dev->class->resume) {
+
+   if (!error && dev->type && dev->type->resume) {
+   dev_dbg(dev,"resuming\n");
+   error = dev->type->resume(dev);
+   }
+
+   if (!error && dev->class && dev->class->resume) {
dev_dbg(dev,"class resume\n");
error = dev->class->resume(dev);
}
+
up(&dev->sem);
+
TRACE_RESUME(error);
return error;
 }
Index: work/drivers/base/power/suspend.c
===
--- work.orig/drivers/base/power/suspend.c
+++ work/drivers/base/power/suspend.c
@@ -78,6 +78,18 @@ int suspend_device(struct device * dev, 
suspend_report_result(dev->class->suspend, error);
}
 
+   if (!error && dev->type && dev->type->suspend && !dev->power.power_state.event) {
+   dev_dbg(dev, "%s%s\n",
+   suspend_verb(state.event),
+   ((state.event == PM_EVENT_SUSPEND)
+   && device_may_wakeup(dev))
+   ? ", may wakeup"
+   : ""
+   );
+   error = dev->type->suspend(dev, state);
+   suspend_report_result(dev->type->suspend, error);
+   }
+
if (!error && dev->bus && dev->bus->suspend && !dev->power.power_state.event) {
dev_dbg(dev, "%s%s\n",
suspend_verb(state.event),
Index: work/include/linux/device.h
===
--- work.orig/include/linux/device.h
+++ work/include/linux/device.h
@@ -332,6 +332,8 @@ struct device_type {
int (*uevent)(struct device *dev, char **envp, int num_envp,
  char *buffer, int buffer_size);
void (*release)(struct device *dev);
+   int (*suspend)(struct device * dev, pm_message_t state);
+   int (*resume)(struct device * dev);
 };
 
 /* interface for exporting device attributes */
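
The resume ordering this patch sets up (bus, then type, then class, stopping
at the first error) can be modelled in user space. The sketch below is only
an illustration: every toy_* name is hypothetical and the structures are
heavily simplified stand-ins for the driver-core ones. The suspend hunk runs
the same layers in the opposite order (class, type, bus).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_device;
typedef int (*toy_resume_fn)(struct toy_device *dev);

/* One layer (bus, type or class) with an optional resume hook. */
struct toy_layer {
	toy_resume_fn resume;
};

struct toy_device {
	const struct toy_layer *bus;
	const struct toy_layer *type;
	const struct toy_layer *cls;
	int calls;	/* number of resume hooks that actually ran */
};

/* Example hooks: one that succeeds and one that reports an I/O error. */
static int toy_resume_ok(struct toy_device *dev)
{
	(void)dev;
	return 0;
}

static int toy_resume_fail(struct toy_device *dev)
{
	(void)dev;
	return -EIO;
}

/* Resume in bus -> type -> class order; skip later layers after an error. */
static int toy_resume_device(struct toy_device *dev)
{
	const struct toy_layer *order[3];
	int error = 0;
	size_t i;

	assert(dev != NULL);
	order[0] = dev->bus;
	order[1] = dev->type;
	order[2] = dev->cls;

	for (i = 0; i < 3 && !error; i++) {
		if (order[i] && order[i]->resume) {
			dev->calls++;
			error = order[i]->resume(dev);
		}
	}
	return error;
}
```

With a failing bus hook, neither the type nor the class hook runs, which is
the error-handling fix described in the changelog.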


Re: usb hid: reset NumLock

2007-04-02 Thread Dmitry Torokhov
On Monday 02 April 2007 19:12, Pete Zaitcev wrote:
> On Mon, 2 Apr 2007 16:48:24 +0200 (CEST), Jiri Kosina <[EMAIL PROTECTED]> 
> wrote:
> > On Sun, 1 Apr 2007, Pete Zaitcev wrote:
> 
> > could you please change the order of the two functions, so that you 
> > don't have to put the forward declaration here?
> >[...]
> > I'd say this is a little bit overcommented.
> >[...]
> > So as soon as you have the VIDs and PIDs of the hardware which 
> > requires this, could you please update the patch and send it to me again?
> 
> How about this?

Actually I think I will be adding the patch below, but it has to wait
till 2.6.22 as it requires input core to struct device conversion
patch.

What do you think?

-- 
Dmitry

Input: add generic suspend and resume for uinput devices

Automatically turn off leds and sound effects as part of suspend
process and restore led state, sounds and repeat rate at resume.

Also synchronize hardware state with logical state at device
registration.

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
---

 drivers/input/input.c |   80 ++
 1 files changed, 80 insertions(+)

Index: work/drivers/input/input.c
===
--- work.orig/drivers/input/input.c
+++ work/drivers/input/input.c
@@ -997,10 +997,88 @@ static int input_dev_uevent(struct devic
return 0;
 }
 
+static void input_dev_toggle(struct input_dev *dev,
+unsigned int type, unsigned int code,
+unsigned long *cap_bits, unsigned long *bits,
+int force_off)
+{
+   if (test_bit(code, cap_bits)) {
+   if (!force_off)
+   dev->event(dev, type, code, test_bit(code, bits));
+   else if (test_bit(code, bits))
+   dev->event(dev, type, code, 0);
+   }
+}
+
+static void input_dev_reset(struct input_dev *dev, int force_off)
+{
+   int i;
+
+   if (!dev->event)
+   return;
+
+   /* synchronize led state */
+   if (test_bit(EV_LED, dev->evbit))
+   for (i = 0; i <= LED_MAX; i++)
+   input_dev_toggle(dev, EV_LED, i,
+dev->ledbit, dev->led, force_off);
+
+   /* restore sound */
+   if (test_bit(EV_SND, dev->evbit))
+   for (i = 0; i <= SND_MAX; i++)
+   input_dev_toggle(dev, EV_SND, i,
+dev->sndbit, dev->snd, force_off);
+
+   if (!force_off && test_bit(EV_REP, dev->evbit)) {
+   dev->event(dev, EV_REP, REP_PERIOD, dev->rep[REP_PERIOD]);
+   dev->event(dev, EV_REP, REP_DELAY, dev->rep[REP_DELAY]);
+   }
+}
+
+#ifdef CONFIG_PM
+static int input_dev_suspend(struct device *dev, pm_message_t state)
+{
+   struct input_dev *input_dev = to_input_dev(dev);
+
+   mutex_lock(&input_dev->mutex);
+
+   if (dev->power.power_state.event != state.event) {
+   if (state.event == PM_EVENT_SUSPEND)
+   input_dev_reset(input_dev, 1);
+
+   dev->power.power_state = state;
+   }
+
+   mutex_unlock(&input_dev->mutex);
+
+   return 0;
+}
+
+static int input_dev_resume(struct device *dev)
+{
+   struct input_dev *input_dev = to_input_dev(dev);
+
+   mutex_lock(&input_dev->mutex);
+
+   if (dev->power.power_state.event != PM_EVENT_ON)
+   input_dev_reset(to_input_dev(dev), 0);
+
+   dev->power.power_state = PMSG_ON;
+
+   mutex_unlock(&input_dev->mutex);
+
+   return 0;
+}
+#endif /* CONFIG_PM */
+
 static struct device_type input_dev_type = {
.groups = input_dev_attr_groups,
.release= input_dev_release,
.uevent = input_dev_uevent,
+#ifdef CONFIG_PM
+   .suspend= input_dev_suspend,
+   .resume = input_dev_resume,
+#endif
 };
 
 struct class input_class = {
@@ -1080,6 +1158,8 @@ int input_register_device(struct input_d
dev->rep[REP_PERIOD] = 33;
}
 
+   input_dev_reset(dev, 0);
+
if (!dev->getkeycode)
dev->getkeycode = input_default_getkeycode;
 


Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Andrew Morton
On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> > On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > 
> > > 
> > > I can never remember what the function to register to receive VM pressure
> > > is called.  I have to trace down from __alloc_pages() to find it.
> > > 
> > > It's called "set_shrinker()", and it needs Your Help.
> > > 
> > > New version:
> > > 1) Don't hide struct shrinker.  It contains no magic.
> > > 2) Don't allocate "struct shrinker".  It's not helpful.
> > > 3) Call them "register_shrinker" and "unregister_shrinker".
> > > 4) Call the function "shrink" not "shrinker".
> > > 5) Rename "nr_to_scan" argument to "nr_to_free".
> > 
> > No, it is actually the number to scan.  This is >= the number of freed
> > objects.
> > 
> > > This is because, for better or for worse, the VM tries to balance the
> > scanning rate of the various caches, not the reclaiming rate.
> 
> Err, ok, I completely missed that distinction.
> 
> Does that mean that to function correctly every user needs some internal
> cursor so it doesn't end up scanning the first N entries over and over?
> 

If it wants to be well-behaved, and to behave as the VM expects, yes. 

There's an expectation that the callback will be performing some scan-based
aging operation and of course to do LRU (or whatever) aging, the callback
will need to remember where it was up to last time it was called.

But it's just a guideline - callbacks could do something different but
in-the-spirit, I guess.
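
A toy user-space sketch of the "remember where it was up to" idea follows.
Everything here is hypothetical (toy_cache, toy_shrink, the fixed-size
arrays); it is not the kernel shrinker interface. It assumes the callback is
asked to scan nr_to_scan entries and reports how many entries remain cached,
and it shows why the number scanned can exceed the number freed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 8

/* Hypothetical toy cache with a scan cursor that survives between calls. */
struct toy_cache {
	int present[TOY_CACHE_SIZE];	/* entry currently cached? */
	int pinned[TOY_CACHE_SIZE];	/* entry must not be freed */
	size_t cursor;			/* where the previous scan stopped */
};

/*
 * Scan nr_to_scan slots starting at the saved cursor and free the
 * unpinned entries found there.  Pinned or empty slots still count as
 * scanned, so scanned >= freed.  Returns the number of entries left;
 * nr_to_scan == 0 is a pure query.
 */
static int toy_shrink(struct toy_cache *c, int nr_to_scan)
{
	int remaining = 0;
	size_t i;

	assert(c != NULL && nr_to_scan >= 0);

	while (nr_to_scan-- > 0) {
		size_t slot = c->cursor;

		/* Advance first so the next call resumes after this slot. */
		c->cursor = (c->cursor + 1) % TOY_CACHE_SIZE;
		if (c->present[slot] && !c->pinned[slot])
			c->present[slot] = 0;	/* "free" the object */
	}

	for (i = 0; i < TOY_CACHE_SIZE; i++)
		remaining += c->present[i];
	return remaining;
}
```

Without the cursor, two successive calls with nr_to_scan = 4 would both look
at slots 0 to 3 and the second call would free nothing.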


Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> 
> > 
> > I can never remember what the function to register to receive VM pressure
> > is called.  I have to trace down from __alloc_pages() to find it.
> > 
> > It's called "set_shrinker()", and it needs Your Help.
> > 
> > New version:
> > 1) Don't hide struct shrinker.  It contains no magic.
> > 2) Don't allocate "struct shrinker".  It's not helpful.
> > 3) Call them "register_shrinker" and "unregister_shrinker".
> > 4) Call the function "shrink" not "shrinker".
> > 5) Rename "nr_to_scan" argument to "nr_to_free".
> 
> No, it is actually the number to scan.  This is >= the number of freed
> objects.
> 
> > This is because, for better or for worse, the VM tries to balance the
> scanning rate of the various caches, not the reclaiming rate.

Err, ok, I completely missed that distinction.

Does that mean that to function correctly every user needs some internal
cursor so it doesn't end up scanning the first N entries over and over?

Rusty.



Re: [PATCH] libata: add NCQ blacklist entries from Silicon Image Windows driver (v2)

2007-04-02 Thread Tejun Heo
Robert Hancock wrote:
> This adds some NCQ blacklist entries taken from the Silicon Image 3124/3132
> Windows driver .inf files. There are some confirming reports of problems
> with these drives under Linux (for example
> http://lkml.org/lkml/2007/3/4/178)
> so let's disable NCQ on these drives.
> 
> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

Acked-by: Tejun Heo <[EMAIL PROTECTED]>

-- 
tejun


[PATCH] libata: add NCQ blacklist entries from Silicon Image Windows driver (v2)

2007-04-02 Thread Robert Hancock

This adds some NCQ blacklist entries taken from the Silicon Image 3124/3132
Windows driver .inf files. There are some confirming reports of problems
with these drives under Linux (for example http://lkml.org/lkml/2007/3/4/178)
so let's disable NCQ on these drives.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.21-rc5-git9/drivers/ata/libata-core.c 2007-04-02 21:03:29.0 -0600
+++ linux-2.6.21-rc5-git9edit/drivers/ata/libata-core.c 2007-04-02 21:26:23.0 -0600
@@ -3363,6 +3363,11 @@ static const struct ata_blacklist_entry 
	{ "Maxtor 6L250S0", "BANC1G10", ATA_HORKAGE_NONCQ },

/* NCQ hard hangs device under heavier load, needs hard power cycle */
{ "Maxtor 6B250S0",   "BANC1B70",   ATA_HORKAGE_NONCQ },
+   /* Blacklist entries taken from Silicon Image 3124/3132
+  Windows driver .inf file - also several Linux problem reports */
+   { "HTS541060G9SA00","MB3OC60D", ATA_HORKAGE_NONCQ, },
+   { "HTS541080G9SA00","MB4OC60D", ATA_HORKAGE_NONCQ, },
+   { "HTS541010G9SA00","MBZOC60D", ATA_HORKAGE_NONCQ, },

/* Devices with NCQ limits */
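
For readers unfamiliar with how such a table is consulted, here is a
simplified user-space sketch of a model/firmware lookup. The toy_* names and
the flag value are hypothetical, and the matching rule (exact model string, a
NULL revision matching any firmware) is an assumption for illustration, not a
copy of the libata code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_HORKAGE_NONCQ (1u << 0)	/* hypothetical flag value */

struct toy_blacklist_entry {
	const char *model;
	const char *fw_rev;	/* NULL would match any firmware revision */
	unsigned int horkage;
};

/* Entries copied from the patch above; the last row terminates the table. */
static const struct toy_blacklist_entry toy_blacklist[] = {
	{ "Maxtor 6B250S0",  "BANC1B70", TOY_HORKAGE_NONCQ },
	{ "HTS541060G9SA00", "MB3OC60D", TOY_HORKAGE_NONCQ },
	{ "HTS541080G9SA00", "MB4OC60D", TOY_HORKAGE_NONCQ },
	{ "HTS541010G9SA00", "MBZOC60D", TOY_HORKAGE_NONCQ },
	{ NULL, NULL, 0 }
};

/* Return the quirk flags for a drive, or 0 if it is not listed. */
static unsigned int toy_horkage(const char *model, const char *fw_rev)
{
	const struct toy_blacklist_entry *e;

	assert(model != NULL && fw_rev != NULL);

	for (e = toy_blacklist; e->model != NULL; e++) {
		if (strcmp(e->model, model) != 0)
			continue;
		if (e->fw_rev == NULL || strcmp(e->fw_rev, fw_rev) == 0)
			return e->horkage;
	}
	return 0;
}
```

Under this rule a listed model with a different firmware revision keeps NCQ
enabled, which is why each new entry names a specific revision.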




Re: [PATCH] vt: Expose system-wide UTF-8 default setting via sysfs

2007-04-02 Thread Alexander E. Patrakov

Antonino A. Daplas wrote:

Create a variable, default_utf8, that defines the system-wide default UTF-8
setting.  This variable can be altered via sysfs. If the variable is properly
set, this should minimize breakage of UTF-8 encoded consoles when doing a
reset or echo -e '\033c' and of newly opened/allocated consoles.

This is based from patches by Jan Engelhardt and Paul LeoNerd Evans.

Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
---
I think you're missing the whole point of console reset.  Its purpose is 
to force the console into a known-good state.  The fewer pieces of state 
it leaves unset, the better.  To some degree it's less important what 
that state actually is.


Okay, you convinced me. Hopefully this is acceptable to all parties.

Andrew,

If everybody agrees, can you drop the previous patch I sent to you, and use
this instead?

Tony
+static int default_utf8;
+module_param(default_utf8, int, S_IRUGO | S_IWUSR);


Module parameter without description and documentation? Yes, I understand 
that it is impossible to make vt a module. How about adding a line to 
Documentation/kernel-parameters.txt?


Other than that, the patch looks like a useful change.

--
Alexander E. Patrakov


Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"

2007-04-02 Thread Vivek Goyal
On Mon, Apr 02, 2007 at 04:59:26PM +0200, [EMAIL PROTECTED] wrote:
> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> Date: Mon, Apr 02, 2007 at 04:49:14PM +0200
> > 
> > I used a working 2.6.21-rc3-mm2 tree, patched it up to 2.6.21-rc5-mm3
> > and applied your patch. I ended up with the .config later in this email,
> > and got this error:
> > 
> >   CC  arch/x86_64/kernel/head64.o
> > arch/x86_64/kernel/head64.c: In function 'x86_64_start_kernel':
> > arch/x86_64/kernel/head64.c:70: error: size of array 'type name' is negative
> > make[1]: *** [arch/x86_64/kernel/head64.o] Error 1
> > make: *** [arch/x86_64/kernel] Error 2
> > 
> > After reverting your patch, the build didn't fail, but of course the
> > kernel won't build.
> > 
> That should, of course, read 'kernel won't boot'.
> 

I agree that the error message is not very clear. It only indicates that
there is a problem on line 70 of head64.c. That's why I put a
comment there, so that anybody can work out that CONFIG_PHYSICAL_START
is not 2MB aligned, hence the failure.

Unfortunately, the Kconfig infrastructure does not allow placing alignment
restrictions on values. Otherwise that would have been the best
solution.

So we still detect the problem at compile time, though in a slightly
indirect manner.
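
For anyone puzzled by the "size of array is negative" message, this is the
usual negative-array-size trick. The sketch below is a stand-alone
illustration with hypothetical TOY_* names and an assumed start address; it
is not the code in head64.c.

```c
#include <assert.h>

/* Refuses to compile (negative array size) when cond is non-zero. */
#define TOY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define TOY_PHYSICAL_START 0x200000UL	/* hypothetical config value */
#define TOY_ALIGN_2M       (2UL << 20)

/* Runtime form of the same rule, for values not known at compile time. */
static int toy_is_2m_aligned(unsigned long addr)
{
	return (addr & (TOY_ALIGN_2M - 1)) == 0;
}

static unsigned long toy_checked_start(void)
{
	/* Compile-time check: a misaligned value breaks the build here. */
	TOY_BUILD_BUG_ON(TOY_PHYSICAL_START & (TOY_ALIGN_2M - 1));

	/* Runtime restatement of the same condition. */
	assert(toy_is_2m_aligned(TOY_PHYSICAL_START));
	return TOY_PHYSICAL_START;
}
```

Changing TOY_PHYSICAL_START to, say, 0x100000UL makes the array size -1 and
the compiler stops with a message much like the one reported, pointing at the
line of the check rather than at the config option.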

Thanks
Vivek


Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"

2007-04-02 Thread Vivek Goyal
On Mon, Apr 02, 2007 at 11:26:38AM -0600, Eric W. Biederman wrote:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
> 
> > Only advantage of CONFIG_PHYSICAL_START seems to be that one has got
> > capability to run the kernel from other addresses without modifying the
> > boot-loader. One can argue that now people should use a relocatable kernel
> > for such a feature. But for using relocatable kernel, one needs to modify
> > grub, lilo and I am not sure if somebody is going to do that. Secondly, how
> > would one specify an address to a boot-loader to load image at?
> 
> I thought this was important for vmlinux and Xen?
> 

Yes it is. Actually you had already mentioned it in the previous mail that's
why I did not repeat it here. Xen folks wanted to continue using vmlinux
for capturing dump. I am not sure if there is any technical limitation in
using relocatable bzImage or just that they wanted to continue using
existing working interface and did not want to switch to new interface.

Magnus, Horms, do you want to add to it? Is there a reason that relocatable
bzImage will not work in Xen env and we need to retain CONFIG_PHYSICAL_START
option in x86_64?


> I guess at this point the easy case is that we modify /sbin/kexec to support
> it.  And the other bootloaders can come be upgraded if the feature is
> interesting enough.
> 
> > On i386, somebody already found an interesting usage of 
> > CONFIG_PHYSICAL_START
> > where he was running his kernel above 16MB so that he can maximize on
> > DMA ZONE. Can't think of any usage for x86_64 at the moment but I think
> > down the line people might come up with such usages.
> 
> Agreed.  We do have CONFIG_PHYSICAL_ALIGN that can handle that case,
> although I admit that is a bit of a hack.
> 

Yes, but x86_64 will not have any of those options and the only way to run 
the kernel will be to either use kexec or modify your boot-loader so that
it can handle relocatable images.

> > To me, retaining CONFIG_PHYSICAL_START gives added flexibility to the user,
> > at the expense of reduced simplicity. We should definitely change the type
> > of vmlinux to ET_DYN but at the same time it might still be worth to retain
> > CONFIG_PHYSICAL_START option.
> 
> I think something like CONFIG_PHYSICAL_START currently gives us very
> little gain, and is hard to use correctly, and there are alternative
> solutions.  So if we can get rid of it, by only inconveniencing users
> who want to load their kernels at a weird address it is worth it.
> 
> >> I think I can switch the vmlinux header type in about 100 lines or so
> >> of code.  Assuming I can ever get 30 minutes with the appropriate
> >> kernel.
> >> 
> >
> > That would be awesome. Then vmlinux will be relocatable too. (Officially).
> 
> Yes.  For x86_64 I can do this.  i386 is more difficult.  (Although with
> a little cleverness we can move the code that processes relocations into
> vmlinux).  
> 

Performing relocations in vmlinux will be interesting. That way i386 vmlinux
too will become relocatable and only piece of puzzle to solve will be to
make vmlinux of type ET_DYN.

Thanks
Vivek


Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Andrew Morton
On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

> 
> I can never remember what the function to register to receive VM pressure
> is called.  I have to trace down from __alloc_pages() to find it.
> 
> It's called "set_shrinker()", and it needs Your Help.
> 
> New version:
> 1) Don't hide struct shrinker.  It contains no magic.
> 2) Don't allocate "struct shrinker".  It's not helpful.
> 3) Call them "register_shrinker" and "unregister_shrinker".
> 4) Call the function "shrink" not "shrinker".
> 5) Rename "nr_to_scan" argument to "nr_to_free".

No, it is actually the number to scan.  This is >= the number of freed
objects.

This is because, for better or for worse, the VM tries to balance the
scanning rate of the various caches, not the reclaiming rate.

> 6) Reduce the 17 lines of waffly comments to 10, and document the -1 return.
> 
> Comments:
> 1) The comment in reiserfs4 makes me a little queasy.

I'm going to have to split this patch up into mainline-bit and reiser4-bit.

And that's OK (it's a regular occurrence).  But never miss a chance to whine.

> 2) The wrapper code in xfs might no longer be needed.
> 3) The placing in the x86-64 "hot function list" seems a little
>unlikely.  Clearly, Andi was testing if anyone was paying attention.




Re: Powerpc build unhappy in 2.6.20.4?

2007-04-02 Thread Rob Landley
On Monday 02 April 2007 8:51 pm, Tony Breeds wrote:
> On Mon, Apr 02, 2007 at 03:14:14PM -0400, Rob Landley wrote:
>  
> > Sure, quite easily the source of the trouble.  Attached in both full .config
> > and mini.config formats.
> 
> Okay, I have no idea how it happened but you seem to have an invalid
> config.  It looks to me like you need to select a platform.

So "make oldconfig ARCH=powerpc" will accept a config that doesn't have a 
platform selected?

> One of the following:
>   CONFIG_PPC_PSERIES
>   CONFIG_PPC_MAPLE
>   CONFIG_PPC_IBM_CELL_BLADE
>   CONFIG_PPC_PS3
>   CONFIG_PPC_CHRP
>   CONFIG_PPC_EFIKA
>   CONFIG_PPC_PMAC

Hmmm...  So CONFIG_PPC_MULTIPLATFORM doesn't cover it?  ("There is no help 
available for this kernel option"...  Maybe a website somewhere?)

I just ran "make oldconfig" again and it didn't complain about any of those 
not being set...

> When did this config last build a zImage?  I'm guessing either CHRP or
> PMAC?

Er, never.  I was largely guessing at what I needed via menuconfig.  (I'm 
trying to get something I can boot to a shell prompt under QEMU.)

I'll try this CHRP thing...

Thanks,

Rob
-- 
Penguicon 5.0 Apr 20-22, Linux Expo/SF Convention.  Bruce Schneier, Christine 
Peterson, Steve Jackson, Randy Milholland, Elizabeth Bear, Charlie Stross...


Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Rusty Russell
On Tue, 2007-04-03 at 13:45 +1000, Rusty Russell wrote:
> It's called "set_shrinker()", and it needs Your Help.

Wrong copy.  This is the one which actually compiles reiser4.

==
I can never remember what the function to register to receive VM pressure
is called.  I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

New version:
1) Don't hide struct shrinker.  It contains no magic.
2) Don't allocate "struct shrinker".  It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Rename "nr_to_scan" argument to "nr_to_free".
6) Reduce the 17 lines of waffly comments to 10, and document the -1 return.

Comments:
1) The comment in reiserfs4 makes me a little queasy.
2) The wrapper code in xfs might no longer be needed.
3) The placing in the x86-64 "hot function list" for seems a little
   unlikely.  Clearly, Andi was testing if anyone was paying attention.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r a6c8dede237c arch/x86_64/kernel/functionlist
--- a/arch/x86_64/kernel/functionlist   Tue Apr 03 12:53:59 2007 +1000
+++ b/arch/x86_64/kernel/functionlist   Tue Apr 03 13:15:11 2007 +1000
@@ -1118,7 +1118,6 @@
 *(.text.simple_strtoll)
 *(.text.set_termios)
 *(.text.set_task_comm)
-*(.text.set_shrinker)
 *(.text.set_normalized_timespec)
 *(.text.set_brk)
 *(.text.serial_in)
diff -r a6c8dede237c fs/dcache.c
--- a/fs/dcache.c   Tue Apr 03 12:53:59 2007 +1000
+++ b/fs/dcache.c   Tue Apr 03 13:09:55 2007 +1000
@@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr, 
}
return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dcache_shrinker = {
+   .shrink = shrink_dcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /**
  * d_alloc -   allocate a dcache entry
@@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned 
 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
 SLAB_MEM_SPREAD),
 NULL, NULL);
-   
-   set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
+
+   register_shrinker(&dcache_shrinker);
 
/* Hash may have been set up in dcache_init_early */
if (!hashdist)
diff -r a6c8dede237c fs/dquot.c
--- a/fs/dquot.cTue Apr 03 12:53:59 2007 +1000
+++ b/fs/dquot.cTue Apr 03 13:10:31 2007 +1000
@@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr,
}
return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dqcache_shrinker = {
+   .shrink = shrink_dqcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /*
  * Put reference to dquot
@@ -1871,7 +1876,7 @@ static int __init dquot_init(void)
printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n",
nr_hash, order, (PAGE_SIZE << order));
 
-   set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory);
+   register_shrinker(&dqcache_shrinker);
 
return 0;
 }
diff -r a6c8dede237c fs/inode.c
--- a/fs/inode.cTue Apr 03 12:53:59 2007 +1000
+++ b/fs/inode.cTue Apr 03 13:11:05 2007 +1000
@@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr, 
return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
 
+static struct shrinker icache_shrinker = {
+   .shrink = shrink_icache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
+
 static void __wait_on_freeing_inode(struct inode *inode);
 /*
  * Called with the inode lock held.
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem
 SLAB_MEM_SPREAD),
 init_once,
 NULL);
-   set_shrinker(DEFAULT_SEEKS, shrink_icache_memory);
+   register_shrinker(&icache_shrinker);
 
/* Hash may have been set up in inode_init_early */
if (!hashdist)
diff -r a6c8dede237c fs/mbcache.c
--- a/fs/mbcache.c  Tue Apr 03 12:53:59 2007 +1000
+++ b/fs/mbcache.c  Tue Apr 03 13:12:37 2007 +1000
@@ -100,7 +100,6 @@ static LIST_HEAD(mb_cache_list);
 static LIST_HEAD(mb_cache_list);
 static LIST_HEAD(mb_cache_lru_list);
 static DEFINE_SPINLOCK(mb_cache_spinlock);
-static struct shrinker *mb_shrinker;
 
 static inline int
 mb_cache_indexes(struct mb_cache *cache)
@@ -118,6 +117,10 @@ mb_cache_indexes(struct mb_cache *cache)
 
 static int mb_cache_shrink_fn(int nr_to_scan, gfp_t gfp_mask);
 
+static struct shrinker mb_cache_shrinker = {
+   .shrink = mb_cache_shrink_fn,
+   .seeks = DEFAULT_SEEKS,
+};
 
 static inline int
 __mb_cache_entry_is_hashed(struct mb_cache_entry *ce)
@@ -662,13 +665,13 @@ mb_cache_entry_find_next(struct mb_cache
 
 static int __init init_mbcache(void)
 {
-   mb_shrinker = set_shrinker(DEFAULT_SEEKS, mb_cache_shrink_fn);
+   register_shrinker(&mb_cache_shrinker);
return 0;
 }
 
 

[PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)

2007-04-02 Thread Rusty Russell
I can never remember what the function to register to receive VM pressure
is called.  I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

New version:
1) Don't hide struct shrinker.  It contains no magic.
2) Don't allocate "struct shrinker".  It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Rename "nr_to_scan" argument to "nr_to_free".
6) Reduce the 17 lines of waffly comments to 10, and document the -1 return.
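
The registration pattern in the list above can be sketched in userspace C
roughly as follows.  This is only a mock of the "caller owns a static
struct shrinker" idea, not the kernel implementation: the list linkage,
shrink_all(), the demo_* names and the DEFAULT_SEEKS value used here are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for this mock; the real definition lives in the kernel. */
#define DEFAULT_SEEKS 2

/* Userspace mock: the caller allocates this struct, usually statically. */
struct shrinker {
	int (*shrink)(int nr_to_free);	/* returns objects left, or -1 */
	int seeks;
	struct shrinker *next;		/* hypothetical list linkage */
};

static struct shrinker *shrinker_list;

/* Add a caller-owned shrinker to the global list. */
void register_shrinker(struct shrinker *s)
{
	assert(s != NULL && s->shrink != NULL);
	s->next = shrinker_list;
	shrinker_list = s;
}

/* Remove a shrinker from the list; nothing is freed here. */
void unregister_shrinker(struct shrinker *s)
{
	struct shrinker **p;

	for (p = &shrinker_list; *p != NULL; p = &(*p)->next) {
		if (*p == s) {
			*p = s->next;
			s->next = NULL;
			return;
		}
	}
}

/* Hypothetical stand-in for VM pressure: call every registered shrinker. */
int shrink_all(int nr_to_free)
{
	struct shrinker *s;
	int left = 0;

	for (s = shrinker_list; s != NULL; s = s->next) {
		int ret = s->shrink(nr_to_free);

		if (ret > 0)
			left += ret;
	}
	return left;
}

/* Demo cache that only counts objects. */
int demo_cache_objects = 10;

int demo_shrink(int nr_to_free)
{
	if (nr_to_free > demo_cache_objects)
		nr_to_free = demo_cache_objects;
	demo_cache_objects -= nr_to_free;
	return demo_cache_objects;
}

struct shrinker demo_shrinker = {
	.shrink = demo_shrink,
	.seeks = DEFAULT_SEEKS,
};
```

Because the structure is owned by the caller, unregistering only unlinks
it; there is no allocation to fail and no pointer to keep around.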

Comments:
1) The comment in reiserfs4 makes me a little queasy.
2) The wrapper code in xfs might no longer be needed.
3) The placing in the x86-64 "hot function list" for seems a little
   unlikely.  Clearly, Andi was testing if anyone was paying attention.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r a6c8dede237c arch/x86_64/kernel/functionlist
--- a/arch/x86_64/kernel/functionlist   Tue Apr 03 12:53:59 2007 +1000
+++ b/arch/x86_64/kernel/functionlist   Tue Apr 03 13:15:11 2007 +1000
@@ -1118,7 +1118,6 @@
 *(.text.simple_strtoll)
 *(.text.set_termios)
 *(.text.set_task_comm)
-*(.text.set_shrinker)
 *(.text.set_normalized_timespec)
 *(.text.set_brk)
 *(.text.serial_in)
diff -r a6c8dede237c fs/dcache.c
--- a/fs/dcache.c   Tue Apr 03 12:53:59 2007 +1000
+++ b/fs/dcache.c   Tue Apr 03 13:09:55 2007 +1000
@@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr, 
}
return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dcache_shrinker = {
+   .shrink = shrink_dcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /**
  * d_alloc -   allocate a dcache entry
@@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned 
 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
 SLAB_MEM_SPREAD),
 NULL, NULL);
-   
-   set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
+
+   register_shrinker(&dcache_shrinker);
 
/* Hash may have been set up in dcache_init_early */
if (!hashdist)
diff -r a6c8dede237c fs/dquot.c
--- a/fs/dquot.cTue Apr 03 12:53:59 2007 +1000
+++ b/fs/dquot.cTue Apr 03 13:10:31 2007 +1000
@@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr,
}
return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dqcache_shrinker = {
+   .shrink = shrink_dqcache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
 
 /*
  * Put reference to dquot
@@ -1871,7 +1876,7 @@ static int __init dquot_init(void)
printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n",
nr_hash, order, (PAGE_SIZE << order));
 
-   set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory);
+   register_shrinker(&dqcache_shrinker);
 
return 0;
 }
diff -r a6c8dede237c fs/inode.c
--- a/fs/inode.cTue Apr 03 12:53:59 2007 +1000
+++ b/fs/inode.cTue Apr 03 13:11:05 2007 +1000
@@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr, 
return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
 
+static struct shrinker icache_shrinker = {
+   .shrink = shrink_icache_memory,
+   .seeks = DEFAULT_SEEKS,
+};
+
 static void __wait_on_freeing_inode(struct inode *inode);
 /*
  * Called with the inode lock held.
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem
 SLAB_MEM_SPREAD),
 init_once,
 NULL);
-   set_shrinker(DEFAULT_SEEKS, shrink_icache_memory);
+   register_shrinker(&icache_shrinker);
 
/* Hash may have been set up in inode_init_early */
if (!hashdist)
diff -r a6c8dede237c fs/mbcache.c
--- a/fs/mbcache.c  Tue Apr 03 12:53:59 2007 +1000
+++ b/fs/mbcache.c  Tue Apr 03 13:12:37 2007 +1000
@@ -100,7 +100,6 @@ static LIST_HEAD(mb_cache_list);
 static LIST_HEAD(mb_cache_list);
 static LIST_HEAD(mb_cache_lru_list);
 static DEFINE_SPINLOCK(mb_cache_spinlock);
-static struct shrinker *mb_shrinker;
 
 static inline int
 mb_cache_indexes(struct mb_cache *cache)
@@ -118,6 +117,10 @@ mb_cache_indexes(struct mb_cache *cache)
 
 static int mb_cache_shrink_fn(int nr_to_scan, gfp_t gfp_mask);
 
+static struct shrinker mb_cache_shrinker = {
+   .shrink = mb_cache_shrink_fn,
+   .seeks = DEFAULT_SEEKS,
+};
 
 static inline int
 __mb_cache_entry_is_hashed(struct mb_cache_entry *ce)
@@ -662,13 +665,13 @@ mb_cache_entry_find_next(struct mb_cache
 
 static int __init init_mbcache(void)
 {
-   mb_shrinker = set_shrinker(DEFAULT_SEEKS, mb_cache_shrink_fn);
+   register_shrinker(&mb_cache_shrinker);
return 0;
 }
 
 static void __exit exit_mbcache(void)
 {
-   remove_shrinker(mb_shrinker);
+   unregister_shrinker(&mb_cache_shrinker);
 }
 
 module_init(init_mbcache)
diff -r a6c8dede237c 

Re: [linux-usb-devel] [RFC] HID bus design overview.

2007-04-02 Thread Dmitry Torokhov
On Monday 02 April 2007 21:15, Li Yu wrote:
> 
> If we don't use "flip-flopping" means, the common driver and specific
> driver concepts also don't need. They are completely same driver for HID
> bus, just one without some hooks, another without.

Exactly. I am glad we are getting on the same page.

-- 
Dmitry


Re: [linux-usb-devel] [RFC] HID bus design overview.

2007-04-02 Thread Dmitry Torokhov
On Monday 02 April 2007 21:40, Li Yu wrote:
> May be, we need some means to change blacklist in runtime. and
> loading/unloading such driver by specific script to do it.

Please look at the new_id sysfs attribute implementation in
drivers/pci/pci-driver.c. I believe we need something similar
to dynamically adjust HID ignore blacklist.
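
A runtime-adjustable ignore list in that spirit might look roughly like
the userspace sketch below.  It does not reproduce the code in
drivers/pci/pci-driver.c; the table size, the "vendor product" hex input
format and every hid_ignore_* name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical vendor/product pair naming a device the HID core skips. */
struct hid_ignore_id {
	unsigned int vendor;
	unsigned int product;
};

#define MAX_DYN_IDS 16

static struct hid_ignore_id dyn_ids[MAX_DYN_IDS];
static size_t nr_dyn_ids;

/* True if the device is on the dynamic ignore list. */
bool hid_ignore_match(unsigned int vendor, unsigned int product)
{
	size_t i;

	for (i = 0; i < nr_dyn_ids; i++)
		if (dyn_ids[i].vendor == vendor &&
		    dyn_ids[i].product == product)
			return true;
	return false;
}

/* Add an entry; returns 0 on success, -1 if the table is full. */
int hid_ignore_add(unsigned int vendor, unsigned int product)
{
	assert(nr_dyn_ids <= MAX_DYN_IDS);
	if (hid_ignore_match(vendor, product))
		return 0;		/* already listed */
	if (nr_dyn_ids == MAX_DYN_IDS)
		return -1;
	dyn_ids[nr_dyn_ids].vendor = vendor;
	dyn_ids[nr_dyn_ids].product = product;
	nr_dyn_ids++;
	return 0;
}

/*
 * Stand-in for a sysfs store handler: parse "vvvv pppp" (hex) as written
 * by a script and add it to the list.  Returns -1 on malformed input.
 */
int hid_ignore_store(const char *buf)
{
	unsigned int vendor, product;

	if (sscanf(buf, "%x %x", &vendor, &product) != 2)
		return -1;
	return hid_ignore_add(vendor, product);
}
```

A script would then write a vendor/product pair before loading the
specific driver, instead of the list being fixed at build time.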

-- 
Dmitry


difference between arcmsr (areca) 1.20.0X.13 & 2.6.20 in tree driver?

2007-04-02 Thread Joshua Hoblitt
Has anyone else noticed this regression?

-J

--
- Forwarded message from Joshua Hoblitt <[EMAIL PROTECTED]> -

From: Joshua Hoblitt <[EMAIL PROTECTED]>
Date: Thu, 29 Mar 2007 16:38:20 -1000
To: [EMAIL PROTECTED]
Subject: difference between 1.20.0X.13 & 2.6.20 in tree driver?

Hello,

I just attempted to upgrade a system from 2.6.17 (gentoo revision 8) w/
1.20.0X.13 to 2.6.20 (gentoo revision 4) with the in tree arcmsr driver.
On the 2.6.17 kernel there are 2 ~4TB partitions that are visible on
the system as /dev/sdb1 & /dev/sdc1.  When the system is booted with the
2.6.20 kernel /dev/sdc1 is gone and /dev/sdb1 is properly reported as a
4TB EFI partition but fsck rejects the filesystem as corrupt.  Is
this a regression or has there been a fundamental change in the way
arcmsr represents the array to the block layer?

Here is the info on the RAID card:

Controller Name     ARC-1170
Firmware Version    V1.39 2005-12-13
BOOT ROM Version    V1.39 2005-12-13
Serial Number       Y605CAAVAR700117
Unit Serial #
Main Processor      500MHz IOP331
CPU ICache Size     32KBytes
CPU DCache Size     32KBytes / Write Back
System Memory       256MB / 333MHz

Any idea as to what's going on?

Thanks,

-J

--
Under 2.6.20:

Mar 29 15:41:05 ipp000 ARECA RAID ADAPTER4: FIRMWARE VERSION V1.39 2005-12-13
Mar 29 15:41:05 ipp000 scsi4 : Areca SATA Host Adapter RAID Controller( RAID6 capable)
Mar 29 15:41:05 ipp000 Driver Version 1.20.00.13
Mar 29 15:41:05 ipp000 scsi 4:0:0:0: Direct-Access Areca ARC-1170-VOL#00 R001 PQ: 0 ANSI: 3
Mar 29 15:41:05 ipp000 sdb : very big device. try to use READ CAPACITY(16).
Mar 29 15:41:05 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB)
Mar 29 15:41:05 ipp000 sdb: Write Protect is off
Mar 29 15:41:05 ipp000 sdb: Mode Sense: cb 00 00 08
Mar 29 15:41:05 ipp000 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 29 15:41:05 ipp000 sdb : very big device. try to use READ CAPACITY(16).
Mar 29 15:41:05 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB)
Mar 29 15:41:05 ipp000 sdb: Write Protect is off
Mar 29 15:41:05 ipp000 sdb: Mode Sense: cb 00 00 08
Mar 29 15:41:05 ipp000 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 29 15:41:05 ipp000 sdb: sdb1
Mar 29 15:41:05 ipp000 sd 4:0:0:0: Attached scsi disk sdb
Mar 29 15:41:05 ipp000 scsi 4:0:16:0: Processor Areca RAID controller R001 PQ: 0 ANSI: 0

Under 2.6.17:

Mar 29 15:45:11 ipp000 ARECA RAID ADAPTER0: 64BITS PCI BUS DMA ADDRESSING SUPPORTED
Mar 29 15:45:11 ipp000 ARECA RAID ADAPTER0: FIRMWARE VERSION V1.39 2005-12-13
Mar 29 15:45:11 ipp000 scsi4 : Areca SATA Host Adapter RAID Controller( RAID6 capable)
Mar 29 15:45:11 ipp000 Driver Version 1.20.0X.13
Mar 29 15:45:11 ipp000 Vendor: Areca Model: ARC-1170-VOL#00   Rev: R001
Mar 29 15:45:11 ipp000 Type:   Direct-Access  ANSI SCSI revision: 03
Mar 29 15:45:11 ipp000 sdb : very big device. try to use READ CAPACITY(16).
Mar 29 15:45:11 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB)
Mar 29 15:45:11 ipp000 sdb: Write Protect is off
Mar 29 15:45:11 ipp000 sdb: Mode Sense: cb 00 00 08
Mar 29 15:45:11 ipp000 SCSI device sdb: drive cache: write back
Mar 29 15:45:11 ipp000 sdb : very big device. try to use READ CAPACITY(16).
Mar 29 15:45:11 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB)
Mar 29 15:45:11 ipp000 sdb: Write Protect is off
Mar 29 15:45:11 ipp000 sdb: Mode Sense: cb 00 00 08
Mar 29 15:45:11 ipp000 SCSI device sdb: drive cache: write back
Mar 29 15:45:11 ipp000 sdb:<4>Alternate GPT is invalid, using primary GPT.
Mar 29 15:45:11 ipp000 sdb1
Mar 29 15:45:11 ipp000 sd 4:0:0:0: Attached scsi disk sdb
Mar 29 15:45:11 ipp000 sd 4:0:0:0: Attached scsi generic sg1 type 0
Mar 29 15:45:11 ipp000 Vendor: Areca Model: ARC-1170-VOL#01   Rev: R001
Mar 29 15:45:11 ipp000 Type:   Direct-Access  ANSI SCSI revision: 03
Mar 29 15:45:11 ipp000 sdc : very big device. try to use READ CAPACITY(16).
Mar 29 15:45:11 ipp000 SCSI device sdc: 8595580928 512-byte hdwr sectors (4400937 MB)
Mar 29 15:45:11 ipp000 sdc: Write Protect is off
Mar 29 15:45:11 ipp000 sdc: Mode Sense: cb 00 00 08
Mar 29 15:45:11 ipp000 SCSI device sdc: drive cache: write back
Mar 29 15:45:11 ipp000 sdc : very big device. try to use READ CAPACITY(16).
Mar 29 15:45:11 ipp000 SCSI device sdc: 8595580928 512-byte hdwr sectors (4400937 MB)
Mar 29 15:45:11 ipp000 sdc: Write Protect is off
Mar 29 15:45:11 ipp000 sdc: Mode Sense: cb 00 00 08
Mar 29 15:45:11 ipp000 SCSI device sdc: drive cache: write back
Mar 29 15:45:11 ipp000 sdc: sdc1
Mar 29 15:45:11 ipp000 sd 4:0:0:1: Attached scsi disk sdc
Mar 29 15:45:11 ipp000 sd 4:0:0:1: Attached scsi generic sg2 type 0
Mar 29 15:45:11 ipp000 Vendor: Areca Model: RAID controller   Rev: R001
Mar 29 15:45:11 ipp000 Type:   Processor  

Re: [PATCH 2/9] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code

2007-04-02 Thread David Miller
From: David Howells <[EMAIL PROTECTED]>
Date: Mon, 02 Apr 2007 23:45:03 +0100

> Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
> use it too.
> 
> The kdoc comments I've attached to the functions needs to be checked by whoever
> wrote them as I had to make some guesses about the workings of these functions.
> 
> Signed-Off-By: David Howells <[EMAIL PROTECTED]>

Patch applied to net-2.6.22, thanks a lot David.


Re: [RFD driver-core] Lifetime problems of the current driver model

2007-04-02 Thread Tejun Heo
Cornelia Huck wrote:
> On Mon, 2 Apr 2007 11:20:48 +0200,
> Cornelia Huck <[EMAIL PROTECTED]> wrote:
> 
>> Cool. However, there's something fishy there (not sure whether it's in
>> your patch or a latent bug in the ccw bus code that just has been
>> uncovered):
> 
> Similar bug when loading/unloading a module that creates a driver
> attribute. The winner seems to be kfree(sd->s_element) in
> release_sysfs_dirent() (in case of an attribute, it will point to the
> attribute structure, which is usually statically created)...

Thanks for finding it out.  I was suspecting that last minute change.
The code should be

if (dir node)
kfree(s_element)
else if (symlink node)
do things and kfree()
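
The ownership rule behind that pseudocode can be sketched in userspace C
as below.  This is not the sysfs code: the node types, struct layouts and
release_node() are hypothetical, and "do things" is assumed here to mean
freeing a target name held by the symlink element.

```c
#include <assert.h>
#include <stdlib.h>

enum node_type { NODE_DIR, NODE_SYMLINK, NODE_ATTR };

/* Hypothetical symlink payload that owns a heap-allocated name. */
struct symlink_elem {
	char *target_name;
};

struct dirent_node {
	enum node_type type;
	void *s_element;	/* owned only for dir and symlink nodes */
};

/*
 * Release the payload of a node.  Returns the number of heap blocks
 * freed.  Attribute nodes point at structures that are usually static,
 * so their s_element must never be passed to free().
 */
int release_node(struct dirent_node *sd)
{
	int freed = 0;

	assert(sd != NULL);
	if (sd->type == NODE_DIR) {
		free(sd->s_element);
		freed = 1;
	} else if (sd->type == NODE_SYMLINK) {
		struct symlink_elem *sl = sd->s_element;

		free(sl->target_name);
		free(sl);
		freed = 2;
	}
	/* NODE_ATTR: not ours to free */
	sd->s_element = NULL;
	return freed;
}
```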

Thanks.

-- 
tejun


[RFC] [PATCH] persistent_clock() for x86_64

2007-04-02 Thread john stultz
This patch converts the get_cmos_time() function to
read_persistent_clock(), which allows x86_64 to utilize the full generic
timekeeping suspend/resume code path.

Unfortunately I don't have any x86_64 boxes that suspend/resume to play
w/ so this is mostly untested (but uses the same generic code path as
i386).
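
For readers unfamiliar with that path, the idea is roughly as in the
userspace sketch below: the architecture only supplies
read_persistent_clock(), and generic code measures the sleep across
suspend/resume.  The fake clock variable and the *_sketch function names
are hypothetical; this is a simplification, not the generic timekeeping
code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CMOS clock, in seconds. */
unsigned long fake_cmos_seconds;

/* The one hook the architecture provides. */
unsigned long read_persistent_clock(void)
{
	return fake_cmos_seconds;
}

unsigned long system_seconds;		/* stand-in for xtime.tv_sec */
static unsigned long suspend_stamp;
static bool suspended;

/* Remember the persistent clock value when going to sleep. */
void timekeeping_suspend_sketch(void)
{
	suspend_stamp = read_persistent_clock();
	suspended = true;
}

/*
 * On wakeup, advance system time by the time slept.  If the persistent
 * clock went backwards, treat the sleep as zero so that time never runs
 * backwards.  Returns the number of seconds credited.
 */
unsigned long timekeeping_resume_sketch(void)
{
	unsigned long now = read_persistent_clock();
	unsigned long slept = 0;

	assert(suspended);	/* resume without suspend is a caller bug */
	if (now > suspend_stamp)
		slept = now - suspend_stamp;
	system_seconds += slept;
	suspended = false;
	return slept;
}
```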

Any thoughts or comments?

thanks
-john


Signed-off-by: John Stultz <[EMAIL PROTECTED]>


 time.c |   43 +--
 1 file changed, 1 insertion(+), 42 deletions(-)

diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index 75d73a9..df09c65 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -201,7 +201,7 @@ static irqreturn_t timer_interrupt(int i
return IRQ_HANDLED;
 }
 
-static unsigned long get_cmos_time(void)
+unsigned long read_persistent_clock(void)
 {
unsigned int year, mon, day, hour, min, sec;
unsigned long flags;
@@ -327,11 +327,6 @@ void __init time_init(void)
 {
if (nohpet)
hpet_address = 0;
-   xtime.tv_sec = get_cmos_time();
-   xtime.tv_nsec = 0;
-
-   set_normalized_timespec(&wall_to_monotonic,
-   -xtime.tv_sec, -xtime.tv_nsec);
 
if (hpet_arch_init())
hpet_address = 0;
@@ -364,59 +359,23 @@ void __init time_init(void)
 }
 
 
-static long clock_cmos_diff;
-static unsigned long sleep_start;
-
 /*
  * sysfs support for the timer.
  */
 
-static int timer_suspend(struct sys_device *dev, pm_message_t state)
-{
-   /*
-* Estimate time zone so that set_time can update the clock
-*/
-   long cmos_time =  get_cmos_time();
-
-   clock_cmos_diff = -cmos_time;
-   clock_cmos_diff += get_seconds();
-   sleep_start = cmos_time;
-   return 0;
-}
-
 static int timer_resume(struct sys_device *dev)
 {
-   unsigned long flags;
-   unsigned long sec;
-   unsigned long ctime = get_cmos_time();
-   long sleep_length = (ctime - sleep_start) * HZ;
-
-   if (sleep_length < 0) {
-   printk(KERN_WARNING "Time skew detected in timer resume!\n");
-   /* The time after the resume must not be earlier than the time
-* before the suspend or some nasty things will happen
-*/
-   sleep_length = 0;
-   ctime = sleep_start;
-   }
if (hpet_address)
hpet_reenable();
else
i8254_timer_resume();
 
-   sec = ctime + clock_cmos_diff;
-   write_seqlock_irqsave(&xtime_lock,flags);
-   xtime.tv_sec = sec;
-   xtime.tv_nsec = 0;
-   jiffies += sleep_length;
-   write_sequnlock_irqrestore(&xtime_lock,flags);
touch_softlockup_watchdog();
return 0;
 }
 
 static struct sysdev_class timer_sysclass = {
.resume = timer_resume,
-   .suspend = timer_suspend,
set_kset_name("timer"),
 };
 




[KJ][PATCH]ROUND_UP macro cleanup in drivers/net/ixgb

2007-04-02 Thread Milind Arun Choudhary
IXGB_ROUNDUP macro cleanup ,use ALIGN
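
As a quick illustration of why the two forms are interchangeable for
power-of-two sizes, here is a small userspace sketch.  IXGB_ROUNDUP is
copied from the lines removed by this patch; the ALIGN definition is a
simplified stand-in for the kernel macro, and round_old()/round_new() are
hypothetical helpers.

```c
#include <assert.h>

/* Old driver macro: rounds i up in place, power-of-two sizes only. */
#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1)))

/* Simplified stand-in for the kernel's ALIGN: returns the rounded value. */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Round v up to a multiple of size using the old in-place macro. */
unsigned int round_old(unsigned int v, unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);	/* power of two */
	IXGB_ROUNDUP(v, size);
	return v;
}

/* Same rounding, written as an expression with ALIGN. */
unsigned int round_new(unsigned int v, unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);	/* power of two */
	return ALIGN(v, size);
}
```

The visible difference at the call sites is that ALIGN does not assign,
so each use becomes "x = ALIGN(x, n);".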

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---

 ixgb.h |3 ---
 ixgb_ethtool.c |4 ++--
 ixgb_main.c|4 ++--
 ixgb_param.c   |4 ++--
 4 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h
index cf30a10..c8e9086 100644
--- a/drivers/net/ixgb/ixgb.h
+++ b/drivers/net/ixgb/ixgb.h
@@ -111,9 +111,6 @@ struct ixgb_adapter;
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IXGB_RX_BUFFER_WRITE   8   /* Must be power of 2 */
 
-/* only works for sizes that are powers of 2 */
-#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1)))
-
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer */
 struct ixgb_buffer {
diff --git a/drivers/net/ixgb/ixgb_ethtool.c b/drivers/net/ixgb/ixgb_ethtool.c
index d6628bd..cdefaff 100644
--- a/drivers/net/ixgb/ixgb_ethtool.c
+++ b/drivers/net/ixgb/ixgb_ethtool.c
@@ -577,11 +577,11 @@ ixgb_set_ringparam(struct net_device *netdev,
 
rxdr->count = max(ring->rx_pending,(uint32_t)MIN_RXD);
rxdr->count = min(rxdr->count,(uint32_t)MAX_RXD);
-   IXGB_ROUNDUP(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); 
+   rxdr->count = ALIGN(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); 
 
txdr->count = max(ring->tx_pending,(uint32_t)MIN_TXD);
txdr->count = min(txdr->count,(uint32_t)MAX_TXD);
-   IXGB_ROUNDUP(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); 
+   txdr->count = ALIGN(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); 
 
if(netif_running(adapter->netdev)) {
/* Try to get new resources before deleting old */
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
index afc2ec7..158c71e 100644
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -685,7 +685,7 @@ ixgb_setup_tx_resources(struct ixgb_adapter *adapter)
/* round up to nearest 4K */
 
txdr->size = txdr->count * sizeof(struct ixgb_tx_desc);
-   IXGB_ROUNDUP(txdr->size, 4096);
+   txdr->size = ALIGN(txdr->size, 4096);
 
txdr->desc = pci_alloc_consistent(pdev, txdr->size, &txdr->dma);
if(!txdr->desc) {
@@ -774,7 +774,7 @@ ixgb_setup_rx_resources(struct ixgb_adapter *adapter)
/* Round up to nearest 4K */
 
rxdr->size = rxdr->count * sizeof(struct ixgb_rx_desc);
-   IXGB_ROUNDUP(rxdr->size, 4096);
+   rxdr->size = ALIGN(rxdr->size, 4096);
 
rxdr->desc = pci_alloc_consistent(pdev, rxdr->size, &rxdr->dma);
 
diff --git a/drivers/net/ixgb/ixgb_param.c b/drivers/net/ixgb/ixgb_param.c
index b27442a..ee8cc67 100644
--- a/drivers/net/ixgb/ixgb_param.c
+++ b/drivers/net/ixgb/ixgb_param.c
@@ -284,7 +284,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
} else {
tx_ring->count = opt.def;
}
-   IXGB_ROUNDUP(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
+   tx_ring->count = ALIGN(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
}
{ /* Receive Descriptor Count */
struct ixgb_option opt = {
@@ -303,7 +303,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
} else {
rx_ring->count = opt.def;
}
-   IXGB_ROUNDUP(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
+   rx_ring->count = ALIGN(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
}
{ /* Receive Checksum Offload Enable */
struct ixgb_option opt = {


-- 
Milind Arun Choudhary


Re: [PATCH] sched: staircase deadline misc fixes

2007-04-02 Thread Con Kolivas
On Thursday 29 March 2007 15:50, Mike Galbraith wrote:
> On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> + * This contains a bitmap for each dynamic priority level with empty slots
> + * for the valid priorities each different nice level can have. It allows
> + * us to stagger the slots where differing priorities run in a way that
> + * keeps latency differences between different nice levels at a minimum.
> + * ie, where 0 means a slot for that priority, priority running from left to
> + * right:
> + * nice -20 
> + * nice -10 1001000100100010001001000100010010001000
> + * nice   0 0101010101010101010101010101010101010101
> + * nice   5 1101011010110101101011010110101101011011
> + * nice  10 0110111011011101110110111011101101110111
> + * nice  15 0101101101011011
> + * nice  19 1110

Try two instances of chew.c at _differing_ nice levels on one cpu on mainline, 
and then SD. This is why you can't renice X on mainline.

>   -Mike

-- 
-ck
/*
 * original idea by Chris Friesen.  Thanks.
 */

#include <stdio.h>
#include <sched.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

#define THRESHOLD_USEC 2000

/* Current wall-clock time in microseconds. */
unsigned long long stamp(void)
{
	struct timeval tv;

	gettimeofday(&tv, 0);
	return (unsigned long long) tv.tv_usec + ((unsigned long long) tv.tv_sec)*1000000;
}

int main(void)
{
	unsigned long long thresh_ticks = THRESHOLD_USEC;
	unsigned long long cur,last;
	struct timespec ts;

	sched_rr_get_interval(0, &ts);
	printf("pid %d, prio %3d, interval of %ld nsec\n", getpid(), getpriority(PRIO_PROCESS, 0), ts.tv_nsec);

	last = stamp();
	while(1) {
		/* Spin; a gap above the threshold means we were scheduled out. */
		cur = stamp();
		unsigned long long delta = cur-last;
		if (delta > thresh_ticks) {
			printf("pid %d, prio %3d, out for %4llu ms\n", getpid(), getpriority(PRIO_PROCESS, 0), delta/1000);
			cur = stamp();
		}
		last = cur;
	}

	return 0;
}


Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL

2007-04-02 Thread Con Kolivas
On Saturday 31 March 2007 19:28, Xenofon Antidides wrote:
> For long time now I use windows to work 
> problems. I cannot play wine games with audio, I
> cannot sample video, I cannot use skype, I cannot play
> midi. And even linux only things I try do I cannot
> share my X, I cannot use more than one vmware. All
> those is fix for me with SD.

Any semblance of cpu bandwidth and latency guarantees are easily shot on 
mainline by a single process going wild (eg open tab in firefox).

> I sorry I answer kernel 
> email and go away now for good.

respected; dropped from cc

-- 
-ck


Re: [PATCH] sched: staircase deadline misc fixes

2007-04-02 Thread Con Kolivas
On Thursday 29 March 2007 18:18, Mike Galbraith wrote:
> Rereading to make sure I wasn't unclear anywhere...
>
> On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote:
> > I don't see what a < 95% load really means.
>
> Egad.  Here I'm pondering the numbers and light load as I'm typing, and
> my fingers (seemingly independent when mind wanders off) typed < 95% as
> in not fully committed, instead of "light".

95% of cases where load is less than 4; not 95% load.

-- 
-ck


Re: ENOENT creating /dev/root on MTD RAM partition

2007-04-02 Thread John Williams

Olaf Hering wrote:
> On Mon, Apr 02, John Williams wrote:
>
> > Any comments or suggestions on a possible cause or approach to track it
> > down would be greatly appreciated.
>
> Just a guess:
> Check if '/dev' exists. I think it is now possible to not add the built-in
> cpio archive with the mandatory /dev, /dev/console and /root entries.


Thanks Olaf - you got me on the right track.  Turned out to be an arch 
link script error whereby the "rootfs" initcalls were not placed in the 
initcall table - thus the init ramfs wasn't yet mounted, and /dev wasn't 
there.


Cheers,

John


Re: hugetlb: Unable to handle kernel NULL pointer dereference

2007-04-02 Thread Nishanth Aravamudan
On 02.04.2007 [18:46:08 -0700], Nishanth Aravamudan wrote:
> Adam, David,
> 
> Just got the following Oops and recursive fault running `make func`
> (apparently the `shared` test in particular) with kernel HEAD at
> efab03d998da03f67836ffc664b04e0400f85448 on my x86_64. Will pull latest
> Linus and reboot, but haven't seen it posted yet. This occurs both with
> my branch and master from libhugetlbfs.

Bah, blame operator error, perhaps. Not able to reproduce with current
Linus tip.

Thanks,
Nish

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center


hugetlb: Unable to handle kernel NULL pointer dereference

2007-04-02 Thread Nishanth Aravamudan
Adam, David,

Just got the following Oops and recursive fault running `make func`
(apparently the `shared` test in particular) with kernel HEAD at
efab03d998da03f67836ffc664b04e0400f85448 on my x86_64. Will pull latest
Linus and reboot, but haven't seen it posted yet. This occurs both with
my branch and master from libhugetlbfs.

[351419.028499] Unable to handle kernel NULL pointer dereference at 
0018 RIP: 
[351419.047110]  [] _write_lock_irqsave+0x1d/0x90
[351419.047121] PGD 0 
[351419.047124] Oops: 0002 [1] PREEMPT SMP 
[351419.047127] CPU 0 
[351419.047129] Modules linked in: tun
[351419.047135] Pid: 29122, comm: shared Not tainted 2.6.21-rc5-gefab03d9-dirty 
#25
[351419.047138] RIP: 0010:[]  [] 
_write_lock_irqsave+0x1d/0x90
[351419.047144] RSP: 0018:81000a5d3c58  EFLAGS: 00010017
[351419.047148] RAX: 81000a5d3fd8 RBX: 0018 RCX: 
81003fe71028
[351419.047151] RDX: 0217 RSI: 81000a5d3cd0 RDI: 
0001
[351419.047155] RBP: 81000a5d3c68 R08: 0005 R09: 
fffc
[351419.047158] R10: 0001 R11: 0246 R12: 
81003fe71000
[351419.047162] R13: 810037052080 R14: 55c0 R15: 
81000a5d3cb8
[351419.047166] FS:  2aea683c6030() GS:80728000() 
knlGS:556f16c0
[351419.047169] CS:  0010 DS: 002b ES: 002b CR0: 8005003b
[351419.047172] CR2: 0018 CR3: 00101000 CR4: 
06e0
[351419.047177] Process shared (pid: 29122, threadinfo 81000a5d2000, task 
81001130c740)
[351419.047179] Stack:  81000a5d3c78 0018 81000a5d3c78 
8016a7a9
[351419.047187]  81000a5d3c98 8014c404 81003fe71000 
81000a5d3c90
[351419.047193]  81000a5d3d08 801c2864 810037052100 
8100087238c8
[351419.047198] Call Trace:
[351419.047204]  [] _write_lock_irq+0x9/0x10
[351419.047210]  [] remove_from_page_cache+0x24/0x40
[351419.047216]  [] __unmap_hugepage_range+0x174/0x1b0
[351419.047220]  [] unmap_hugepage_range+0x47/0x70
[351419.047225]  [] unmap_vmas+0x11b/0x7f0
[351419.047230]  [] exit_mmap+0x87/0x130
[351419.047234]  [] mmput+0x37/0xb0
[351419.047238]  [] exit_mm+0xe5/0xf0
[351419.047242]  [] do_exit+0x238/0x8c0
[351419.047246]  [] _spin_unlock+0x14/0x40
[351419.047250]  [] do_group_exit+0x89/0x90
[351419.047255]  [] sys_exit_group+0x12/0x20
[351419.047262]  [] cstar_do_call+0x1b/0x65
[351419.047264] 
[351419.047265] 
[351419.047266] Code: f0 81 2b 00 00 00 01 0f 94 c0 84 c0 75 4e f0 81 03 00 00 
00 
[351419.047275] RIP  [] _write_lock_irqsave+0x1d/0x90
[351419.047280]  RSP 
[351419.047282] CR2: 0018
[351419.047287] Fixing recursive fault but reboot is needed!
[351419.047290] BUG: scheduling while atomic: shared/0x0003/29122
[351419.047292] 
[351419.047293] Call Trace:
[351419.047298]  [] __sched_text_start+0x5d/0x807
[351419.047304]  [] default_wake_function+0xd/0x10
[351419.047310]  [] autoremove_wake_function+0x11/0x40
[351419.047315]  [] __wake_up_common+0x44/0x80
[351419.047319]  [] do_exit+0x131/0x8c0
[351419.047323]  [] do_page_fault+0x7f6/0x8f0
[351419.047328]  [] wake_up_bit+0x28/0x40
[351419.047332]  [] invalidate_inode_buffers+0x13/0xd0
[351419.047336]  [] _spin_lock+0x16/0x80
[351419.047339]  [] _spin_lock+0x16/0x80
[351419.047343]  [] dput+0x22/0x160
[351419.047347]  [] error_exit+0x0/0x84
[351419.047351]  [] _write_lock_irqsave+0x1d/0x90
[351419.047355]  [] _write_lock_irq+0x9/0x10
[351419.047359]  [] remove_from_page_cache+0x24/0x40
[351419.047363]  [] __unmap_hugepage_range+0x174/0x1b0
[351419.047368]  [] unmap_hugepage_range+0x47/0x70
[351419.047371]  [] unmap_vmas+0x11b/0x7f0
[351419.047376]  [] exit_mmap+0x87/0x130
[351419.047379]  [] mmput+0x37/0xb0
[351419.047383]  [] exit_mm+0xe5/0xf0
[351419.047387]  [] do_exit+0x238/0x8c0
[351419.047390]  [] _spin_unlock+0x14/0x40
[351419.047394]  [] do_group_exit+0x89/0x90
[351419.047398]  [] sys_exit_group+0x12/0x20
[351419.047402]  [] cstar_do_call+0x1b/0x65
[351419.047404] 

Thanks,
Nish

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center


Re: [linux-usb-devel] [RFC] HID bus design overview.

2007-04-02 Thread Li Yu
Nicolas Mailhot wrote:
>> Er, I also want to know what are drawbacks of "flip-flopping" ?
>> 
>
> This will cause major havoc as soon as hot-plugging and apps listening to
> HAL events (xorg eventually) enter in play.
>
>   

~_~

It does need some extra work in user space, but I do not think this
is so critical. These HAL events should not be frequent, and will most
likely happen early at system boot. In fact, the same work also exists
with the blacklist approach, but it migrates to the HID driver developer
and moves from runtime to development time. (Of course, you can do it
through sysfs, just like vmware does, I think.)

Although I do not fully agree, so many gurus have said that
"flip-flopping" is not a good idea that they are probably right, so I
will change the code later. This in fact makes the implementation easier.

Maybe we need some means to change the blacklist at runtime, and to
load/unload such drivers with a specific script.



Re: Warning: unable to open an initial console.

2007-04-02 Thread young dave

Hi,
Check if your U-boot enabled udev; try disabling udev, then use
mknod to create /dev/console.

Regards
dave


2007/4/2, Tom Strader <[EMAIL PROTECTED]>:

I checked /dev/ with U-boot and it shows the existence of /dev/console.

From U-boot prompt:

$ ls /dev
 crw---0 Mon Apr 02 17:52:27 2007 console
 crw-r--r--0 Mon Apr 02 17:52:27 2007 null
 crw-r--r--0 Mon Apr 02 17:52:27 2007 zero


Also, I added a printk in the jffs2_add_fd_to_list() routine in
fs/jffs2/nodelist.c to print out the dirent adds and it shows console
being added as follows:

...
add dirent "var", ino #14
add dirent "usr", ino #13
add dirent "tmp", ino #12
add dirent "sys", ino #11
add dirent "sbin", ino #10
add dirent "proc", ino #9
add dirent "mnt", ino #8
add dirent "linuxrc", ino #7
add dirent "lib", ino #6
add dirent "home", ino #5
add dirent "etc", ino #4
add dirent "dev", ino #3
add dirent "bin", ino #2
VFS: Mounted root (jffs2 filesystem).
Freeing init memory: 76K
add dirent "zero", ino #70
add dirent "null", ino #69
add dirent "console", ino #68
Warning: unable to open an initial console.
add dirent "watchdog", ino #262
add dirent "udevstart", ino #261
add dirent "udevsend", ino #260
...

Any other ideas?
Thanks,
Tom


-Original Message-
From: Chris Wedgwood [mailto:[EMAIL PROTECTED]
Sent: Monday, April 02, 2007 3:49 PM
To: Tom Strader
Cc: linux-kernel@vger.kernel.org
Subject: Re: Warning: unable to open an initial console.

On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote:

> I have seen quite a few posts regarding unable to open an initial
> console, but my system seems to have the necessary things in place
> so I come looking for help.

your rootfs/initramfs/initrd is missing a valid working /dev/console

> VFS: Mounted root (jffs2 filesystem).

check /dev/ on this filesystem




Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)

2007-04-02 Thread Eric W. Biederman
Adrian Bunk <[EMAIL PROTECTED]> writes:

> On Sun, Apr 01, 2007 at 05:21:06PM +0200, Tilman Schmidt wrote:
>> I'm sorry to say this has now happened with kernel 2.6.21-rc5, too.
>> I started a kernel compilation in the evening and came back in the
>> morning to find all KDE decorations gone. All processes normally
>> running for a KDE session and labelled "[kinit]" in ps were gone
>> but everything else was running fine, and the system was still
>> usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained
>> nothing remotely suspicious. /var/log/messages had two lines I
>> never saw before:
>> 
>> Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning:
> vs-8115: get_num_ver: not directory or indirect item
>> Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning:
> vs-8115: get_num_ver: not directory or indirect item
>
> Reiserfs people Cc'ed for this.
>
>> But those didn't appear on previous occurrences of the "dying KDE"
>> problem so I guess they are not related.
>> 
>> This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110
>> (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk)
>> % uname -a
>> Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 i686
> i686 i386 GNU/Linux
>> % cat /proc/cmdline
>> root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED]
> nmi_watchdog=2 lapic 5
>> Kernel configuration mostly-modular, based on standard SuSE kernel's
>> /proc/config.gz, just compiling into the kernel everything I need to
>> boot without an initrd and omitting some parts I'm not interested in.
>> (.config attached.) What else might be relevant?
>> 
>> Again, this is a Heisenbug, ie. it's not reproducible and invariably
>> happens when I'm away from the machine. (Probably Murphy at work.)
>> It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and
>> once on 2.6.21-rc5, on a machine which spends about equal amounts
>> of time running the latest stable, rc, and mm kernels. OTOH, so far
>> it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have
>> I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4
>> and -rc4-mm releases that's not conclusive as those have only been
>> running for a very short time.
>
> We also have another report of crashes under KDE:
>
> Subject: crashes in KDE
> References : http://bugzilla.kernel.org/show_bug.cgi?id=8157
> Submitter  : Oliver Pinter <[EMAIL PROTECTED]>
> Status : unknown
>
> We also have one bug kwin ran into that got fixed after -rc5:
>
> Subject: kwin dies silently
> References : http://lkml.org/lkml/2007/2/28/112
> Submitter  : Sid Boyce <[EMAIL PROTECTED]>
>  Boris Mogwitz <[EMAIL PROTECTED]>
>  Michael Wu <[EMAIL PROTECTED]>
> Caused-By  : Eric W. Biederman <[EMAIL PROTECTED]>
>  commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1
> Fixed-By   : Eric W. Biederman <[EMAIL PROTECTED]>
> Commit : 14e9d5730adfca26452b3a2838a80af6950556f5
> Status : fixed in -rc6
>
> These might or might not be related issues.

The description above sounds like the kwin bug, except for the trigger.
(i.e. The set of processes that die are all largely part of the same
 process group, and they sound like the same set of processes).
So I'm guessing it is the same issue.

The "crashes in KDE" bug may also be the same problem; there isn't enough
information to make a good guess.  So until we get a test report from
-rc6, or whatever Linus' latest tree is, I'm not inclined to dig
further.

Eric


[PATCH] vt: Expose system-wide UTF-8 default setting via sysfs

2007-04-02 Thread Antonino A. Daplas
Create a variable, default_utf8, that defines the system-wide default UTF-8
setting.  This variable can be altered via sysfs. If the variable is properly
set, this should minimize breakage of UTF-8 encoded consoles when doing a
reset or echo -e '\033c' and of newly opened/allocated consoles.

This is based on patches by Jan Engelhardt and Paul LeoNerd Evans.
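
The behaviour change can be modelled in user space as follows. This is only
a sketch: struct vc_sketch and reset_terminal_sketch are hypothetical
stand-ins for the two struct vc_data fields and the reset_terminal() lines
touched by the diff below, not kernel code.

```c
#include <assert.h>

/* Stand-in for the struct vc_data fields the patch touches. */
struct vc_sketch {
	int vc_utf;        /* 1 when the console is in UTF-8 mode */
	int vc_utf_count;  /* pending UTF-8 continuation bytes */
};

/* System-wide default; in the patch this is a module parameter. */
int default_utf8;

/* Mirrors the patched reset: fall back to the default, not to 0. */
void reset_terminal_sketch(struct vc_sketch *vc)
{
	vc->vc_utf = default_utf8;
	vc->vc_utf_count = 0;
}
```

Since vt.c is built in, the module_param() would presumably show up as
/sys/module/vt/parameters/default_utf8; that path is an assumption derived
from how built-in module parameters are normally exposed.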

Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
---
> I think you're missing the whole point of console reset.  Its purpose is 
> to force the console into a known-good state.  The fewer pieces of state 
> it leaves unset, the better.  To some degree it's less important what 
> that state actually is.

Okay, you convinced me. Hopefully this is acceptable to all parties.

Andrew,

If everybody agrees, can you drop the previous patch I sent to you, and use
this instead?

Tony

 drivers/char/vt.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/char/vt.c b/drivers/char/vt.c
index 1bbb45b..8aca96f 100644
--- a/drivers/char/vt.c
+++ b/drivers/char/vt.c
@@ -157,6 +157,8 @@ static void blank_screen_t(unsigned long
 static void set_palette(struct vc_data *vc);
 
 static int printable;  /* Is console ready for printing? */
+static int default_utf8;
+module_param(default_utf8, int, S_IRUGO | S_IWUSR);
 
 /*
  * ignore_poke: don't unblank the screen when things are typed.  This is
@@ -1497,7 +1499,7 @@ static void reset_terminal(struct vc_dat
vc->vc_charset  = 0;
vc->vc_need_wrap= 0;
vc->vc_report_mouse = 0;
-   vc->vc_utf  = 0;
+   vc->vc_utf  = default_utf8;
vc->vc_utf_count= 0;
 
vc->vc_disp_ctrl= 0;


Re: [linux-usb-devel] [RFC] HID bus design overview.

2007-04-02 Thread Li Yu
Marcel Holtmann wrote:
> The cleanest solution without a layer violation is that you can
> register a driver for a specific VID/PID and then report id (one or
> more). All
> reports with ids that we don't have a special driver for are handled by
> the default HID->input driver or handed over to hidraw if not parseable.
> The reports for ids with a special driver are handed over to the driver.
>
> And for hidraw it would be nice if we can apply filters for specific
> report ids to keep the round-trips and overhead at a minimum.
>
>   
If we don't use the "flip-flopping" approach, the concepts of a common
driver and a specific driver are not needed either. They are exactly the
same kind of driver for the HID bus, just one with some hooks and the other
without. The common event processing is an API provided by the HID core,
so there are no round-trips here.

What is the position of hidraw? Is it only used when no other driver is
usable for some report, or should it be attached to every working device?

PS: In the last, broken "flip-flopping" solution, the USBHID code also
needs some changes ;)



Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL

2007-04-02 Thread Con Kolivas
On Thursday 29 March 2007 21:22, Ingo Molnar wrote:
> [ A quick guess: could SD's substandard interactivity in this test be
>   due to the SMP migration logic inconsistencies Mike noticed? This is
>   an SMP system and the hackbench workload is very scheduling intense
>   and tasks are frequently queued from one CPU to another. ]

I assume you put it in an endless loop, since hackbench 10 runs for 0.5 seconds 
on my machine. Doubtful it's an SMP issue. update_if_moved should maintain 
cross cpu scheduling decisions. The same slowdown would happen on UP and is 
almost certainly due to the fact that hackbench 10 induces a load of _160_ on 
the machine.

-- 
-ck


[PATCH] sched: staircase deadline improvements

2007-04-02 Thread Con Kolivas
Staircase Deadline improvements.

Nice is better distributed for waking tasks with a per-static-prio prio_level.

SCHED_RR tasks were not being requeued on expiration.

Tighten up accounting.

Fix comment style.

Microoptimisation courtesy of Dmitry Adamushko <[EMAIL PROTECTED]>
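
To illustrate the per-static-prio prio_level change, here is a user-space
sketch of how the patched next_entitled_slot() picks the priority at which
the slot search starts. It is a simplification under stated assumptions:
struct rq_sketch and search_start are hypothetical names, MAX_RT_PRIO = 100
and PRIO_RANGE = 40 are assumed values, and the SCHED_BATCH handling and the
bitmap search itself are left out.

```c
#include <assert.h>

#define MAX_RT_PRIO   100               /* assumed value */
#define PRIO_RANGE    40                /* assumed value */
#define USER_PRIO(p)  ((p) - MAX_RT_PRIO)

struct rq_sketch {
	int prio_level[PRIO_RANGE];  /* current dynamic level per static prio */
	int best_static_prio;        /* best static prio queued this rotation */
};

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Follows the three branches of the patched function. */
int search_start(struct rq_sketch *rq, int static_prio)
{
	int uprio = USER_PRIO(static_prio);

	if (static_prio < rq->best_static_prio) {
		/* A better static prio arrives: search from the top. */
		rq->best_static_prio = static_prio;
		return MAX_RT_PRIO;
	}
	if (static_prio == rq->best_static_prio)
		return rq->prio_level[uprio];
	/* Worse static prio: never start above the best task's level. */
	return max_int(rq->prio_level[uprio],
		       rq->prio_level[USER_PRIO(rq->best_static_prio)]);
}
```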

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   97 +++--
 1 file changed, 60 insertions(+), 37 deletions(-)

Index: linux-2.6.21-rc5-mm3/kernel/sched.c
===
--- linux-2.6.21-rc5-mm3.orig/kernel/sched.c2007-04-02 10:37:07.0 
+1000
+++ linux-2.6.21-rc5-mm3/kernel/sched.c 2007-04-03 10:40:48.0 +1000
@@ -132,20 +132,20 @@ struct rq;
  * These are the runqueue data structures:
  */
 struct prio_array {
-   struct list_head queue[MAX_PRIO];
/* Tasks queued at each priority */
+   struct list_head queue[MAX_PRIO];
 
-   DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1);
/*
 * The bitmap of priorities queued for this array. While the expired
 * array will never have realtime tasks on it, it is simpler to have
 * equal sized bitmaps for a cheap array swap. Include 1 bit for
 * delimiter.
 */
+   DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1);
 
 #ifdef CONFIG_SMP
-   struct rq *rq;
/* For convenience looks back at rq */
+   struct rq *rq;
 #endif
 };
 
@@ -212,14 +212,14 @@ struct rq {
struct prio_array *active, *expired, arrays[2];
unsigned long *dyn_bitmap, *exp_bitmap;
 
-   int prio_level, best_static_prio;
/*
-* The current dynamic priority level this runqueue is at, and the
-* best static priority queued this major rotation.
+* The current dynamic priority level this runqueue is at per static
+* priority level, and the best static priority queued this rotation.
 */
+   int prio_level[PRIO_RANGE], best_static_prio;
 
-   unsigned long prio_rotation;
/* How many times we have rotated the priority queue */
+   unsigned long prio_rotation;
 
atomic_t nr_iowait;
 
@@ -707,19 +707,29 @@ static inline int first_prio_slot(struct
 static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
 {
DECLARE_BITMAP(tmp, PRIO_RANGE);
-   int search_prio;
+   int search_prio, uprio = USER_PRIO(p->static_prio);
 
-   if (p->static_prio < rq->best_static_prio)
+   /*
+* Only priorities equal to the prio_level and above for their
+* static_prio are acceptable, and only if it's not better than
+* a queued better static_prio's prio_level.
+*/
+   if (p->static_prio < rq->best_static_prio) {
search_prio = MAX_RT_PRIO;
-   else
-   search_prio = rq->prio_level;
+   if (likely(p->policy != SCHED_BATCH))
+   rq->best_static_prio = p->static_prio;
+   } else if (p->static_prio == rq->best_static_prio)
+   search_prio = rq->prio_level[uprio];
+   else {
+   search_prio = max(rq->prio_level[uprio],
+   rq->prio_level[USER_PRIO(rq->best_static_prio)]);
+   }
if (unlikely(p->policy == SCHED_BATCH)) {
search_prio = max(search_prio, p->static_prio);
return SCHED_PRIO(find_next_zero_bit(p->bitmap, PRIO_RANGE,
  USER_PRIO(search_prio)));
}
-   bitmap_or(tmp, p->bitmap, prio_matrix[USER_PRIO(p->static_prio)],
- PRIO_RANGE);
+   bitmap_or(tmp, p->bitmap, prio_matrix[uprio], PRIO_RANGE);
return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
USER_PRIO(search_prio)));
 }
@@ -745,14 +755,18 @@ static void queue_expired(struct task_st
 
if (src_rq == rq)
return;
-   if (p->rotation == src_rq->prio_rotation)
+   /*
+* Only need to set p->array when p->rotation == rq->prio_rotation as
+* they will be set in recalc_task_prio when != rq->prio_rotation.
+*/
+   if (p->rotation == src_rq->prio_rotation) {
p->rotation = rq->prio_rotation;
-   else
+   if (p->array == src_rq->expired)
+   p->array = rq->expired;
+   else
+   p->array = rq->active;
+   } else
p->rotation = 0;
-   if (p->array == src_rq->expired)
-   p->array = rq->expired;
-   else
-   p->array = rq->active;
 }
 #else
 static inline void update_if_moved(struct task_struct *p, struct rq *rq)
@@ -1671,16 +1685,16 @@ void fastcall sched_fork(struct task_str
 * total amount of pending timeslices in the system doesn't change,
 * resulting in more scheduling fairness.
 */
-   if (unlikely(p->time_slice < 2))
-   p->time_slice = 2;
-   p->time_slice 

Re: [PATCH] vt: Do not clear UTF when resetting console

2007-04-02 Thread H. Peter Anvin

Jan Engelhardt wrote:
> On Apr 3 2007 08:16, Antonino A. Daplas wrote:
>> That would be the cleanest and purest behavior. But it's possible to set
>> one console to UTF-8 and another to legacy mode.
>
> The question would be: why would you want to have mixed consoles?
> Switching to UTF8 IMO does not take away any characters, and I mean
> no-framebuffer 80x25 that is limited to 256 glyphs.

512, not 256.  However, the reason would be because you have an 
application (which might actually be running on another system 
entirely!) which expects the other behaviour.

Antonio wrote:
> That would be the cleanest and purest behavior. But it's possible to set
> one console to UTF-8 and another to legacy mode. So one can corrupt the
> user's console just by issuing a reset or echo -e '\033c'. (Although one
> can argue that users who know what UTF-8 is also knows how to set the
> encoding back)
>
> Until userspace is more capable of setting back the terminal to its
> previous configuration, I would tend to agree with Jan, that we should
> leave the current utf setting of that particular vc alone.


I think you're missing the whole point of console reset.  Its purpose is 
to force the console into a known-good state.  The fewer pieces of state 
it leaves unset, the better.  To some degree it's less important what 
that state actually is.


-hpa


Re: [PATCH] vt: Do not clear UTF when resetting console

2007-04-02 Thread Antonino A. Daplas
On Tue, 2007-04-03 at 02:23 +0200, Jan Engelhardt wrote:
> On Apr 3 2007 08:16, Antonino A. Daplas wrote:
> >
> >That would be the cleanest and purest behavior. But it's possible to set
> >one console to UTF-8 and another to legacy mode.
> 
> The question would be: why would you want to have mixed consoles?
> Switching to UTF8 IMO does not take away any characters, and I mean
> no-framebuffer 80x25 that is limited to 256 glyphs.

As long as we provide the users the capability to support mixed
encodings, the why is not important, it will happen. If we want to be
more restrictive, I guess, we can remove support for

echo -e '\033%G' and '\033%@'

Tony
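
For readers unfamiliar with the sequences discussed in this thread: ESC % G
switches a Linux console to UTF-8, ESC % @ switches it back to the default
character set, and ESC c is the full terminal reset. A small sketch of
sending them from C instead of echo -e; the macro names and the send_seq
helper are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ESC_SELECT_UTF8    "\033%G"  /* ESC % G: switch to UTF-8 */
#define ESC_SELECT_DEFAULT "\033%@"  /* ESC % @: back to the default charset */
#define ESC_RESET          "\033c"   /* ESC c: full terminal reset */

/* Write one control sequence to 'out'; returns the bytes written. */
size_t send_seq(FILE *out, const char *seq)
{
	size_t n = fwrite(seq, 1, strlen(seq), out);

	fflush(out);
	return n;
}
```

Calling send_seq(stdout, ESC_SELECT_UTF8) on a virtual console has the same
effect as echo -e '\033%G'.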




Re: Powerpc build unhappy in 2.6.20.4?

2007-04-02 Thread Tony Breeds
On Mon, Apr 02, 2007 at 03:14:14PM -0400, Rob Landley wrote:
 
> Sure, quite easily the source of the trouble.  Attached in both full .config 
> and mini.config formats.

Okay, I have no idea how it happened but you seem to have an invalid
config.  It looks to me like you need to select a platform.

One of the following:
CONFIG_PPC_PSERIES
CONFIG_PPC_MAPLE
CONFIG_PPC_IBM_CELL_BLADE
CONFIG_PPC_PS3
CONFIG_PPC_CHRP
CONFIG_PPC_EFIKA
CONFIG_PPC_PMAC

When did this config last build a zImage?  I'm guessing either CHRP or
PMAC?

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!



Re: [PATCH] (re)register_binfmt returns with -EBUSY

2007-04-02 Thread Andrew Morton
On Mon, 2 Apr 2007 18:24:15 +0530
"kalash nainwal" <[EMAIL PROTECTED]> wrote:

> When a binary format is unregistered and re-registered,
> register_binfmt fails with -EBUSY. The reason is that
> unregister_binfmt does not set fmt->next to NULL, and seeing
> (fmt->next != NULL), register_binfmt fails with -EBUSY.
> 
> One can find his way around by explicitly setting fmt->next to NULL
> after unregistering, but that is kind of unclean (one should better be
> using only the interfaces, and not the internal members, isn't it?)
> 
> Attached one-liner can fix it (for 2.6.20).

Yes, that'll fix it.

But I wonder why register_binfmt() even checks that the to-be-registered
linux_binfmt has a non-null fmt->next?  Presumably that's there to catch
erroneous re-registration of an already-registered format.

All very odd.  It looks like that code should be converted to list_heads
anyway...
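
The failure mode is easy to reproduce with a user-space model of the list
handling. This is a simplified sketch, not the fs/exec.c code: locking is
omitted, struct fmt_sketch stands in for struct linux_binfmt, and the
clear_next argument is a hypothetical switch that toggles the proposed
one-line fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fmt_sketch {
	struct fmt_sketch *next;
};

static struct fmt_sketch *formats;  /* head of the registered list */

int register_fmt(struct fmt_sketch *fmt)
{
	struct fmt_sketch **tmp = &formats;

	if (!fmt)
		return -EINVAL;
	if (fmt->next)          /* the check that trips on re-registration */
		return -EBUSY;
	while (*tmp) {
		if (*tmp == fmt)
			return -EBUSY;
		tmp = &(*tmp)->next;
	}
	fmt->next = formats;
	formats = fmt;
	return 0;
}

int unregister_fmt(struct fmt_sketch *fmt, int clear_next)
{
	struct fmt_sketch **tmp = &formats;

	while (*tmp) {
		if (*tmp == fmt) {
			*tmp = fmt->next;
			if (clear_next)
				fmt->next = NULL;  /* the proposed fix */
			return 0;
		}
		tmp = &(*tmp)->next;
	}
	return -EINVAL;
}
```

Note that in this model the stale pointer only bites when the format was not
the last entry in the list; a tail entry already has next == NULL.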


[PATCH 2/3] x86_64: Implement SPARSE_VIRTUAL

2007-04-02 Thread Christoph Lameter
x86_64 implement SPARSE_VIRTUAL

x86_64 is using 2M page table entries to map its 1-1 kernel space.
We implement the virtual memmap also using 2M page table entries.
So there is no difference at all to FLATMEM. Both schemes require
a page table and a TLB for each 2MB. FLATMEM still references memory
since the mem_map pointer is itself a variable. SPARSE_VIRTUAL uses a
constant for vmemmap, thus no memory reference. SPARSE_VIRTUAL should
be superior to even FLATMEM.

With this SPARSEMEM becomes the most efficient way of handling
virt_to_page, pfn_to_page and friends for UP, SMP and NUMA on
x86_64.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5-mm3/include/asm-x86_64/page.h
===
--- linux-2.6.21-rc5-mm3.orig/include/asm-x86_64/page.h 2007-04-02 
12:25:03.0 -0700
+++ linux-2.6.21-rc5-mm3/include/asm-x86_64/page.h  2007-04-02 
12:27:16.0 -0700
@@ -127,6 +127,7 @@
 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
 #define __HAVE_ARCH_GATE_AREA 1
+#define vmemmap ((struct page *)0xe200UL)
 
 #include 
 #include 
Index: linux-2.6.21-rc5-mm3/Documentation/x86_64/mm.txt
===
--- linux-2.6.21-rc5-mm3.orig/Documentation/x86_64/mm.txt   2007-04-02 
12:25:03.0 -0700
+++ linux-2.6.21-rc5-mm3/Documentation/x86_64/mm.txt2007-04-02 
12:27:16.0 -0700
@@ -9,6 +9,7 @@
 8100 - c0ff (=46 bits) direct mapping of all phys. 
memory
 c100 - c1ff (=40 bits) hole
 c200 - e1ff (=45 bits) vmalloc/ioremap space
+e200 - e2ff (=40 bits) virtual memory map
 ... unused hole ...
 8000 - 8280 (=40 MB)   kernel text mapping, from phys 0
 ... unused hole ...
Index: linux-2.6.21-rc5-mm3/arch/x86_64/Kconfig
===
--- linux-2.6.21-rc5-mm3.orig/arch/x86_64/Kconfig   2007-04-02 
12:27:13.0 -0700
+++ linux-2.6.21-rc5-mm3/arch/x86_64/Kconfig2007-04-02 12:28:13.0 
-0700
@@ -392,6 +392,12 @@
def_bool y
depends on (NUMA || EXPERIMENTAL)
 
+config SPARSE_VIRTUAL
+   def_bool y
+
+config ARCH_SUPPORTS_PMD_MAPPING
+   def_bool y
+
 config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG


[PATCH 3/3] IA64: Implement SPARSE_VIRTUAL

2007-04-02 Thread Christoph Lameter
[IA64] Sparse virtual implementation

Equip IA64 sparsemem with a virtual memmap. This is similar to the existing
CONFIG_VMEMMAP functionality for discontig. It uses a page size mapping.

This is provided as a minimally intrusive solution. We split the
128TB VMALLOC area into two 64TB areas and use one for the virtual memmap.

I have another patch in testing that uses granule sized 16MB pages to map
the memmap but this would require changes to the interrupt vector table
and there are certain discussions that would need to take place
before we can accept such a large page size for the memmap. That version
is better because it improves IA64 performance by reducing TLB pressure.
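
The 128TB/64TB figures follow from the VMALLOC_END expressions in the diff
below. A worked check, assuming 16KB pages (PAGE_SHIFT = 14), which is the
value that makes the old area 128TB; area_bytes and to_tb are hypothetical
helper names:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes of the area above RGN_BASE(RGN_GATE), computed as
 * 1UL << (4*PAGE_SHIFT - sub); 'sub' is 9 before and 10 after the patch.
 */
uint64_t area_bytes(unsigned int page_shift, unsigned int sub)
{
	return UINT64_C(1) << (4 * page_shift - sub);
}

/* Convert bytes to whole terabytes (2^40 bytes). */
uint64_t to_tb(uint64_t bytes)
{
	return bytes >> 40;
}
```

With PAGE_SHIFT = 14 the old limit is 1UL << 47 (128TB) and the new one is
1UL << 46 (64TB); vmemmap then starts exactly at the new VMALLOC_END.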

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5-mm2/arch/ia64/Kconfig
===
--- linux-2.6.21-rc5-mm2.orig/arch/ia64/Kconfig 2007-04-02 16:15:29.0 
-0700
+++ linux-2.6.21-rc5-mm2/arch/ia64/Kconfig  2007-04-02 16:15:50.0 
-0700
@@ -350,6 +350,10 @@ config ARCH_SPARSEMEM_ENABLE
def_bool y
depends on ARCH_DISCONTIGMEM_ENABLE
 
+config SPARSE_VIRTUAL
+   def_bool y
+   depends on ARCH_SPARSEMEM_ENABLE
+
 config ARCH_DISCONTIGMEM_DEFAULT
def_bool y if (IA64_SGI_SN2 || IA64_GENERIC || IA64_HP_ZX1 || 
IA64_HP_ZX1_SWIOTLB)
depends on ARCH_DISCONTIGMEM_ENABLE
Index: linux-2.6.21-rc5-mm2/include/asm-ia64/page.h
===
--- linux-2.6.21-rc5-mm2.orig/include/asm-ia64/page.h   2007-04-02 
16:15:29.0 -0700
+++ linux-2.6.21-rc5-mm2/include/asm-ia64/page.h2007-04-02 
16:15:50.0 -0700
@@ -106,6 +106,9 @@ extern int ia64_pfn_valid (unsigned long
 # define ia64_pfn_valid(pfn) 1
 #endif
 
+#define vmemmap ((struct page *)(RGN_BASE(RGN_GATE) + \
+   (1UL << (4*PAGE_SHIFT - 10))))
+
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 extern struct page *vmem_map;
 #ifdef CONFIG_DISCONTIGMEM
Index: linux-2.6.21-rc5-mm2/include/asm-ia64/pgtable.h
===
--- linux-2.6.21-rc5-mm2.orig/include/asm-ia64/pgtable.h2007-04-02 
16:15:29.0 -0700
+++ linux-2.6.21-rc5-mm2/include/asm-ia64/pgtable.h 2007-04-02 
16:15:50.0 -0700
@@ -236,8 +236,13 @@ ia64_phys_addr_valid (unsigned long addr
 # define VMALLOC_END   vmalloc_end
   extern unsigned long vmalloc_end;
 #else
+#if defined(CONFIG_SPARSEMEM) && defined(CONFIG_SPARSE_VIRTUAL)
+/* SPARSE_VIRTUAL uses half of vmalloc... */
+# define VMALLOC_END   (RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 
10)))
+#else
 # define VMALLOC_END   (RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 
9)))
 #endif
+#endif
 
 /* fs/proc/kcore.c */
 #definekc_vaddr_to_offset(v) ((v) - RGN_BASE(RGN_GATE))


[PATCH 1/1] Generic Virtual Memmap suport for SPARSEMEM V2

2007-04-02 Thread Christoph Lameter
Sparse Virtual: Virtual Memmap support for SPARSEMEM V2

V1->V2
 - Support for PAGE_SIZE vmemmap which allows the general use of
   virtual memmap on any MMU capable platform (enabled IA64
   support).
 - Fix various issues as suggested by Dave Hansen.
 - Add comments and error handling.

SPARSEMEM is a pretty nice framework that unifies quite a bit of
code over all the arches. It would be great if it could be the default
so that we can get rid of various forms of DISCONTIG and other variations
on memory maps. So far what has hindered this are the additional lookups
that SPARSEMEM introduces for virt_to_page and page_address. This goes
so far that the code to do this has to be kept in a separate function
and cannot be used inline.

This patch introduces virtual memmap support for sparsemem. virt_to_page,
page_address and friends become simple shift/add operations. No page flag
fields, no table lookups, nothing involving memory is required.

The two key operations pfn_to_page and page_to_pfn become:

#define pfn_to_page(pfn) (vmemmap + (pfn))
#define page_to_pfn(page)((page) - vmemmap)

In order for this to work we will have to use a virtual mapping.
These are usually for free since kernel memory is already mapped
via a 1-1 mapping requiring a page table. The virtual mapping must
be big enough to span all of memory that an arch can support which
may make a virtual memmap difficult to use on 32 bit platforms
that support 36 address bits.

However, if there is enough virtual space available and the arch
already maps its 1-1 kernel space using TLBs (f.e. true of IA64
and x86_64) then this technique makes sparsemem lookups even more
efficient than CONFIG_FLATMEM. FLATMEM still needs to read the
contents of mem_map. mem_map is constant for a virtual memory map.

Maybe this patch will allow us to make SPARSEMEM the default
configuration that will work on UP, SMP and NUMA on most platforms?
Then we may hopefully be able to remove the various forms of support
for FLATMEM, DISCONTIG etc etc.
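
The two macros can be tried out in user space. In the sketch below a static
array stands in for the virtually mapped memmap; struct page_sketch,
memmap_storage and NR_PAGES_SKETCH are hypothetical names, and in the kernel
vmemmap would be a fixed virtual address rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for struct page. */
struct page_sketch {
	unsigned long flags;
};

#define NR_PAGES_SKETCH 1024

/* Backing store for the sketch; its address is a link-time constant. */
static struct page_sketch memmap_storage[NR_PAGES_SKETCH];
#define vmemmap (memmap_storage)

/* Same shape as the macros above: pure pointer arithmetic. */
#define pfn_to_page(pfn)   (vmemmap + (pfn))
#define page_to_pfn(page)  ((unsigned long)((page) - vmemmap))
```

Neither macro loads anything from memory: both reduce to an add or a
subtract (scaled by the struct size) against a constant base.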

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5-mm2/include/asm-generic/memory_model.h
===
--- linux-2.6.21-rc5-mm2.orig/include/asm-generic/memory_model.h
2007-04-02 15:13:20.0 -0700
+++ linux-2.6.21-rc5-mm2/include/asm-generic/memory_model.h 2007-04-02 
17:15:45.0 -0700
@@ -46,6 +46,14 @@
 __pgdat->node_start_pfn;   \
 })
 
+#elif defined(CONFIG_SPARSE_VIRTUAL)
+
+/*
+ * We have a virtual memmap that makes lookups very simple
+ */
+#define __pfn_to_page(pfn) (vmemmap + (pfn))
+#define __page_to_pfn(page)((page) - vmemmap)
+
 #elif defined(CONFIG_SPARSEMEM)
 /*
  * Note: section's mem_map is encorded to reflect its start_pfn.
Index: linux-2.6.21-rc5-mm2/mm/sparse.c
===
--- linux-2.6.21-rc5-mm2.orig/mm/sparse.c   2007-04-02 15:58:23.0 
-0700
+++ linux-2.6.21-rc5-mm2/mm/sparse.c2007-04-02 17:19:13.0 -0700
@@ -9,6 +9,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /*
  * Permanent SPARSEMEM data:
@@ -101,7 +103,7 @@ static inline int sparse_index_init(unsi
 
 /*
  * Although written for the SPARSEMEM_EXTREME case, this happens
- * to also work for the flat array case becase
+ * to also work for the flat array case because
  * NR_SECTION_ROOTS==NR_MEM_SECTIONS.
  */
 int __section_nr(struct mem_section* ms)
@@ -211,6 +213,214 @@ static int sparse_init_one_section(struc
return 1;
 }
 
+#ifdef CONFIG_SPARSE_VIRTUAL
+/*
+ * Virtual Memory Map support
+ *
+ * (C) 2007 sgi. Christoph Lameter <[EMAIL PROTECTED]>.
+ *
+ * Virtual memory maps allow VM primitives pfn_to_page, page_to_pfn,
+ * virt_to_page, page_address() etc that involve no memory accesses at all.
+ *
+ * However, virtual mappings need a page table and TLBs. Many Linux
+ * architectures already map their physical space using 1-1 mappings
+ * via TLBs. For those arches the virtual memmory map is essentially
+ * for free if we use the same page size as the 1-1 mappings. In that
+ * case the overhead consists of a few additional pages that are
+ * allocated to create a view of memory for vmemmap.
+ *
+ * Special Kconfig settings:
+ *
+ * CONFIG_ARCH_POPULATES_VIRTUAL_MEMMAP
+ *
+ * The architecture has its own functions to populate the memory
+ * map and provides a vmemmap_populate function.
+ *
+ * CONFIG_ARCH_SUPPORTS_PMD_MAPPING
+ *
+ * If not set then PAGE_SIZE mappings are generated which
+ * require one PTE/TLB per PAGE_SIZE chunk of the virtual memory map.
+ *
+ * If set then PMD_SIZE mappings are generated which are much
+ * lighter on the TLB. On some platforms these generate
+ * the same overhead as the 1-1 mappings.
+ */
+
+/*
+ * Allocate a block of memory to be used for the virtual memory map
+ * or the page tables that are used to create the 

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Siddha, Suresh B
On Mon, Apr 02, 2007 at 05:23:20PM -0700, Christoph Lameter wrote:
> On Mon, 2 Apr 2007, Siddha, Suresh B wrote:
> 
> > Set the node_possible_map at runtime. On a non NUMA system,
> > num_possible_nodes() will now say '1'
> 
> How does this relate to nr_node_ids?

With this patch, nr_node_ids on non NUMA will also be '1' and
as before nr_node_ids is same as num_possible_nodes()

thanks,
suresh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] vt: Do not clear UTF when resetting console

2007-04-02 Thread Jan Engelhardt

On Apr 3 2007 08:16, Antonino A. Daplas wrote:
>
>That would be the cleanest and purest behavior. But it's possible to set
>one console to UTF-8 and another to legacy mode.

The question would be: why would you want to have mixed consoles?
Switching to UTF8 IMO does not take away any characters, and I mean
no-framebuffer 80x25 that is limited to 256 glyphs.


Jan
-- 


Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Christoph Lameter
On Mon, 2 Apr 2007, Siddha, Suresh B wrote:

> Set the node_possible_map at runtime. On a non NUMA system,
> num_possible_nodes() will now say '1'

How does this relate to nr_node_ids?



Re: [PATCH] vt: Do not clear UTF when resetting console

2007-04-02 Thread Antonino A. Daplas
On Mon, 2007-04-02 at 10:35 -0700, H. Peter Anvin wrote:
> Antonino A. Daplas wrote:
> > Resetting the console, either by ANSI escape sequences or by the reset 
> > utility,
> > will drop the console back to legacy (non-UTF-8) mode. Fix this by leaving 
> > the
> > field vc_data.vc_utf untouched in reset_terminal(). In addition, a global
> > variable (default_utf8) which defines system-wide UTF-8 setting is created.
> > This variable can be adjusted via sysfs.
> 
> If you're going to introduce a system-wide default, instead of issuing 
> the appropriate escape code, then I would argue it should still be 
> forced (to the default) when issuing a console reset.
> 

That would be the cleanest and purest behavior. But it's possible to set
one console to UTF-8 and another to legacy mode. So one can corrupt the
user's console just by issuing a reset or echo -e '\033c'. (Although one
can argue that users who know what UTF-8 is also know how to set the
encoding back.)

Until userspace is more capable of setting back the terminal to its
previous configuration, I would tend to agree with Jan, that we should
leave the current utf setting of that particular vc alone.
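
To make the sequences under discussion concrete, here is a minimal sketch of
"reset, then put the encoding back" as userspace would have to do it today.
The escape codes are the ones documented for the Linux console (ESC c for
reset, ESC % G for UTF-8, ESC % @ for the default charset); the helper name
vt_build_reset() and its interface are hypothetical, made up for this
illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VT_RESET     "\033c"   /* RIS: full terminal reset */
#define VT_UTF8_ON   "\033%G"  /* select UTF-8 */
#define VT_UTF8_OFF  "\033%@"  /* select the default (legacy) charset */

/*
 * Hypothetical helper: build "reset, then restore the previous encoding"
 * into buf. Returns the sequence length, or -1 if buf is too small.
 */
int vt_build_reset(char *buf, size_t len, int was_utf8)
{
	int n = snprintf(buf, len, "%s%s", VT_RESET,
			 was_utf8 ? VT_UTF8_ON : VT_UTF8_OFF);

	if (n < 0 || (size_t)n >= len)
		return -1;
	assert(buf[0] == '\033');	/* every sequence starts with ESC */
	return n;
}
```

The caller has to know was_utf8 beforehand, which is exactly the state a plain
reset throws away.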

Tony




Re: [PATCH] vt: Do not clear UTF when resetting console

2007-04-02 Thread Antonino A. Daplas
On Mon, 2007-04-02 at 21:10 +0200, Jan Engelhardt wrote:
> On Apr 2 2007 22:13, Antonino A. Daplas wrote:
> >Resetting the console, either by ANSI escape sequences or by the reset 
> >utility,
> >will drop the console back to legacy (non-UTF-8) mode. Fix this by leaving 
> >the
> >field vc_data.vc_utf untouched in reset_terminal(). In addition, a global
> >variable (default_utf8) which defines system-wide UTF-8 setting is created.
> >This variable can be adjusted via sysfs.
> >
> >This is based from patches by Jan Engelhardt and Paul LeoNerd Evans.
> >
> >Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
> 
> Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>
> 
> 
> >---
> >
> > drivers/char/vt.c |4 +++-
> > 1 files changed, 3 insertions(+), 1 deletions(-)
> 
> 
> BTW. Is it feasible to make utf8 the default (static int default_utf8 = 1)
> or is that likely to break some installs?

I guess it would, but most if not all major distribs are moving/have
moved to utf-8.

Tony




RE: Poor UDP performance using 2.6.21-rc5-rt5

2007-04-02 Thread David Sperry


> -Original Message-
> From: Ingo Molnar [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 02, 2007 3:05 PM
> To: [EMAIL PROTECTED]
> Cc: Dave Sperry; linux-rt-users@vger.kernel.org; linux-
> [EMAIL PROTECTED]
> Subject: Re: Poor UDP performance using 2.6.21-rc5-rt5
> 
> 
> * [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
> > The Intel NIC seems to behave better under RT
> 
> yeah.
> 
> > I think there is some kind of bad behavior happening in the Nvidia
> > driver with respect to softirq-net-tx and IRQ-8406.
> 
> yes. Part of the problem is that the forcedeth.c driver does not fully
> support NAPI - today i've implemented those bits (see them below), based
> on your testcase. The other part is that the Intel NIC uses MSI, while
> forcedeth uses fasteoi, correct? [you can see this in /proc/interrupts]

In my case forcedeth seems to be picking up MSI for eth2 & eth3

]$ cat /proc/interrupts
           CPU0       CPU1
   0:       110          0   IO-APIC-edge      timer
   1:         0         10   IO-APIC-edge      i8042
   8:         0          0   IO-APIC-edge      rtc
   9:         0          0   IO-APIC-fasteoi   acpi
  12:         0        124   IO-APIC-edge      i8042
  20:         0          0   IO-APIC-fasteoi   libata
  21:         4       7755   IO-APIC-fasteoi   libata
  22:         0       5570   IO-APIC-fasteoi   ehci_hcd:usb2
  23:         0          1   IO-APIC-fasteoi   ohci_hcd:usb1, libata
8406:         7      15969   PCI-MSI-edge      eth3
8407:         8      17249   PCI-MSI-edge      eth2
8408:         0        131   PCI-MSI-edge      eth1
8409:         0         85   PCI-MSI-edge      eth0
 NMI:         0          0
 LOC:    201594     202389
 ERR:         0

Could this be part of my problem?

The lspci for the device is:

00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
Subsystem: Super Micro Computer Inc Unknown device 1611
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
SERR- TAbort-
SERR-  
> there are a few other things i'm working on to improve this. I've
> uploaded -rt9 which is the current state of affairs. Note that using
> -rt9 you'll likely only see IRQ-8406 overhead in the system, because
> i've added an optimization to process the softirq-net-tx workload in
> the hardirq thread if the priority of the two is the same (which is the
> default behavior). But -rt9 is still work in progress that is not fully
> finished yet: in some cases i'm seeing 'fluctuating performance'
> problems on forcedeth that weren't there before.

I tried -rt9 and saw some odd 'fluctuating performance'. I'll try it again
tomorrow when I am much closer to the box's power button.

Thanks again,
Dave



> 
>   Ingo
> 
> ->
> From: Ingo Molnar <[EMAIL PROTECTED]>
> Subject: [patch] forcedeth.c: improve NAPI handler
> 
> another forcedeth.c thing: i noticed that its NAPI handler does not do
> tx-ring processing. The patch below implements this - tested on
> DESC_VER_2 hardware, with CONFIG_FORCEDETH_NAPI=y.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> 
> Index: linux/drivers/net/forcedeth.c
> ===
> --- linux.orig/drivers/net/forcedeth.c
> +++ linux/drivers/net/forcedeth.c
> @@ -3118,9 +3118,17 @@ static int nv_napi_poll(struct net_devic
>   int retcode;
> 
>   if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
> + spin_lock_irqsave(&np->lock, flags);
> + nv_tx_done(dev);
> + spin_unlock_irqrestore(&np->lock, flags);
> +
>   pkts = nv_rx_process(dev, limit);
>   retcode = nv_alloc_rx(dev);
>   } else {
> + spin_lock_irqsave(&np->lock, flags);
> + nv_tx_done_optimized(dev, np->tx_ring_size);
> + spin_unlock_irqrestore(&np->lock, flags);
> +
>   pkts = nv_rx_process_optimized(dev, limit);
>   retcode = nv_alloc_rx_optimized(dev);
>   }



Re: [PATCH 0/2] Use a single loader for i386 and x86_64

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 16:43 -0300, Glauber de Oliveira Costa wrote:
> This patch moves lguest.c one level below, and enhances it with the
> ability to kick off 64 binaries. It would be much easier to just ifdef
> functions, but I have x86_64 machines loading 32-bit kernels as a longer
> goal, and that's why the patch features the load_elf_header() function.

Hi Glauber!

I've been writing documentation, and in the process completely
reorganised and cleaned up this file.  I have also been working on
getting rid of the gratuitous u32's for addresses and trying to cleanse
myself of 32-bit thinking!

I've now pushed the changes to the repo.  They should simplify this
patch...

Thanks!
Rusty.




Re: [PATCH 3/3] cpuid: switch to cpuid_on_cpu()

2007-04-02 Thread H. Peter Anvin

Alexey Dobriyan wrote:

Now that cpuid_on_cpu() is in core, cpuid driver can be shrunk.

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>


Hi Alexey,

This, and your other changes in this area do conflict with the work 
that I've been doing on extending the usability of the CPUID and MSR 
drivers (which is part of why this work has dragged out seemingly forever.)


I would really appreciate it if we could work together on this; there 
needs to be new paravirtualization entry points for this.  Consequently, 
I just updated and uploaded a git tree with the current status.  It 
still needs porting to x86-64, however.


The current cpuid/msr work is at:

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-cpuidmsr.git;a=summary

-hpa


Re: [PATCH 5/5] Fix race between cat /proc/slab_allocators and rmmod

2007-04-02 Thread Andrew Morton
On Tue, 03 Apr 2007 09:09:55 +1000
Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-04-02 at 19:03 +0400, Alexey Dobriyan wrote:
> > Same story as with cat /proc/*/wchan race vs rmmod race, only
> > /proc/slab_allocators want more info than just symbol name.
> > 
> > Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
> 
> All these look excellent.  I hope Andrew picked them up?
> 

I wasn't cc'ed on them, but I'll push them into the hole marked "In" anyway.



RE: Warning: unable to open an initial console.

2007-04-02 Thread Tom Strader
I checked /dev/ with U-boot and it shows the existence of /dev/console.

From the U-Boot prompt:

$ ls /dev
 crw---0 Mon Apr 02 17:52:27 2007 console
 crw-r--r--0 Mon Apr 02 17:52:27 2007 null
 crw-r--r--0 Mon Apr 02 17:52:27 2007 zero


Also, I added a printk in the jffs2_add_fd_to_list() routine in
fs/jffs2/nodelist.c to print out the dirent adds and it shows console
being added as follows:

...
add dirent "var", ino #14
add dirent "usr", ino #13
add dirent "tmp", ino #12
add dirent "sys", ino #11
add dirent "sbin", ino #10
add dirent "proc", ino #9
add dirent "mnt", ino #8
add dirent "linuxrc", ino #7
add dirent "lib", ino #6
add dirent "home", ino #5
add dirent "etc", ino #4
add dirent "dev", ino #3
add dirent "bin", ino #2
VFS: Mounted root (jffs2 filesystem).
Freeing init memory: 76K
add dirent "zero", ino #70
add dirent "null", ino #69
add dirent "console", ino #68
Warning: unable to open an initial console.
add dirent "watchdog", ino #262
add dirent "udevstart", ino #261
add dirent "udevsend", ino #260
...

Any other ideas?
Thanks,
Tom


-Original Message-
From: Chris Wedgwood [mailto:[EMAIL PROTECTED] 
Sent: Monday, April 02, 2007 3:49 PM
To: Tom Strader
Cc: linux-kernel@vger.kernel.org
Subject: Re: Warning: unable to open an initial console.

On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote:

> I have seen quite a few posts regarding unable to open an initial
> console, but my system seems to have the necessary things in place
> so I come looking for help.

your rootfs/initramfs/initrd is missing a valid working /dev/console

> VFS: Mounted root (jffs2 filesystem).

check /dev/ on this filesystem




Re: usb hid: reset NumLock

2007-04-02 Thread Pete Zaitcev
On Mon, 2 Apr 2007 16:48:24 +0200 (CEST), Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Sun, 1 Apr 2007, Pete Zaitcev wrote:

> could you please change the order of the two functions, so that you 
> don't have to put the forward declaration here?
>[...]
> I'd say this is a little bit overcommented.
>[...]
> So as soon as you have the VIDs and PIDs of the hardware which 
> requires this, could you please update the patch and send it to me again?

How about this?

diff --git a/drivers/usb/input/hid-core.c b/drivers/usb/input/hid-core.c
index 827a75a..23b1e70 100644
--- a/drivers/usb/input/hid-core.c
+++ b/drivers/usb/input/hid-core.c
@@ -545,6 +545,45 @@ void usbhid_init_reports(struct hid_device *hid)
warn("timeout initializing reports");
 }
 
+/*
+ * Reset LEDs which BIOS might have left on. For now, just NumLock (0x01).
+ */
+
+static int hid_find_field_early(struct hid_device *hid, unsigned int page,
+unsigned int hid_code, struct hid_field **pfield)
+{
+   struct hid_report *report;
+   struct hid_field *field;
+   struct hid_usage *usage;
+   int i, j;
+
+   list_for_each_entry(report, &hid->report_enum[HID_OUTPUT_REPORT].report_list, list) {
+   for (i = 0; i < report->maxfield; i++) {
+   field = report->field[i];
+   for (j = 0; j < field->maxusage; j++) {
+   usage = &field->usage[j];
+   if ((usage->hid & HID_USAGE_PAGE) == page &&
+   (usage->hid & 0xffff) == hid_code) {
+   *pfield = field;
+   return j;
+   }
+   }
+   }
+   }
+   return -1;
+}
+
+static void usbhid_set_leds(struct hid_device *hid)
+{
+   struct hid_field *field;
+   int offset;
+
+   if ((offset = hid_find_field_early(hid, HID_UP_LED, 0x01, &field)) != -1) {
+   hid_set_field(field, offset, 0);
+   usbhid_submit_report(hid, field->report, USB_DIR_OUT);
+   }
+}
+
 #define USB_VENDOR_ID_GTCO 0x078c
 #define USB_DEVICE_ID_GTCO_90  0x0090
 #define USB_DEVICE_ID_GTCO_100 0x0100
@@ -765,6 +804,9 @@ void usbhid_init_reports(struct hid_device *hid)
 #define USB_VENDOR_ID_SONY 0x054c
 #define USB_DEVICE_ID_SONY_PS3_CONTROLLER  0x0268
 
+#define USB_VENDOR_ID_DELL 0x413c
+#define USB_DEVICE_ID_DELL_W7658   0x2005
+
 /*
  * Alphabetically sorted blacklist by quirk type.
  */
@@ -947,6 +989,8 @@ static const struct hid_blacklist {
 
{ USB_VENDOR_ID_CIDC, 0x0103, HID_QUIRK_IGNORE },
 
+   { USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_W7658, HID_QUIRK_RESET_LEDS },
+
{ 0, 0 }
 };
 
@@ -1334,6 +1378,8 @@ static int hid_probe(struct usb_interface *intf, const struct usb_device_id *id)
 
usbhid_init_reports(hid);
hid_dump_device(hid);
+   if (hid->quirks & HID_QUIRK_RESET_LEDS)
+   usbhid_set_leds(hid);
 
if (!hidinput_connect(hid))
hid->claimed |= HID_CLAIMED_INPUT;
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 8c97d4d..3e8dcb0 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -269,6 +269,7 @@ struct hid_item {
 #define HID_QUIRK_SONY_PS3_CONTROLLER  0x0008
 #define HID_QUIRK_LOGITECH_S510_DESCRIPTOR 0x0010
 #define HID_QUIRK_DUPLICATE_USAGES 0x0020
+#define HID_QUIRK_RESET_LEDS   0x0040
 
 /*
  * This is the global environment of the parser. This information is

I wasn't sure where to place the function, so I just put it above its
user, to signify that it's special-casing. Also, it's unclear where
to put the quirk entry and defines. There's a comment saying that they
are sorted alphabetically by quirk, but apparently the order was violated
with more recent additions.

-- Pete


Re: [PATCH 5/5] Fix race between cat /proc/slab_allocators and rmmod

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 19:03 +0400, Alexey Dobriyan wrote:
> Same story as with cat /proc/*/wchan race vs rmmod race, only
> /proc/slab_allocators want more info than just symbol name.
> 
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

All these look excellent.  I hope Andrew picked them up?

Thanks,
Rusty.




Re: [PATCH 2/5] Fix race between rmmod and cat /proc/kallsyms

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 19:01 +0400, Alexey Dobriyan wrote:
> +static inline int module_get_kallsym(unsigned int symnum, unsigned long *value,
> + char *type, char *name,
> + char *module_name, int *exported)
>  {
> - return NULL;
> + return -ERANGE;
>  }

This would normally be -ENOSYS, but since the return value is used as a
binary anyway, it doesn't matter.

Thanks,
Rusty.




Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)

2007-04-02 Thread Adrian Bunk
On Sun, Apr 01, 2007 at 06:48:03PM +0200, Rafael J. Wysocki wrote:
> On Sunday, 1 April 2007 17:21, Tilman Schmidt wrote:
> > I'm sorry to say this has now happened with kernel 2.6.21-rc5, too.
> > I started a kernel compilation in the evening and came back in the
> > morning to find all KDE decorations gone. All processes normally
> > running for a KDE session and labelled "[kinit]" in ps were gone
> > but everything else was running fine, and the system was still
> > usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained
> > nothing remotely suspicious. /var/log/messages had two lines I
> > never saw before:
> > 
> > Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning: 
> > vs-8115: get_num_ver: not directory or indirect item
> > Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning: 
> > vs-8115: get_num_ver: not directory or indirect item
> > 
> > But those didn't appear on previous occurrences of the "dying KDE"
> > problem so I guess they are not related.
> > 
> > This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110
> > (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk)
> > % uname -a
> > Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 
> > i686 i686 i386 GNU/Linux
> > % cat /proc/cmdline
> > root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED] 
> > nmi_watchdog=2 lapic 5
> > Kernel configuration mostly-modular, based on standard SuSE kernel's
> > /proc/config.gz, just compiling into the kernel everything I need to
> > boot without an initrd and omitting some parts I'm not interested in.
> > (.config attached.) What else might be relevant?
> > 
> > Again, this is a Heisenbug, ie. it's not reproducible and invariably
> > happens when I'm away from the machine. (Probably Murphy at work.)
> > It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and
> > once on 2.6.21-rc5, on a machine which spends about equal amounts
> > of time running the latest stable, rc, and mm kernels. OTOH, so far
> > it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have
> > I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4
> > and -rc4-mm releases that's not conclusive as those have only been
> > running for a very short time.
> 
> I have a similar problem on x86_64 OpenSUSE 10.2, but it seems to happen
> when a sound (eg. notification) is played while the display is suspended
> (or "powered off").

Is it easily reproducible and still present with the latest -git?
If yes, can you bisect?

> IMO it's a SUSE bug.

We also have a report of KDE crashes on Debian [1].
And just a few days ago a kernel bug kwin ran into was fixed [2].

If the pattern is "works with 2.6.20 but does not work with 2.6.21-rc",
then it's most likely a kernel regression. 

> Greetings,
> Rafael

cu
Adrian

[1] http://bugzilla.kernel.org/show_bug.cgi?id=8157
[2] commit 14e9d5730adfca26452b3a2838a80af6950556f5

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL

2007-04-02 Thread Bjorn Helgaas
On Monday 02 April 2007 09:37, Christoph Lameter wrote:
> On Sun, 1 Apr 2007, Andi Kleen wrote:
> > Hmm, this means there is at least 2MB worth of struct page on every node?
> > Or do you have overlaps with other memory (I think you have)
> > In that case you have to handle the overlap in change_page_attr()
> 
> Correct. 2MB worth of struct page is 128 mb of memory. Are there nodes 
> with smaller amounts of memory?
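
The 2MB to 128MB figure quoted above can be checked with a few lines of
arithmetic. This is only a sketch: it assumes a 64-byte struct page and 4KB
base pages, which is what the quoted numbers imply but is not stated in the
thread, and memmap_coverage() is a made-up helper name.

```c
#include <assert.h>

/*
 * Bytes of physical memory described by memmap_bytes worth of struct page.
 * struct_page_size and page_size are assumptions supplied by the caller.
 */
unsigned long long memmap_coverage(unsigned long long memmap_bytes,
				   unsigned long long struct_page_size,
				   unsigned long long page_size)
{
	assert(struct_page_size != 0);
	/* number of struct pages that fit, times the memory each describes */
	return memmap_bytes / struct_page_size * page_size;
}
```

With those assumptions, one PMD-sized (2MB) chunk of memmap holds 32768
struct pages, so a node with less than 128MB cannot fill a chunk on its own.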

Do you deal with max_addr= and mem=?

RHEL4 (2.6.9) blows up if max_addr= happens to leave you with CPU-only
nodes.  So hopefully you can deal with arbitrary-sized nodes caused by
max_addr= or mem=.


RE: [PATCH 3/4] [SCSI]stex: fix reset recovery for console device

2007-04-02 Thread Ed Lin


> -Original Message-
> From: James Bottomley [mailto:[EMAIL PROTECTED] 
> Sent: Monday, April 02, 2007 11:28 AM
> To: Ed Lin
> Cc: linux-scsi; linux-kernel; jeff; Promise_Linux
> Subject: RE: [PATCH 3/4] [SCSI]stex: fix reset recovery for 
> console device
> 
> 
> On Mon, 2007-04-02 at 11:14 -0700, Ed Lin wrote:
> > I just saw the routine name scsi_eh_try_stu, and didn't notice the
> > allow_restart (partly because I thought it was not harmful...).
> > But the TEST_UNIT_READY must stay.
> 
> Sure ... I was just checking since your change log implied you'd seen
> the problem from the error handler ... however, we can add it ...
> there's a possibility of getting spin up on init from sd anyway.
> 

You make the decision. But after reconsideration, I think it's better
to remove unused code. It also needs change since the patch about
id mapping is modified in another mail.

How about the attachment here?


s3
Description: s3


Re: [SLUB 2/2] i386 arch page size slab fixes

2007-04-02 Thread William Lee Irwin III
On Sat, Mar 31, 2007 at 11:31:07AM -0800, Christoph Lameter wrote:
> Patch by William Irwin with only very minor modifications by me which are
> 1. Removal of HIGHMEM64G slab caches. It seems that virtualization hosts
>require a full pgd page.

The HIGHMEM64G slab allocations are meaningfully performant vs.
page-sized allocations where virtualization is absent. I would
personally rather whip Xen into shape enough to be able to handle the
minimal pgd allocations than retain the oversized pgd allocations even
in only the Xen case. Also, the entire unshared kernel pmd shenanigan
in Xen is an artifact of its recursive pagetable affair, which can also
be done away with a SMOP.


On Sat, Mar 31, 2007 at 11:31:07AM -0800, Christoph Lameter wrote:
> 2. Add missing virtualization hook. Seems that we need a new way
>of serializing paravirt_alloc(). It may need to do its own serialization.
> 3. Remove ARCH_USES_SLAB_PAGE_STRUCT

This doesn't quite cover all bases. The changes to pageattr.c and
fault.c are dubious and need verification at the very least. They were
largely slapped together to get the files past the compiler for the
performance comparisons that were never properly done.


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [SLUB 2/2] i386 arch page size slab fixes

2007-04-02 Thread Christoph Lameter
On Mon, 2 Apr 2007, William Lee Irwin III wrote:

> This doesn't quite cover all bases. The changes to pageattr.c and
> fault.c are dubious and need verification at the very least. They were
> largely slapped together to get the files past the compiler for the
> performance comparisons that were never properly done.

I looked through them but then I am no i386 specialist though. Looked 
fine.




Re: [PATCH 1/5] Simplify module_get_kallsym() by dropping length arg

2007-04-02 Thread Rusty Russell
On Mon, 2007-04-02 at 19:01 +0400, Alexey Dobriyan wrote:
> -
> [PATCH 1/5] Simplify module_get_kallsym() by dropping length arg
> 
> module_get_kallsym() could in theory truncate module symbol name to fit
> in buffer, but nobody does this. Always use KSYM_NAME_LEN + 1 bytes for name.
> 
> Suggested by lg^WRusty.
> 
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Acked-by: Rusty Russell <[EMAIL PROTECTED]>

Cheers,
Rusty.




RE: [PATCH 1/4] [SCSI]stex: fix id mapping issue

2007-04-02 Thread Ed Lin


> -Original Message-
> From: James Bottomley [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, March 31, 2007 7:22 AM
> To: Ed Lin
> Cc: linux-scsi; linux-kernel; jeff; Promise_Linux
> Subject: Re: [PATCH 1/4] [SCSI]stex: fix id mapping issue
> 
> 
> On Fri, 2007-03-30 at 15:21 -0700, Ed Lin wrote:
> > The internal id/lun mapping of st_vsc and st_vsc1 
> controllers is different
> > from st_shasta. The original driver code can only  map 
> first 16 'entities'
> > for st_vsc and st_vsc1 while there are actually 128 available.
> > 
> > Also the  ST_MAX_LUN_PER_TARGET should be 8, although this can do
> > no harm because inquiries beyond boundary are discarded by firmware.
> > 
> > The correct internal mapping should be:
> > id:0~15, lun:0~7 (st_shasta)
> > id:0, lun:0~127 (st_yosemite)
> > id:0~127, lun:0 (st_vsc and st_vsc1)
> > To scsi mid layer they are all channel:0~7, id:0~15, lun:0, 
> > with a maximum
> > 'entity' number of 128. The RAID console only interfaces to 
> scsi mid layer
> > and is always mapped at channel:0, id:16, lun:0.
> 
> I'm with Christoph here ... if we're going to break the backwards
> compatibility of the mappings (which your code does) then we 
> could just
> dump channel and use the SCSI id and lun directly.
> 
> Understanding this code is predicated on this quirky definition in
> stex_queuecommand:
> 
>   id = cmd->device->id;
>   lun = cmd->device->channel; /* firmware lun issue work around */
>^^^
> 
> > @@ -645,12 +645,16 @@ stex_queuecommand(struct scsi_cmnd *cmd,
> >  
> > req = stex_alloc_req(hba);
> >  
> > -   if (hba->cardtype == st_yosemite) {
> > -   req->lun = lun * (ST_MAX_TARGET_NUM - 1) + id;
> 
> This looks to be correct, it goes up id 0 to ST_MAX_TARGET_NUM -1 then
> takes the next channel.
> 
> > -   req->target = 0;
> > -   } else {
> > +   if (hba->cardtype == st_shasta) {
> > req->lun = lun;
> > req->target = id;
> > +   } else if (hba->cardtype == st_yosemite){
> > +   req->lun = id * ST_MAX_LUN_PER_TARGET + lun;
> > +   req->target = 0;
> > +   } else {
> > +   /* st_vsc and st_vsc1 */
> > +   req->lun = 0;
> > +   req->target = id * ST_MAX_LUN_PER_TARGET + lun;
> 
> These both look to be wrong.  You're taking the channel as the lowest
> common denominator, so your first target is on channel 1 id 
> 0, your next
> on channel 2, id 0 and so on.  That's really going to mess with the
> ordering (which will be user visible) is that really what you want?
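
For reference, the mapping in the quoted hunk can be written out as a
standalone function, which makes the resulting ordering easy to tabulate.
This is a sketch of what the hunk computes, not driver code: the enum, the
struct and the function name are hypothetical, and "channel" is the value the
driver calls lun after the firmware workaround quoted above.

```c
#include <assert.h>

#define ST_MAX_LUN_PER_TARGET 8	/* value proposed in the patch description */

/* Hypothetical stand-ins for the driver's card types. */
enum st_card { ST_SHASTA, ST_VSC, ST_VSC1, ST_YOSEMITE };

/* Hypothetical pair holding what the hunk stores in req->target / req->lun. */
struct st_fw_addr {
	unsigned int target;
	unsigned int lun;
};

struct st_fw_addr st_map_addr(enum st_card type, unsigned int id,
			      unsigned int channel)
{
	struct st_fw_addr a;

	assert(channel < ST_MAX_LUN_PER_TARGET);
	if (type == ST_SHASTA) {
		/* id:0~15, lun:0~7 */
		a.target = id;
		a.lun = channel;
	} else if (type == ST_YOSEMITE) {
		/* id:0, lun:0~127 */
		a.target = 0;
		a.lun = id * ST_MAX_LUN_PER_TARGET + channel;
	} else {
		/* st_vsc and st_vsc1: id:0~127, lun:0 */
		a.target = id * ST_MAX_LUN_PER_TARGET + channel;
		a.lun = 0;
	}
	return a;
}
```

Stepping channel 0..7 for a fixed id walks consecutive firmware entities, so
the first eight entities all appear as id 0 on different channels.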
> 

How about the attached one? 


s1
Description: s1


Re: [patch 6/13] signal/timer/event fds v10 - timerfd core ...

2007-04-02 Thread Thomas Gleixner
On Mon, 2007-04-02 at 15:46 -0700, Davide Libenzi wrote:
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Can you bring them into alphabetic order and check if the whole bunch is
really required ?

Otherwise it looks good !

tglx




[patch 2/13] signal/timer/event fds v10 - signalfd core ...

2007-04-02 Thread Davide Libenzi

ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--
This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how
badly it can be broken :), and I added even more breakage ;)
Signals are fetched from the same signal queue used by the process,
so signalfd will compete with standard kernel delivery in dequeue_signal().
If you want to reliably fetch signals on the signalfd file, you need to
block them with sigprocmask(SIG_BLOCK).
This seems to be working fine on my Dual Opteron machine. I made a quick 
test program for it:

http://www.xmailserver.org/signafd-test.c

The signalfd() system call implements signal delivery into a file 
descriptor receiver. The signalfd file descriptor is created with the 
following API:

int signalfd(int ufd, const sigset_t *mask, size_t masksize);

The "ufd" parameter allows changing an existing signalfd sigmask without
going through a close/create cycle (Linus' idea). Use "ufd" == -1 if you want a
brand new signalfd file.
The "mask" parameter specifies the set of signals that we are
interested in. The "masksize" parameter is the size of "mask".
The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can be
used together with epoll(2) too.
The read(2) system call will return a "struct signalfd_siginfo" structure
in the userspace supplied buffer. The return value is the number of bytes
copied in the supplied buffer, or -1 in case of error. The read(2) call
can also return 0, in case the sighand structure to which the signalfd
was attached, has been orphaned. The O_NONBLOCK flag is also supported, and
read(2) will return -EAGAIN in case no signal is available.
If the size of the buffer passed to read(2) is lower than
sizeof(struct signalfd_siginfo), -EINVAL is returned. A read from the
signalfd can also return -ERESTARTSYS in case a signal hits the process.
The format of struct signalfd_siginfo is shown below; which fields are valid
depends on the (->code & __SI_MASK) value, in the same way as for a struct
siginfo:

struct signalfd_siginfo {
__u32 signo;/* si_signo */
__s32 err;  /* si_errno */
__s32 code; /* si_code */
__u32 pid;  /* si_pid */
__u32 uid;  /* si_uid */
__s32 fd;   /* si_fd */
__u32 tid;  /* si_tid */
__u32 band; /* si_band */
__u32 overrun;  /* si_overrun */
__u32 trapno;   /* si_trapno */
__s32 status;   /* si_status */
__s32 svint;/* si_int */
__u64 svptr;/* si_ptr */
__u64 utime;/* si_utime */
__u64 stime;/* si_stime */
__u64 addr; /* si_addr */
};



Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.21-rc5.fds/fs/signalfd.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/signalfd.c  2007-04-02 15:06:29.0 -0700
@@ -0,0 +1,353 @@
+/*
+ *  fs/signalfd.c
+ *
+ *  Copyright (C) 2003  Linus Torvalds
+ *
+ *  Mon Mar 5, 2007: Davide Libenzi 
+ *  Changed ->read() to return a siginfo structure instead of signal number.
+ *  Fixed locking in ->poll().
+ *  Added sighand-detach notification.
+ *  Added fd re-use in sys_signalfd() syscall.
+ *  Now using anonymous inode source.
+ *  Thanks to Oleg Nesterov for useful code review and suggestions.
+ *  More comments and suggestions from Arnd Bergmann.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct signalfd_ctx {
+   struct list_head lnk;
+   wait_queue_head_t wqh;
+   sigset_t sigmask;
+   struct task_struct *tsk;
+};
+
+struct signalfd_lockctx {
+   struct task_struct *tsk;
+   unsigned long flags;
+};
+
+/*
+ * Tries to acquire the sighand lock. We do not increment the sighand
+ * use count, and we do not even pin the task struct, so we need to
+ * do it inside an RCU read lock, and we must be prepared for the
+ * ctx->tsk going to NULL (in signalfd_deliver()), and for the sighand
+ * being detached. We return 0 if the sighand has been detached, or
+ * 1 if we were able to pin the sighand lock.
+ */
+static int signalfd_lock(struct signalfd_ctx *ctx, struct signalfd_lockctx *lk)
+{
+   struct sighand_struct *sighand = NULL;
+
+   rcu_read_lock();
+   lk->tsk = rcu_dereference(ctx->tsk);
+   if (likely(lk->tsk != NULL))
+   sighand = lock_task_sighand(lk->tsk, &lk->flags);
+   rcu_read_unlock();
+
+   if (sighand && !ctx->tsk) {
+   unlock_task_sighand(lk->tsk, &lk->flags);
+   sighand = NULL;
+   }
+
+   return sighand != NULL;
+}
+
+static void signalfd_unlock(struct signalfd_lockctx *lk)
+{

Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)

2007-04-02 Thread Siddha, Suresh B
On Fri, Mar 23, 2007 at 03:12:10PM +0100, Andi Kleen wrote:
> > But that is based on compile time option, isn't it? Perhaps I need
> > to use some other mechanism to find out the platform is not NUMA capable..
> 
> We can probably make it runtime on x86. That will be needed sooner or 
> later for correct NUMA hotplug support anyways.

How about this patch? Thanks.

---
From: Suresh Siddha <[EMAIL PROTECTED]>
[patch] x86_64: set node_possible_map at runtime.

Set the node_possible_map at runtime. On a non NUMA system,
num_possible_nodes() will now say '1'
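
To make the effect concrete, here is a purely illustrative userspace model of
a node bitmap; it is not kernel code. All model_* names are hypothetical, and
the real nodemask_t is a bitmap sized by MAX_NUMNODES rather than a single
64-bit word. The last helper mimics the non-NUMA fallback path, which clears
the map and then marks node 0 only.

```c
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per node, 64 nodes max. */
typedef struct {
	uint64_t bits;
} model_nodemask_t;

static inline void model_nodes_clear(model_nodemask_t *m)
{
	m->bits = 0;
}

static inline void model_node_set(int node, model_nodemask_t *m)
{
	m->bits |= UINT64_C(1) << node;
}

static inline int model_node_isset(int node, const model_nodemask_t *m)
{
	return (int)((m->bits >> node) & 1);
}

/* Count the set bits, i.e. the number of nodes marked in the map. */
static inline int model_num_nodes(const model_nodemask_t *m)
{
	uint64_t b = m->bits;
	int n = 0;

	while (b) {
		b &= b - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Model of the non-NUMA fallback: clear the map, then mark node 0 only. */
int model_non_numa_possible_nodes(void)
{
	model_nodemask_t possible;

	model_nodes_clear(&possible);
	model_node_set(0, &possible);
	return model_num_nodes(&possible);
}
```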

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
---

diff --git a/arch/x86_64/mm/k8topology.c b/arch/x86_64/mm/k8topology.c
index b5b8dba..d6f4447 100644
--- a/arch/x86_64/mm/k8topology.c
+++ b/arch/x86_64/mm/k8topology.c
@@ -49,11 +49,8 @@ int __init k8_scan_nodes(unsigned long start, unsigned long 
end)
int found = 0;
u32 reg;
unsigned numnodes;
-   nodemask_t nodes_parsed;
unsigned dualcore = 0;
 
-   nodes_clear(nodes_parsed);
-
if (!early_pci_allowed())
return -1;
 
@@ -102,7 +99,7 @@ int __init k8_scan_nodes(unsigned long start, unsigned long 
end)
   nodeid, (base>>8)&3, (limit>>8) & 3); 
return -1; 
}   
-   if (node_isset(nodeid, nodes_parsed)) { 
+   if (node_isset(nodeid, node_possible_map)) { 
printk(KERN_INFO "Node %d already present. Skipping\n", 
   nodeid);
continue;
@@ -155,7 +152,7 @@ int __init k8_scan_nodes(unsigned long start, unsigned long 
end)
 
prevbase = base;
 
-   node_set(nodeid, nodes_parsed);
+   node_set(nodeid, node_possible_map);
} 
 
if (!found)
diff --git a/arch/x86_64/mm/numa.c b/arch/x86_64/mm/numa.c
index 41b8fb0..5f7d4d8 100644
--- a/arch/x86_64/mm/numa.c
+++ b/arch/x86_64/mm/numa.c
@@ -383,6 +383,7 @@ static int __init numa_emulation(unsigned long start_pfn, 
unsigned long end_pfn)
   i,
   nodes[i].start, nodes[i].end,
   (nodes[i].end - nodes[i].start) >> 20);
+   node_set(i, node_possible_map);
node_set_online(i);
}
memnode_shift = compute_hash_shift(nodes, numa_fake);
@@ -405,6 +406,8 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 { 
int i;
 
+   nodes_clear(node_possible_map);
+
 #ifdef CONFIG_NUMA_EMU
if (numa_fake && !numa_emulation(start_pfn, end_pfn))
return;
@@ -432,6 +435,7 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
memnodemap[0] = 0;
nodes_clear(node_online_map);
node_set_online(0);
+   node_set(0, node_possible_map);
for (i = 0; i < NR_CPUS; i++)
numa_set_node(i, 0);
node_to_cpumask[0] = cpumask_of_cpu(0);
diff --git a/arch/x86_64/mm/srat.c b/arch/x86_64/mm/srat.c
index 2efe215..9f26e2b 100644
--- a/arch/x86_64/mm/srat.c
+++ b/arch/x86_64/mm/srat.c
@@ -25,7 +25,6 @@ int acpi_numa __initdata;
 
 static struct acpi_table_slit *acpi_slit;
 
-static nodemask_t nodes_parsed __initdata;
 static struct bootnode nodes[MAX_NUMNODES] __initdata;
 static struct bootnode nodes_add[MAX_NUMNODES];
 static int found_add_area __initdata;
@@ -43,7 +42,7 @@ static __init int setup_node(int pxm)
 static __init int conflicting_nodes(unsigned long start, unsigned long end)
 {
int i;
-   for_each_node_mask(i, nodes_parsed) {
+   for_each_node_mask(i, node_possible_map) {
struct bootnode *nd = &nodes[i];
if (nd->start == nd->end)
continue;
@@ -321,7 +320,7 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
}
nd = &nodes[node];
oldnode = *nd;
-   if (!node_test_and_set(node, nodes_parsed)) {
+   if (!node_test_and_set(node, node_possible_map)) {
nd->start = start;
nd->end = end;
} else {
@@ -344,7 +343,7 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
printk(KERN_NOTICE "SRAT: Hotplug region ignored\n");
*nd = oldnode;
if ((nd->start | nd->end) == 0)
-   node_clear(node, nodes_parsed);
+   node_clear(node, node_possible_map);
}
 }
 
@@ -356,7 +355,7 @@ static int nodes_cover_memory(void)
unsigned long pxmram, e820ram;
 
pxmram = 0;
-   for_each_node_mask(i, nodes_parsed) {
+   for_each_node_mask(i, node_possible_map) {
unsigned long s = nodes[i].start >> PAGE_SHIFT;
unsigned long e = nodes[i].end >> PAGE_SHIFT;
pxmram += e - s;
@@ -380,7 +379,7 @@ static int nodes_cover_memory(void)
 static void unparse_node(int node)
 {
int 

Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL

2007-04-02 Thread Martin Bligh

Christoph Lameter wrote:
> On Mon, 2 Apr 2007, Martin Bligh wrote:
> 
> > > For 64GB you'd need 256M which would be a quarter of low mem. Probably takes
> > > up too much of low mem.
> > 
> > Yup.
> 
> We could move whatever you currently use to handle that into i386 arch 
> code. Or are there other platforms that do similar tricks with highmem?
> 
> We already have special hooks for node lookups in sparsemem. Move all of 
> that off into some arch dir?

Well, all I did was basically an early vmalloc kind of thing.

You only need to allocate enough virtual space for how much memory
you actually *have*, not the full set. The problem on i386 is that
you just need to reserve that space early, in order to shuffle
everything else in to fit. It's messy, but not hard.

M.




[PATCH 7/9] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use

2007-04-02 Thread David Howells
Add an interface to the AF_RXRPC module so that the AFS filesystem module can
more easily make use of the services available.  AFS still opens a socket but
then uses the action functions in lieu of sendmsg() and registers an intercept
function to grab messages before they're queued on the socket Rx queue.

This permits AFS (or whatever) to:

 (1) Avoid the overhead of using the recvmsg() call.

 (2) Use different keys directly on individual client calls on one socket
 rather than having to open a whole slew of sockets, one for each key it
 might want to use.

 (3) Avoid calling request_key() at the point of issue of a call or opening of
 a socket.  This is done instead by AFS at the point of open(), unlink() or
 other VFS operation and the key handed through.

 (4) Request the use of something other than GFP_KERNEL to allocate memory.

Furthermore:

 (*) The socket buffer markings used by RxRPC are made available for AFS so
 that it can interpret the cooked RxRPC messages itself.

 (*) rxgen (un)marshalling abort codes are made available.


The following documentation for the kernel interface is added to
Documentation/networking/rxrpc.txt:

=========================
AF_RXRPC KERNEL INTERFACE
=========================

The AF_RXRPC module also provides an interface for use by in-kernel utilities
such as the AFS filesystem.  This permits such a utility to:

 (1) Use different keys directly on individual client calls on one socket
 rather than having to open a whole slew of sockets, one for each key it
 might want to use.

 (2) Avoid having RxRPC call request_key() at the point of issue of a call or
 opening of a socket.  Instead the utility is responsible for requesting a
 key at the appropriate point.  AFS, for instance, would do this during VFS
 operations such as open() or unlink().  The key is then handed through
 when the call is initiated.

 (3) Request the use of something other than GFP_KERNEL to allocate memory.

 (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
 intercepted before they get put into the socket Rx queue and the socket
 buffers manipulated directly.

To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an address as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.

The kernel interface functions are as follows:

 (*) Begin a new client call.

struct rxrpc_call *
rxrpc_kernel_begin_call(struct socket *sock,
struct sockaddr_rxrpc *srx,
struct key *key,
unsigned long user_call_ID,
gfp_t gfp);

 This allocates the infrastructure to make a new RxRPC call and assigns
 call and connection numbers.  The call will be made on the UDP port that
 the socket is bound to.  The call will go to the destination address of a
 connected client socket unless an alternative is supplied (srx is
 non-NULL).

 If a key is supplied then this will be used to secure the call instead of
 the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
 secured in this way will still share connections if at all possible.

 The user_call_ID is equivalent to that supplied to sendmsg() in the
 control data buffer.  It is entirely feasible to use this to point to a
 kernel data structure.

 If this function is successful, an opaque reference to the RxRPC call is
 returned.  The caller now holds a reference on this and it must be
 properly ended.

 (*) End a client call.

void rxrpc_kernel_end_call(struct rxrpc_call *call);

 This is used to end a previously begun call.  The user_call_ID is expunged
 from AF_RXRPC's knowledge and will not be seen again in association with
 the specified call.

 (*) Send data through a call.

int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
   size_t len);

 This is used to supply either the request part of a client call or the
 reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
 data buffers to be used.  msg_iov may not be NULL and must point
 exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
 MSG_MORE if there will be subsequent data sends for this call.

 The msg must not specify a destination address, control data or any flags
 other than MSG_MORE.  len is the total amount of data to transmit.

 (*) Abort a call.

void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);

 This is used to abort a call if it's still in an abortable state.  The
 abort code specified will be placed in the ABORT message sent.

 (*) Intercept received RxRPC messages.

typedef void (*rxrpc_interceptor_t)(struct sock *sk,
 

[patch 11/13] signal/timer/event fds v10 - eventfd wire up i386 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the eventfd system call to the i386 architecture.



Signed-off-by: Davide Libenzi 


- Davide


Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S  2007-04-02 
15:06:39.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S   2007-04-02 
15:06:44.0 -0700
@@ -321,3 +321,4 @@
.long sys_epoll_pwait
.long sys_signalfd  /* 320 */
.long sys_timerfd
+   .long sys_eventfd
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 
15:06:39.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h  2007-04-02 
15:06:44.0 -0700
@@ -327,10 +327,11 @@
 #define __NR_epoll_pwait   319
 #define __NR_signalfd  320
 #define __NR_timerfd   321
+#define __NR_eventfd   322
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 322
+#define NR_syscalls 323
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR



Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)

2007-04-02 Thread Adrian Bunk
On Sun, Apr 01, 2007 at 05:21:06PM +0200, Tilman Schmidt wrote:
> I'm sorry to say this has now happened with kernel 2.6.21-rc5, too.
> I started a kernel compilation in the evening and came back in the
> morning to find all KDE decorations gone. All processes normally
> running for a KDE session and labelled "[kinit]" in ps were gone
> but everything else was running fine, and the system was still
> usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained
> nothing remotely suspicious. /var/log/messages had two lines I
> never saw before:
> 
> Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning: 
> vs-8115: get_num_ver: not directory or indirect item
> Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning: 
> vs-8115: get_num_ver: not directory or indirect item

Reiserfs people Cc'ed for this.

> But those didn't appear on previous occurrences of the "dying KDE"
> problem so I guess they are not related.
> 
> This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110
> (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk)
> % uname -a
> Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 i686 
> i686 i386 GNU/Linux
> % cat /proc/cmdline
> root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED] 
> nmi_watchdog=2 lapic 5
> Kernel configuration mostly-modular, based on standard SuSE kernel's
> /proc/config.gz, just compiling into the kernel everything I need to
> boot without an initrd and omitting some parts I'm not interested in.
> (.config attached.) What else might be relevant?
> 
> Again, this is a Heisenbug, ie. it's not reproducible and invariably
> happens when I'm away from the machine. (Probably Murphy at work.)
> It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and
> once on 2.6.21-rc5, on a machine which spends about equal amounts
> of time running the latest stable, rc, and mm kernels. OTOH, so far
> it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have
> I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4
> and -rc4-mm releases that's not conclusive as those have only been
> running for a very short time.

We also have another report of crashes under KDE:

Subject: crashes in KDE
References : http://bugzilla.kernel.org/show_bug.cgi?id=8157
Submitter  : Oliver Pinter <[EMAIL PROTECTED]>
Status : unknown

We also have one bug kwin ran into that got fixed after -rc5:

Subject: kwin dies silently
References : http://lkml.org/lkml/2007/2/28/112
Submitter  : Sid Boyce <[EMAIL PROTECTED]>
 Boris Mogwitz <[EMAIL PROTECTED]>
 Michael Wu <[EMAIL PROTECTED]>
Caused-By  : Eric W. Biederman <[EMAIL PROTECTED]>
 commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1
Fixed-By   : Eric W. Biederman <[EMAIL PROTECTED]>
Commit : 14e9d5730adfca26452b3a2838a80af6950556f5
Status : fixed in -rc6

These might or might not be related issues.

> HTH
> T.

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



[patch 9/13] signal/timer/event fds v10 - timerfd compat code ...

2007-04-02 Thread Davide Libenzi
This patch implements the necessary compat code for the timerfd system call.


Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/fs/compat.c
===
--- linux-2.6.21-rc5.fds.orig/fs/compat.c   2007-04-02 15:06:36.0 
-0700
+++ linux-2.6.21-rc5.fds/fs/compat.c2007-04-02 15:06:41.0 -0700
@@ -2361,3 +2361,26 @@
 
 #endif /* CONFIG_SIGNALFD */
 
+#ifdef CONFIG_TIMERFD
+
+asmlinkage long compat_sys_timerfd(int ufd, int clockid, int flags,
+  const struct compat_itimerspec __user *utmr)
+{
+   long res;
+   struct itimerspec t;
+   struct itimerspec __user *ut;
+
+   res = -EFAULT;
+   if (get_compat_itimerspec(&t, utmr))
+   goto err_exit;
+   ut = compat_alloc_user_space(sizeof(*ut));
+   if (copy_to_user(ut, &t, sizeof(t)) )
+   goto err_exit;
+
+   res = sys_timerfd(ufd, clockid, flags, ut);
+err_exit:
+   return res;
+}
+
+#endif /* CONFIG_TIMERFD */
+
Index: linux-2.6.21-rc5.fds/include/linux/compat.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/compat.h2007-04-02 
15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/compat.h 2007-04-02 15:06:41.0 
-0700
@@ -225,6 +225,11 @@
return lhs->tv_nsec - rhs->tv_nsec;
 }
 
+extern int get_compat_itimerspec(struct itimerspec *dst,
+const struct compat_itimerspec __user *src);
+extern int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+const struct itimerspec *src);
+
 asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp);
 
 extern int compat_printk(const char *fmt, ...);
Index: linux-2.6.21-rc5.fds/kernel/compat.c
===
--- linux-2.6.21-rc5.fds.orig/kernel/compat.c   2007-04-02 15:06:12.0 
-0700
+++ linux-2.6.21-rc5.fds/kernel/compat.c2007-04-02 15:06:41.0 
-0700
@@ -475,8 +475,8 @@
return min_length;
 }
 
-static int get_compat_itimerspec(struct itimerspec *dst, 
-struct compat_itimerspec __user *src)
+int get_compat_itimerspec(struct itimerspec *dst,
+ const struct compat_itimerspec __user *src)
 { 
if (get_compat_timespec(&dst->it_interval, &src->it_interval) ||
get_compat_timespec(&dst->it_value, &src->it_value))
@@ -484,8 +484,8 @@
return 0;
 } 
 
-static int put_compat_itimerspec(struct compat_itimerspec __user *dst, 
-struct itimerspec *src)
+int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+ const struct itimerspec *src)
 { 
if (put_compat_timespec(&src->it_interval, &dst->it_interval) ||
put_compat_timespec(&src->it_value, &dst->it_value))



Re: Warning: unable to open an initial console.

2007-04-02 Thread Chris Wedgwood
On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote:

> I have seen quite a few posts regarding unable to open an initial
> console, but my system seems to have the necessary things in place
> so I come looking for help.

your rootfs/initramfs/initrd is missing a valid working /dev/console

> VFS: Mounted root (jffs2 filesystem).

check /dev/ on this filesystem


[patch 10/13] signal/timer/event fds v10 - eventfd core ...

2007-04-02 Thread Davide Libenzi
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--
This is a very simple and light file descriptor that can be used as an
event wait/dispatch mechanism by userspace (both wait and dispatch) and by
the kernel (dispatch only). It can be used instead of pipe(2) in all cases
where a pipe would simply be used to signal events. Its kernel overhead
is much lower than that of a pipe, and it does not consume two fds. When
used in the kernel, it can offer an fd-bridge to enable, for example,
functionalities like KAIO or syslets/threadlets to signal to an fd the
completion of certain operations. More generally, an eventfd can be used by
the kernel to signal readiness, in a POSIX poll/select way, of interfaces
that would otherwise be incompatible with it. The API is:

int eventfd(unsigned int count);

The eventfd API accepts an initial "count" parameter, and returns an
eventfd fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and 
write(2).
The POLLIN flag is raised when the internal counter is greater than zero.
The POLLOUT flag is raised when at least a value of "1" can be written to
the internal counter.
The POLLERR flag is raised when an overflow in the counter value is detected.
The write(2) operation can never overflow the counter, since it blocks
(unless O_NONBLOCK is set, in which case -EAGAIN is returned).
The eventfd_signal() function can overflow it, however, since it is not
allowed to sleep during its operation.
The read(2) function reads the __u64 counter value and resets the internal
value to zero. If the value read is equal to (__u64) -1, an overflow
happened on the internal counter (due to 2^64 eventfd_signal() posts
that have never been retired; unlikely, but possible).
The write(2) call writes a __u64 count value and adds it
to the current counter. The eventfd fd supports O_NONBLOCK also.
On the kernel side, we have:

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);

The eventfd_fget() should be called to get a struct file* from an eventfd
fd (this is an fget() + check of f_op being an eventfd fops pointer).
The kernel can then call eventfd_signal() every time it wants to post
an event to userspace. The eventfd_signal() function can be called from any
context.
An eventfd() simple test and bench is available here:

http://www.xmailserver.org/eventfd-bench.c

This is the eventfd-based version of pipetest-4 (pipe(2) based):

http://www.xmailserver.org/pipetest-4.c

Not that performance matters much in the eventfd case, but eventfd-bench
shows almost double the performance of pipetest-4.
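
For illustration, the counter semantics described above (write(2) adds,
read(2) returns the sum and resets it, a non-blocking read on an empty
counter fails with EAGAIN) can be sketched from userspace. This is a minimal
sketch and not the benchmark linked above: it assumes the
eventfd(initval, flags) wrapper and the EFD_NONBLOCK flag that current glibc
exposes in <sys/eventfd.h>, which differ from the one-argument prototype
proposed in this posting. Both helper names are made up for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Post two values to a fresh eventfd and read the accumulated counter.
 * Returns the counter value that was read, or 0 on any failure.
 */
uint64_t eventfd_accumulate(uint64_t a, uint64_t b)
{
	uint64_t value = 0;
	int fd = eventfd(0, EFD_NONBLOCK);	/* initial count of zero */

	if (fd < 0)
		return 0;

	/* Each write(2) adds its 8-byte operand to the internal counter. */
	if (write(fd, &a, sizeof(a)) != (ssize_t)sizeof(a) ||
	    write(fd, &b, sizeof(b)) != (ssize_t)sizeof(b) ||
	    read(fd, &value, sizeof(value)) != (ssize_t)sizeof(value))
		value = 0;

	close(fd);
	return value;
}

/*
 * Returns 1 if reading an empty non-blocking eventfd fails with EAGAIN,
 * 0 otherwise.
 */
int eventfd_empty_read_is_eagain(void)
{
	uint64_t value;
	int fd = eventfd(0, EFD_NONBLOCK);
	int ok;

	if (fd < 0)
		return 0;

	ok = read(fd, &value, sizeof(value)) < 0 && errno == EAGAIN;
	close(fd);
	return ok;
}
```

A single descriptor carries both directions here, which is the "does not
consume two fds" point made above when comparing against pipe(2).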




Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.21-rc5.fds/include/linux/syscalls.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/syscalls.h  2007-04-02 
15:06:37.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/syscalls.h   2007-04-02 
15:06:43.0 -0700
@@ -605,6 +605,7 @@
 asmlinkage long sys_signalfd(int ufd, sigset_t __user *user_mask, size_t 
sizemask);
 asmlinkage long sys_timerfd(int ufd, int clockid, int flags,
const struct itimerspec __user *utmr);
+asmlinkage long sys_eventfd(unsigned int count);
 
 int kernel_execve(const char *filename, char *const argv[], char *const 
envp[]);
 
Index: linux-2.6.21-rc5.fds/fs/eventfd.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/eventfd.c   2007-04-02 15:06:43.0 -0700
@@ -0,0 +1,233 @@
+/*
+ *  fs/eventfd.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct eventfd_ctx {
+   spinlock_t lock;
+   wait_queue_head_t wqh;
+   /*
+* Every time that a write(2) is performed on an eventfd, the
+* value of the __u64 being written is added to "count" and a
+* wakeup is performed on "wqh". A read(2) will return the "count"
+* value to userspace, and will reset "count" to zero. The kernel
+* side eventfd_signal() also adds to the "count" counter and
+* issues a wakeup.
+*/
+   __u64 count;
+};
+
+/*
+ * Adds "n" to the eventfd counter "count". Returns "n" in case of
+ * success, or a value lower than "n" in case of counter overflow.
+ * This function is supposed to be called by the kernel in paths
+ * that do not allow sleeping. In this function we allow the counter
+ * to reach the ULLONG_MAX value, and we signal this as overflow
+ * condition by returning a POLLERR to poll(2).
+ */
+int eventfd_signal(struct file *file, int n)
+{
+   struct eventfd_ctx *ctx = file->private_data;
+   unsigned long flags;
+
+   if (n < 0)
+   return -EINVAL;
+   spin_lock_irqsave(&ctx->lock, flags);
+   if (ULLONG_MAX - ctx->count < n)
+   n = (int) 

Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL

2007-04-02 Thread Christoph Lameter
On Mon, 2 Apr 2007, Martin Bligh wrote:

> > For 64GB you'd need 256M which would be a quarter of low mem. Probably takes
> > up too much of low mem.
> 
> Yup.

We could move whatever you currently use to handle that into i386 arch 
code. Or are there other platforms that do similar tricks with highmem?

We already have special hooks for node lookups in sparsemem. Move all of 
that off into some arch dir?



[patch 5/13] signal/timer/event fds v10 - signalfd compat code ...

2007-04-02 Thread Davide Libenzi
This patch implements the necessary compat code for the signalfd system call.


Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/fs/compat.c
===
--- linux-2.6.21-rc5.fds.orig/fs/compat.c   2007-04-02 15:06:12.0 
-0700
+++ linux-2.6.21-rc5.fds/fs/compat.c2007-04-02 15:06:36.0 -0700
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2335,3 +2336,28 @@
 #endif /* TIF_RESTORE_SIGMASK */
 
 #endif /* CONFIG_EPOLL */
+
+#ifdef CONFIG_SIGNALFD
+
+asmlinkage long compat_sys_signalfd(int ufd,
+   const compat_sigset_t __user *sigmask,
+   compat_size_t sigsetsize)
+{
+   compat_sigset_t ss32;
+   sigset_t tmp;
+   sigset_t __user *ksigmask;
+
+   if (sigsetsize != sizeof(compat_sigset_t))
+   return -EINVAL;
+   if (copy_from_user(&ss32, sigmask, sizeof(ss32)))
+   return -EFAULT;
+   sigset_from_compat(&tmp, &ss32);
+   ksigmask = compat_alloc_user_space(sizeof(sigset_t));
+   if (copy_to_user(ksigmask, &tmp, sizeof(sigset_t)))
+   return -EFAULT;
+
+   return sys_signalfd(ufd, ksigmask, sizeof(sigset_t));
+}
+
+#endif /* CONFIG_SIGNALFD */
+



[patch 3/13] signal/timer/event fds v10 - signalfd wire up i386 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the signalfd system call to the i386 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S  2007-04-02 
15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S   2007-04-02 
15:06:33.0 -0700
@@ -319,3 +319,4 @@
.long sys_move_pages
.long sys_getcpu
.long sys_epoll_pwait
+   .long sys_signalfd  /* 320 */
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 
15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h  2007-04-02 
15:06:33.0 -0700
@@ -325,10 +325,11 @@
 #define __NR_move_pages317
 #define __NR_getcpu318
 #define __NR_epoll_pwait   319
+#define __NR_signalfd  320
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 320
+#define NR_syscalls 321
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR



[patch 6/13] signal/timer/event fds v10 - timerfd core ...

2007-04-02 Thread Davide Libenzi
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"
- Prevented DoS by re-arming the timer on read (Thomas Gleixner)

--
This patch introduces a new system call for timer events delivered
through file descriptors. This allows timer events to be used with
standard POSIX poll(2), select(2) and read(2). As a consequence of
supporting the Linux f_op->poll subsystem, they can be used with
epoll(2) too.
The system call is defined as:

int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr);

The "ufd" parameter allows for re-use (re-programming) of an existing
timerfd without going through the close/open cycle (same as signalfd).
If "ufd" is -1, a new file descriptor will be created, otherwise the
existing "ufd" will be re-programmed.
The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
The time specified in the "utmr->it_value" parameter is the expiry
time for the timer.
If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute
time, otherwise it's a relative time.
If the time specified in "utmr->it_interval" is not zero (zero meaning
tv_sec == 0 and tv_nsec == 0), this is the period at which the following
ticks should be generated.
The "utmr->it_interval" should be set to zero if only one tick is requested.
Setting the "utmr->it_value" to zero will disable the timer, or will create
a timerfd without the timer enabled.
The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file descriptor, or -1 in case of error.
As stated before, the timerfd file descriptor supports poll(2), select(2)
and epoll(2). When a timer event has happened on the timerfd, a POLLIN mask
will be returned.
The read(2) call can be used, and it will return a u32 variable holding
the number of "ticks" that happened on the interface since the last call
to read(2). The read(2) call supports the O_NONBLOCK flag too, and EAGAIN
will be returned if no ticks happened.
A quick test program shows timerfd working correctly on my amd64 box:

http://www.xmailserver.org/timerfd-test.c




Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.21-rc5.fds/fs/timerfd.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/timerfd.c   2007-04-02 15:06:37.0 -0700
@@ -0,0 +1,233 @@
+/*
+ *  fs/timerfd.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi 
+ *
+ *
+ *  Thanks to Thomas Gleixner for code reviews and useful comments.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct timerfd_ctx {
+   struct hrtimer tmr;
+   ktime_t tintv;
+   spinlock_t lock;
+   wait_queue_head_t wqh;
+   int expired;
+};
+
+/*
+ * This gets called when the timer event triggers. We set the "expired"
+ * flag, but we do not re-arm the timer (in case it's necessary,
+ * tintv.tv64 != 0) until the timer is read.
+ */
+static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
+{
+   struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr);
+   unsigned long flags;
+
+   spin_lock_irqsave(&ctx->lock, flags);
+   ctx->expired = 1;
+   wake_up_locked(&ctx->wqh);
+   spin_unlock_irqrestore(&ctx->lock, flags);
+
+   return HRTIMER_NORESTART;
+}
+
+static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags,
+ const struct itimerspec *ktmr)
+{
+   enum hrtimer_mode htmode;
+   ktime_t texp;
+
+   htmode = (flags & TFD_TIMER_ABSTIME) ?
+   HRTIMER_MODE_ABS: HRTIMER_MODE_REL;
+
+   texp = timespec_to_ktime(ktmr->it_value);
+   ctx->expired = 0;
+   ctx->tintv = timespec_to_ktime(ktmr->it_interval);
+   hrtimer_init(&ctx->tmr, clockid, htmode);
+   ctx->tmr.expires = texp;
+   ctx->tmr.function = timerfd_tmrproc;
+   if (texp.tv64 != 0)
+   hrtimer_start(&ctx->tmr, texp, htmode);
+}
+
+static int timerfd_release(struct inode *inode, struct file *file)
+{
+   struct timerfd_ctx *ctx = file->private_data;
+
+   hrtimer_cancel(&ctx->tmr);
+   kfree(ctx);
+   return 0;
+}
+
+static unsigned int timerfd_poll(struct file *file, poll_table *wait)
+{
+   struct timerfd_ctx *ctx = file->private_data;
+   unsigned int events = 0;
+   unsigned long flags;
+
+   poll_wait(file, &ctx->wqh, wait);
+
+   spin_lock_irqsave(&ctx->lock, flags);
+   if (ctx->expired)
+   events |= POLLIN;
+   spin_unlock_irqrestore(&ctx->lock, flags);
+
+   return events;
+}
+
+static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
+   loff_t *ppos)
+{
+   struct timerfd_ctx *ctx = file->private_data;
+   ssize_t res;
+   u32 ticks = 0;
+   DECLARE_WAITQUEUE(wait, current);
+
+   if (count < sizeof(ticks))
+   return -EINVAL;
+   

[PATCH] SLAB: Mention slab name when listing corrupt objects

2007-04-02 Thread David Howells
Mention the slab name when listing corrupt objects.  Although the function
that released the memory is mentioned, that is frequently ambiguous as such
functions often release several pieces of memory.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 mm/slab.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 57f7aa4..4cbac24 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1802,8 +1802,8 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
/* Print header */
if (lines == 0) {
printk(KERN_ERR
-   "Slab corruption: start=%p, len=%d\n",
-   realobj, size);
+   "Slab corruption: %s start=%p, len=%d\n",
+   cachep->name, realobj, size);
print_objinfo(cachep, objp, 0);
}
/* Hexdump the affected line */



[patch 8/13] signal/timer/event fds v10 - timerfd wire up x86_64 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the timerfd system call to the x86_64 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S  2007-04-02 15:06:34.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S   2007-04-02 15:06:40.0 -0700
@@ -720,4 +720,5 @@
.quad sys_getcpu
.quad sys_epoll_pwait
.quad sys_signalfd  /* 320 */
+   .quad sys_timerfd
 ia32_syscall_end:  
Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h   2007-04-02 15:06:34.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h   2007-04-02 15:06:40.0 -0700
@@ -621,8 +621,10 @@
 __SYSCALL(__NR_move_pages, sys_move_pages)
 #define __NR_signalfd  280
 __SYSCALL(__NR_signalfd, sys_signalfd)
+#define __NR_timerfd   281
+__SYSCALL(__NR_timerfd, sys_timerfd)
 
-#define __NR_syscall_max __NR_signalfd
+#define __NR_syscall_max __NR_timerfd
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR



[patch 1/13] signal/timer/event fds v10 - anonymous inode source ...

2007-04-02 Thread Davide Libenzi
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--
This patch adds an anonymous inode source, to be used for files that need
an inode only in order to create a file*. We do not care about having an
inode for each file, and we do not even care about having different names in
the associated dentries (dentry names will be the same for classes of file*).
This allows code reuse, and will be used by epoll, signalfd and timerfd
(and whatever else there'll be).
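
On current Linux kernels, the effect of sharing one anonymous inode and one
dentry name per class of file can be observed from userspace. The sketch below
is only an illustration under stated assumptions: it relies on eventfd(2) being
backed by the anonymous inode facility and on /proc being mounted, and it does
not use anything defined by this patch. The helper names fd_link_name() and
is_anon_inode_fd() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the /proc/self/fd symlink target of "fd" into
 * "buf" as a NUL-terminated string. Returns the length, or -1 on error.
 */
static ssize_t fd_link_name(int fd, char *buf, size_t size)
{
	char path[64];
	ssize_t n;

	assert(buf != NULL && size > 1);

	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	n = readlink(path, buf, size - 1);
	if (n >= 0)
		buf[n] = '\0';
	return n;
}

/*
 * Hypothetical helper: 1 if the descriptor's link name carries the
 * "anon_inode:" prefix, 0 if it does not, -1 if the link cannot be read.
 */
static int is_anon_inode_fd(int fd)
{
	char name[128];

	if (fd_link_name(fd, name, sizeof(name)) < 0)
		return -1;
	return strncmp(name, "anon_inode:", 11) == 0;
}
```

The name after the prefix identifies the class of the file, so every
descriptor of one class reports the same link name.
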



Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.21-rc5.fds/fs/anon_inodes.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/anon_inodes.c   2007-04-01 16:04:32.0 -0700
@@ -0,0 +1,200 @@
+/*
+ *  fs/anon_inodes.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi 
+ *
+ *  Thanks to Arnd Bergmann for code review and suggestions.
+ *  More changes for Thomas Gleixner suggestions.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+static struct vfsmount *anon_inode_mnt __read_mostly;
+static struct inode *anon_inode_inode;
+static const struct file_operations anon_inode_fops;
+
+static int anon_inodefs_get_sb(struct file_system_type *fs_type, int flags,
+  const char *dev_name, void *data,
+  struct vfsmount *mnt)
+{
+   return get_sb_pseudo(fs_type, "anon_inode:", NULL, ANON_INODE_FS_MAGIC,
+mnt);
+}
+
+static int anon_inodefs_delete_dentry(struct dentry *dentry)
+{
+   /*
+* We faked vfs to believe the dentry was hashed when we created it.
+* Now we restore the flag so that dput() will work correctly.
+*/
+   dentry->d_flags |= DCACHE_UNHASHED;
+   return 1;
+}
+
+static struct file_system_type anon_inode_fs_type = {
+   .name   = "anon_inodefs",
+   .get_sb = anon_inodefs_get_sb,
+   .kill_sb= kill_anon_super,
+};
+static struct dentry_operations anon_inodefs_dentry_operations = {
+   .d_delete   = anon_inodefs_delete_dentry,
+};
+
+/**
+ * anon_inode_getfd - creates a new file instance by hooking it up to an
+ *                    anonymous inode, and a dentry that describes the "class"
+ *                    of the file
+ *
+ * @pfd:     [out]   pointer to the file descriptor
+ * @pinode:  [out]   pointer to the inode
+ * @pfile:   [out]   pointer to the file struct
+ * @name:    [in]    name of the "class" of the new file
+ * @fops     [in]    file operations for the new file
+ * @priv     [in]    private data for the new file (will be file's private_data)
+ *
+ * Creates a new file by hooking it on a single inode. This is useful for files
+ * that do not need to have a full-fledged inode in order to operate correctly.
+ * All the files created with anon_inode_getfd() will share a single inode,
+ * hence saving memory and avoiding code duplication for the file/inode/dentry
+ * setup.
+ */
+int anon_inode_getfd(int *pfd, struct inode **pinode, struct file **pfile,
+const char *name, const struct file_operations *fops,
+void *priv)
+{
+   struct qstr this;
+   struct dentry *dentry;
+   struct inode *inode;
+   struct file *file;
+   int error, fd;
+
+   if (IS_ERR(anon_inode_inode))
+   return -ENODEV;
+   file = get_empty_filp();
+   if (!file)
+   return -ENFILE;
+
+   inode = igrab(anon_inode_inode);
+   if (IS_ERR(inode)) {
+   error = PTR_ERR(inode);
+   goto err_put_filp;
+   }
+
+   error = get_unused_fd();
+   if (error < 0)
+   goto err_iput;
+   fd = error;
+
+   /*
+* Link the inode to a directory entry by creating a unique name
+* using the inode sequence number.
+*/
+   error = -ENOMEM;
+   this.name = name;
+   this.len = strlen(name);
+   this.hash = 0;
+   dentry = d_alloc(anon_inode_mnt->mnt_sb->s_root, &this);
+   if (!dentry)
+   goto err_put_unused_fd;
+   dentry->d_op = &anon_inodefs_dentry_operations;
+   /* Do not publish this dentry inside the global dentry hash table */
+   dentry->d_flags &= ~DCACHE_UNHASHED;
+   d_instantiate(dentry, inode);
+
+   file->f_path.mnt = mntget(anon_inode_mnt);
+   file->f_path.dentry = dentry;
+   file->f_mapping = inode->i_mapping;
+
+   file->f_pos = 0;
+   file->f_flags = O_RDWR;
+   file->f_op = fops;
+   file->f_mode = FMODE_READ | FMODE_WRITE;
+   file->f_version = 0;
+   file->private_data = priv;
+
+   fd_install(fd, file);
+
+   *pfd = fd;
+   *pinode = inode;
+   *pfile = file;
+   return 0;
+
+err_put_unused_fd:
+   put_unused_fd(fd);
+err_iput:
+   iput(inode);
+err_put_filp:
+   put_filp(file);
+   return error;
+}
+

[patch 4/13] signal/timer/event fds v10 - signalfd wire up x86_64 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the signalfd system call to the x86_64 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h   2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h   2007-04-02 15:06:34.0 -0700
@@ -619,8 +619,10 @@
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_signalfd  280
+__SYSCALL(__NR_signalfd, sys_signalfd)
 
-#define __NR_syscall_max __NR_move_pages
+#define __NR_syscall_max __NR_signalfd
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S  2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S   2007-04-02 15:06:34.0 -0700
@@ -714,9 +714,10 @@
.quad compat_sys_get_robust_list
.quad sys_splice
.quad sys_sync_file_range
-   .quad sys_tee
+   .quad sys_tee   /* 315 */
.quad compat_sys_vmsplice
.quad compat_sys_move_pages
.quad sys_getcpu
.quad sys_epoll_pwait
+   .quad sys_signalfd  /* 320 */
 ia32_syscall_end:  



[PATCH 4/9] AF_RXRPC: Key facility changes for AF_RXRPC

2007-04-02 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
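
As an illustration of the alternative views of type_data, here is a minimal
userspace sketch. It is an assumption-laden mirror of the union shown in the
include/linux/key.h hunk, not kernel code: struct list_head_sketch stands in
for the kernel's struct list_head, and struct key_sketch, key_set_private()
and key_get_private() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (two pointers). */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Mirror of the type_data union with the two added alternatives. */
union type_data_sketch {
	struct list_head_sketch link;	/* original use: list linkage */
	unsigned long x[2];		/* two integers */
	void *p[2];			/* two opaque pointers */
};

/* Hypothetical, heavily reduced key structure. */
struct key_sketch {
	union type_data_sketch type_data;
};

/* A key type that does not need list linkage can store two pointers. */
static void key_set_private(struct key_sketch *key, void *a, void *b)
{
	assert(key != NULL);
	key->type_data.p[0] = a;
	key->type_data.p[1] = b;
}

static void *key_get_private(const struct key_sketch *key, int idx)
{
	assert(key != NULL && (idx == 0 || idx == 1));
	return key->type_data.p[idx];
}
```

Because the alternatives overlay the same storage, a given key type should
pick one view and use it consistently.
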

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents" for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 */
union {
struct list_headlink;
+   unsigned long   x[2];
+   void*p[2];
} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[patch 13/13] signal/timer/event fds v10 - KAIO eventfd support example ...

2007-04-02 Thread Davide Libenzi
ChangeLog:

v10 - Added the "aio_flags" field (in place of the old "aio_reserved3")
  and introduced a new IOCB_FLAG_RESFD flag to tell that the
  "aio_resfd" field is valid.

--
This is an example of how to add eventfd support to the current KAIO code,
in order to enable KAIO to post readiness events to a pollable fd
(hence compatible with POSIX select/poll). The KAIO code simply signals
the eventfd fd when events are ready, and this triggers a POLLIN in the fd.
This patch uses a reserved-for-future-use member of the struct iocb to pass
an eventfd file descriptor, which KAIO will use to post events every time
a request completes. At that point, an io_getevents() will return the
completed result to a struct io_event.
I made a quick test program to verify the patch, and it runs fine here:

http://www.xmailserver.org/eventfd-aio-test.c

The test program uses poll(2), but it would, of course, work with select and
epoll too.
This makes it possible to schedule both block I/O and requests on other
poll-able devices, and to wait for the results using select/poll/epoll.
In a typical scenario, an application would submit KAIO requests using
io_submit(), and would also use epoll_ctl() on the whole other class of
devices (which, with the addition of signals, timers and user events, is now
pretty much complete), and then would:

epoll_wait(...);
for_each_event {
        if (curr_event_is_kaiofd) {
                io_getevents();
                dispatch_aio_events();
        } else {
                dispatch_epoll_event();
        }
}
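
The two userspace halves of this scheme can be sketched in C. This is a
minimal sketch under stated assumptions: it uses struct iocb, IOCB_FLAG_RESFD
and the aio_resfd field as found in <linux/aio_abi.h> on current kernels,
plus eventfd(2) and epoll from glibc. The actual submission with io_submit(2)
and reaping with io_getevents(2) are left out, and the helper names
prep_pwrite_resfd() and wait_completions() are hypothetical.

```c
#include <assert.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Fill an iocb for a write whose completion should signal eventfd "efd". */
static void prep_pwrite_resfd(struct iocb *cb, int fd, int efd,
			      const void *buf, size_t len, long long off)
{
	assert(cb != NULL);

	memset(cb, 0, sizeof(*cb));
	cb->aio_fildes = fd;
	cb->aio_lio_opcode = IOCB_CMD_PWRITE;
	cb->aio_buf = (uint64_t)(uintptr_t)buf;
	cb->aio_nbytes = len;
	cb->aio_offset = off;
	cb->aio_flags = IOCB_FLAG_RESFD;	/* marks aio_resfd as valid */
	cb->aio_resfd = efd;
}

/*
 * Wait on an epoll set containing only "efd" and return the eventfd
 * counter, i.e. the number of completions signalled since the last read.
 * Returns -1 on error or timeout.
 */
static long long wait_completions(int efd, int timeout_ms)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
	struct epoll_event out;
	uint64_t count = 0;
	long long ret = -1;
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;

	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0 &&
	    epoll_wait(epfd, &out, 1, timeout_ms) == 1 &&
	    out.data.fd == efd &&
	    read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count))
		ret = (long long)count;

	close(epfd);
	return ret;
}
```

In a real event loop the epoll set would also hold sockets, signalfd and
timerfd descriptors, and the counter returned here would tell the caller how
many results to fetch with io_getevents(2).
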



Signed-off-by: Davide Libenzi 



- Davide



Index: linux-2.6.21-rc5.fds/fs/aio.c
===
--- linux-2.6.21-rc5.fds.orig/fs/aio.c  2007-04-02 15:06:11.0 -0700
+++ linux-2.6.21-rc5.fds/fs/aio.c   2007-04-02 15:06:47.0 -0700
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -421,6 +422,7 @@
req->private = NULL;
req->ki_iovec = NULL;
INIT_LIST_HEAD(&req->ki_run_list);
+   req->ki_eventfd = ERR_PTR(-EINVAL);
 
/* Check if the completion queue has enough free space to
 * accept an event from this io.
@@ -462,6 +464,8 @@
 {
assert_spin_locked(&ctx->ctx_lock);
 
+   if (!IS_ERR(req->ki_eventfd))
+   fput(req->ki_eventfd);
if (req->ki_dtor)
req->ki_dtor(req);
if (req->ki_iovec != &req->ki_inline_vec)
@@ -946,6 +950,14 @@
return 1;
}
 
+   /*
+* Check if the user asked us to deliver the result through an
+* eventfd. The eventfd_signal() function is safe to be called
+* from IRQ context.
+*/
+   if (!IS_ERR(iocb->ki_eventfd))
+   eventfd_signal(iocb->ki_eventfd, 1);
+
info = &ctx->ring_info;
 
/* add a completion event to the ring buffer.
@@ -1530,8 +1542,7 @@
ssize_t ret;
 
/* enforce forwards compatibility on users */
-   if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2 ||
-iocb->aio_reserved3)) {
+   if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2)) {
pr_debug("EINVAL: io_submit: reserve field set\n");
return -EINVAL;
}
@@ -1555,6 +1566,19 @@
fput(file);
return -EAGAIN;
}
+   if (iocb->aio_flags & IOCB_FLAG_RESFD) {
+   /*
+* If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
+* instance of the file* now. The file descriptor must be
+* an eventfd() fd, and will be signaled for each completed
+* event using the eventfd_signal() function.
+*/
+   req->ki_eventfd = eventfd_fget((int) iocb->aio_resfd);
+   if (unlikely(IS_ERR(req->ki_eventfd))) {
+   ret = PTR_ERR(req->ki_eventfd);
+   goto out_put_req;
+   }
+   }
 
req->ki_filp = file;
ret = put_user(req->ki_key, &user_iocb->aio_key);
Index: linux-2.6.21-rc5.fds/include/linux/aio.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/aio.h   2007-04-02 15:06:11.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/aio.h   2007-04-02 15:06:47.0 -0700
@@ -119,6 +119,12 @@
 
struct list_headki_list;/* the aio core uses this
 * for cancellation */
+
+   /*
+* If the aio_resfd field of the userspace iocb is not zero,
+* this is the underlying file* to deliver event to.
+*/
+   struct file *ki_eventfd;
 };
 
 #define is_sync_kiocb(iocb)((iocb)->ki_key == KIOCB_SYNC_KEY)
Index: linux-2.6.21-rc5.fds/include/linux/aio_abi.h

[patch 12/13] signal/timer/event fds v10 - eventfd wire up x86_64 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the eventfd system call to the x86_64 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S  2007-04-02 15:06:40.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S   2007-04-02 15:06:46.0 -0700
@@ -721,4 +721,5 @@
.quad sys_epoll_pwait
.quad sys_signalfd  /* 320 */
.quad sys_timerfd
+   .quad sys_eventfd
 ia32_syscall_end:  
Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h   2007-04-02 15:06:40.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h   2007-04-02 15:06:46.0 -0700
@@ -623,8 +623,10 @@
 __SYSCALL(__NR_signalfd, sys_signalfd)
 #define __NR_timerfd   281
 __SYSCALL(__NR_timerfd, sys_timerfd)
+#define __NR_eventfd   282
+__SYSCALL(__NR_eventfd, sys_eventfd)
 
-#define __NR_syscall_max __NR_timerfd
+#define __NR_syscall_max __NR_eventfd
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR



[patch 7/13] signal/timer/event fds v10 - timerfd wire up i386 arch ...

2007-04-02 Thread Davide Libenzi
This patch wires the timerfd system call to the i386 architecture.



Signed-off-by: Davide Libenzi 


- Davide



Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S  2007-04-02 15:06:33.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S   2007-04-02 15:06:39.0 -0700
@@ -320,3 +320,4 @@
.long sys_getcpu
.long sys_epoll_pwait
.long sys_signalfd  /* 320 */
+   .long sys_timerfd
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 15:06:33.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h  2007-04-02 15:06:39.0 -0700
@@ -326,10 +326,11 @@
 #define __NR_getcpu318
 #define __NR_epoll_pwait   319
 #define __NR_signalfd  320
+#define __NR_timerfd   321
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 321
+#define NR_syscalls 322
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR



[PATCH 3/9] AF_RXRPC: Make it possible to merely try to cancel timers and delayed work

2007-04-02 Thread David Howells
Export try_to_del_timer_sync() for use by the RxRPC module.

Add a try_to_cancel_delayed_work() so that it is possible to merely attempt to
cancel a delayed work timer.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 include/linux/workqueue.h |   21 +
 kernel/timer.c|2 ++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..40a61ae 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,4 +204,25 @@ static inline int cancel_delayed_work(struct delayed_work *work)
return ret;
 }
 
+/**
+ * try_to_cancel_delayed_work - Try to kill pending scheduled, delayed work
+ * @work: the work to cancel
+ *
+ * Try to kill off a pending schedule_delayed_work().
+ * - The timer may still be running afterwards, and if so, the work may still
+ *   be pending
+ * - Returns -1 if timer still active, 1 if timer removed, 0 if not scheduled
+ * - Can be called from the work routine; if it's still pending, just return
+ *   and it'll be called again.
+ */
+static inline int try_to_cancel_delayed_work(struct delayed_work *work)
+{
+   int ret;
+
+   ret = try_to_del_timer_sync(&work->timer);
+   if (ret > 0)
+   work_release(&work->work);
+   return ret;
+}
+
 #endif
diff --git a/kernel/timer.c b/kernel/timer.c
index 440048a..ba4d6e0 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



[PATCH 1/9] AF_RXRPC: Add blkcipher accessors for using kernel data directly

2007-04-02 Thread David Howells
Add blkcipher accessors for using kernel data directly without the use of
scatter lists.

Also add a CRYPTO_ALG_DMA algorithm capability flag to permit or deny the use
of DMA and hardware accelerators.  A hardware accelerator may not be used to
access any arbitrary piece of kernel memory lest it not be in a DMA'able
region.  Only software algorithms may do that.

If kernel data is going to be accessed directly, then CRYPTO_ALG_DMA must, for
instance, be passed in the mask of crypto_alloc_blkcipher(), but not the type.

This is used by AF_RXRPC to do quick encryptions, where the size of the data
being encrypted or decrypted is 8 bytes or, occasionally, 16 bytes (ie: one or
two chunks only), and since these data are generally on the stack they may be
split over two pages.  Because they're so small, and because they may be
misaligned, setting up a scatter-gather list is overly expensive.  It is very
unlikely that a hardware FCrypt PCBC engine will be encountered (there is not,
as far as I know, any such thing), and even if one is encountered, the
setup/teardown costs for such small transactions will almost certainly be
prohibitive.

Encrypting and decrypting whole packets, on the other hand, is done through the
scatter-gather list interface as the amount of data is sufficient that the
expense of doing virtual address to page calculations is sufficiently small by
comparison.
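
For readers unfamiliar with PCBC, the chaining over one or two 8-byte chunks
can be sketched in plain C on ordinary memory, with no scatter-gather setup.
This is a minimal sketch under stated assumptions: toy_encrypt() and
toy_decrypt() are a made-up, insecure stand-in for the block cipher (they are
not FCrypt), and all function names here are hypothetical. Only the chaining
rule is the point: each block is XORed with the running IV before encryption,
and the next IV is the XOR of that block's plaintext and ciphertext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t rotl64(uint64_t v, unsigned int r)
{
	return (v << r) | (v >> (64 - r));
}

static uint64_t rotr64(uint64_t v, unsigned int r)
{
	return (v >> r) | (v << (64 - r));
}

/* Toy invertible 64-bit "cipher": illustration only, not secure. */
static uint64_t toy_encrypt(uint64_t block, uint64_t key)
{
	return rotl64(block ^ key, 13);
}

static uint64_t toy_decrypt(uint64_t block, uint64_t key)
{
	return rotr64(block, 13) ^ key;
}

/* PCBC encryption over 8-byte blocks; dst may alias src (in place). */
static void pcbc_encrypt(uint64_t *dst, const uint64_t *src, size_t nblocks,
			 uint64_t key, uint64_t iv)
{
	assert(dst != NULL && src != NULL);

	for (size_t i = 0; i < nblocks; i++) {
		uint64_t plain = src[i];	/* read before any overwrite */
		uint64_t cipher = toy_encrypt(plain ^ iv, key);

		dst[i] = cipher;
		iv = plain ^ cipher;		/* propagate both texts */
	}
}

/* PCBC decryption: the exact inverse of pcbc_encrypt(). */
static void pcbc_decrypt(uint64_t *dst, const uint64_t *src, size_t nblocks,
			 uint64_t key, uint64_t iv)
{
	assert(dst != NULL && src != NULL);

	for (size_t i = 0; i < nblocks; i++) {
		uint64_t cipher = src[i];
		uint64_t plain = toy_decrypt(cipher, key) ^ iv;

		dst[i] = plain;
		iv = plain ^ cipher;
	}
}
```

Both routines read each source block before writing the destination, so the
in-place case (dst == src) works the same way as the two-buffer case.
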

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 crypto/blkcipher.c |2 +
 crypto/pcbc.c  |   62 +
 include/linux/crypto.h |  118 
 3 files changed, 181 insertions(+), 1 deletions(-)

diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index b5befe8..4498b2d 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -376,6 +376,8 @@ static int crypto_init_blkcipher_ops(struct crypto_tfm *tfm, u32 type, u32 mask)
crt->setkey = setkey;
crt->encrypt = alg->encrypt;
crt->decrypt = alg->decrypt;
+   crt->encrypt_kernel = alg->encrypt_kernel;
+   crt->decrypt_kernel = alg->decrypt_kernel;
 
addr = (unsigned long)crypto_tfm_ctx(tfm);
addr = ALIGN(addr, align);
diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index 5174d7f..fa76111 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -126,6 +126,36 @@ static int crypto_pcbc_encrypt(struct blkcipher_desc *desc,
return err;
 }
 
+static int crypto_pcbc_encrypt_kernel(struct blkcipher_desc *desc,
+ u8 *dst, const u8 *src,
+ unsigned int nbytes)
+{
+   struct blkcipher_walk walk;
+   struct crypto_blkcipher *tfm = desc->tfm;
+   struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+   struct crypto_cipher *child = ctx->child;
+   void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+   BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+  CRYPTO_ALG_DMA);
+
+   if (nbytes == 0)
+   return 0;
+
+   memset(&walk, 0, sizeof(walk));
+   walk.src.virt.addr = (u8 *) src;
+   walk.dst.virt.addr = (u8 *) dst;
+   walk.nbytes = nbytes;
+   walk.total = nbytes;
+   walk.iv = desc->info;
+
+   if (walk.src.virt.addr == walk.dst.virt.addr)
+   nbytes = crypto_pcbc_encrypt_inplace(desc, &walk, child, xor);
+   else
+   nbytes = crypto_pcbc_encrypt_segment(desc, &walk, child, xor);
+   return 0;
+}
+
 static int crypto_pcbc_decrypt_segment(struct blkcipher_desc *desc,
   struct blkcipher_walk *walk,
   struct crypto_cipher *tfm,
@@ -211,6 +241,36 @@ static int crypto_pcbc_decrypt(struct blkcipher_desc *desc,
return err;
 }
 
+static int crypto_pcbc_decrypt_kernel(struct blkcipher_desc *desc,
+ u8 *dst, const u8 *src,
+ unsigned int nbytes)
+{
+   struct blkcipher_walk walk;
+   struct crypto_blkcipher *tfm = desc->tfm;
+   struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+   struct crypto_cipher *child = ctx->child;
+   void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+   BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+   CRYPTO_ALG_DMA);
+
+   if (nbytes == 0)
+   return 0;
+
+   memset(&walk, 0, sizeof(walk));
+   walk.src.virt.addr = (u8 *) src;
+   walk.dst.virt.addr = (u8 *) dst;
+   walk.nbytes = nbytes;
+   walk.total = nbytes;
+   walk.iv = desc->info;
+
+   if (walk.src.virt.addr == walk.dst.virt.addr)
+   nbytes = crypto_pcbc_decrypt_inplace(desc, &walk, child, xor);
+   else
+   nbytes = crypto_pcbc_decrypt_segment(desc, &walk, child, xor);
+   return 0;
+}
+
 static void xor_byte(u8 *a, const u8 *b, unsigned int bs)
 {
do {
@@ -313,6 +373,8 @@ static struct crypto_instance 
