Re: [PATCH net-next 0/7] sctp: network namespace support Part 2: per net tunables
From: ebied...@xmission.com (Eric W. Biederman)
Date: Tue, 07 Aug 2012 10:17:02 -0700

> Since I am motivated to get things done, and since there has been much
> grumbling about my patches not implementing tunables, I have added
> tunable support on top of my last patchset.
>
> I have performed basic testing on these patches and nothing appears
> amiss.
>
> The sm state machine is a major tease, as it has all of these
> association and endpoint pointers in the common set of function
> parameters that turn out to be NULL at the most inconvenient times.
> So I added to the common parameter list a struct net pointer that is
> never NULL.

Now that I have the ACKs from Vlad, I'm applying all of your work, thanks Eric.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return
I really wish I could have nailed this down better, but I've had a hard
time reliably reproducing the problem during bisection, and I haven't
seen anyone report a similar-sounding problem.

Here's what I've seen: since 3.5 I've been having spurious delays on my
nfs clients, noticeable particularly when I open an mbox file in mutt
over an nfs v4 mount from a v3.5 or later server. The servers I've
reproduced this on are all uni-proc 32-bit systems... but then I haven't
tried SMP or 64-bit systems yet, so it may or may not exist there. When
the delay occurs, it's quite noticeable. I've never seen one that takes
less than 40 seconds to unstick.

I wrote a quick and dirty reproduction tool, based on the syscalls mutt
was doing that triggered the problem, attached to this message. To use
it, compile the file as utime-test on an exported volume, then execute
with

  (cd /some/mount/point && strace -T ./utime-test)

from an nfs4 client. For whatever reason, I frequently find the second
call to utime takes an irritatingly long time to return and I see
something like:

  utime("utime-test.c", [2012/08/14-22:47:21, 2012/08/14-17:25:21]) = 0 <70.510913>

in the strace output.

I've reproduced this on Debian Squeeze / nfs-utils 1.2.2 based servers
(legacy idmapper, no user-space nfsidmap), as well as Debian Wheezy /
nfs-utils 1.2.6 (uses keyutils upcalls) servers, so I doubt it's a
user-space related issue...

Attempts to bisect have been muddled. I'll keep trying in the interim,
but the best I've been able to pin things down is that the issue was
probably introduced in the 419f4319495043a9507ac3e616be9ca60af09744
merge. I can't repro on a kernel based on
fb21affa49204acd409328415b49bfe90136653c. (I say "based on" because I
have to apply the patch from
http://marc.info/?l=linux-nfs&m=133950479803025 or face additional
problems.)

I'll try to get full rpcdebug traces on client and server as the delay
is occurring, in the hopes that helps pin things down, and post them
separately.
-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <utime.h>
#include <stdio.h>

#define F_PATH "utime-test.c"

int main()
{
    struct stat sb;
    struct utimbuf ub;
    int fd;

    /* First pass: keep mtime, bump atime to "now". */
    if (stat(F_PATH, &sb) == -1) {
        perror(NULL);
        return -1;
    }
    ub.modtime = sb.st_mtime;
    ub.actime = time(NULL);
    if (utime(F_PATH, &ub) == -1) {
        perror(NULL);
        return -2;
    }

    /* Open and close the file read-only, as mutt does. */
    if ((fd = open(F_PATH, O_RDONLY)) == -1) {
        perror(NULL);
        return -3;
    }
    close(fd);

    /* Second pass: this utime() is the call that stalls. */
    if (stat(F_PATH, &sb) == -1) {
        perror(NULL);
        return -1;
    }
    ub.modtime = sb.st_mtime;
    ub.actime = time(NULL);
    if (utime(F_PATH, &ub) == -1) {
        perror(NULL);
        return -2;
    }

    return 0;
}
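For anyone who wants the delay reported without reading strace -T output, the following is a minimal sketch of the same stat()/utime() step with the call timed in-process. The helper name timed_utime() and the use of clock_gettime(CLOCK_MONOTONIC) are assumptions made for illustration; they are not part of the attachment above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <utime.h>

/*
 * Hypothetical helper (not in the original reproducer): repeat the
 * stat() + utime() step on 'path' and store the wall-clock seconds
 * spent inside utime() in '*elapsed'.
 * Returns 0 on success, -1 if stat() or utime() fails.
 */
static int timed_utime(const char *path, double *elapsed)
{
    struct stat sb;
    struct utimbuf ub;
    struct timespec t0, t1;
    int rc;

    assert(path != NULL && elapsed != NULL);

    if (stat(path, &sb) == -1)
        return -1;

    /* Same update as the reproducer: keep mtime, set atime to now. */
    ub.modtime = sb.st_mtime;
    ub.actime = time(NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = utime(path, &ub);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *elapsed = (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return rc;
}
```

Calling it twice around an open()/close() pair, as the reproducer does, and printing the second elapsed value should show the same stall that strace reports, assuming the stall really is inside the utime() call.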
Re: [PATCH 2/2] iommu/tegra: smmu: Use __debugfs_create_dir
Hi,

Thank you for the review. I have already sent another version of the
patch (v2: *1), but I have tried to answer the remaining questions inline.

On Wed, 8 Aug 2012 17:31:02 +0200
Felipe Balbi <ba...@ti.com> wrote:

> Hi,
>
> On Wed, Aug 08, 2012 at 06:11:29PM +0300, Felipe Balbi wrote:
> > Hi,
> >
> > On Wed, Aug 08, 2012 at 09:24:33AM +0300, Hiroshi Doyu wrote:
> > > The commit c3b1a35 "debugfs: make sure that debugfs_create_file()
> > > gets used only for regulars" doesn't allow debugfs_create_file()
> > > to be used for a directory. Use the version with data,
> > > __debugfs_create_dir().
> > >
> > > Signed-off-by: Hiroshi Doyu <hd...@nvidia.com>
> > > Reported-by: Laxman Dewangan <ldewan...@nvidia.com>
> > > ---
> > >  drivers/iommu/tegra-smmu.c |4 +---
> > >  1 files changed, 1 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> > > index 5e51fb7..41aff7a 100644
> > > --- a/drivers/iommu/tegra-smmu.c
> > > +++ b/drivers/iommu/tegra-smmu.c
> > > @@ -1035,9 +1035,7 @@ static void smmu_debugfs_create(struct smmu_device *smmu)
> > >      int i;
> > >      struct dentry *root;
> > >
> > > -    root = debugfs_create_file(dev_name(smmu->dev),
> > > -                               S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
> > > -                               NULL, smmu, NULL);
> > > +    root = __debugfs_create_dir(dev_name(smmu->dev), NULL, smmu);
> >
> > why would the directory need extra data ? Looking in mainline,
> > tegra-smmu.c doesn't have any debugfs support, where can I see the
> > patches adding debugfs to tegra-smmu ? It doesn't look correct that
> > the directory will have a data field.
>
> Looking at linux-next I found the commit which added it:

FYI: The original tegra smmu debugfs patch is:
http://lists.linuxfoundation.org/pipermail/iommu/2012-August/004507.html

> @@ -892,6 +909,164 @@ static struct iommu_ops smmu_iommu_ops = {
>      .pgsize_bitmap = SMMU_IOMMU_PGSIZES,
>  };
>
> +/* Should be in the order of enum */
> +static const char * const smmu_debugfs_mc[] = { "mc", };
> +static const char * const smmu_debugfs_cache[] = { "tlb", "ptc", };
> +
> +static ssize_t smmu_debugfs_stats_write(struct file *file,
> +                                        const char __user *buffer,
> +                                        size_t count, loff_t *pos)
> +{
> +    struct smmu_device *smmu;
> +    struct dentry *dent;
> +    int i, cache, mc;
> +    enum {
> +        _OFF = 0,
> +        _ON,
> +        _RESET,
> +    };
> +    const char * const command[] = {
> +        [_OFF]  = "off",
> +        [_ON]   = "on",
> +        [_RESET]= "reset",
> +    };
> +    char str[] = "reset";
> +    u32 val;
> +    size_t offs;
> +
> +    count = min_t(size_t, count, sizeof(str));
> +    if (copy_from_user(str, buffer, count))
> +        return -EINVAL;
> +
> +    for (i = 0; i < ARRAY_SIZE(command); i++)
> +        if (strncmp(str, command[i],
> +                    strlen(command[i])) == 0)
> +            break;
> +
> +    if (i == ARRAY_SIZE(command))
> +        return -EINVAL;
> +
> +    dent = file->f_dentry;
> +    cache = (int)dent->d_inode->i_private;
>
> cache you can figure out by the filename. In fact, it would be much
> better to have tlb and ptc directories with a status filename which you
> write ON/OFF/RESET or enable/disable/reset to trigger what you need.

Actually I also considered {ptc,tlb} directories, but I thought that
those might be redundant, or would add one more level of nesting
unnecessarily. The current usage is:

  $ echo reset > /sys/kernel/debug/smmu/mc/{tlb,ptc}
  $ echo on > /sys/kernel/debug/smmu/mc/{tlb,ptc}
  $ echo off > /sys/kernel/debug/smmu/mc/{tlb,ptc}
  $ cat /sys/kernel/debug/smmu/mc/{tlb,ptc}
  hit:0014910c miss:00014d22

The above format is:

  hit:<HIT count><SPC>miss:<MISS count><SPC><CR+LF>

  fscanf(fp, "hit:%lx miss:%lx", &hit, &miss);

If {ptc,tlb} were directories, the usage would be:

  $ echo reset > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
  $ echo on > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
  $ echo off > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
  $ cat /sys/kernel/debug/smmu/mc/{tlb,ptc}/data
  hit:0014910c miss:00014d22

One advantage of directories could be that you can check the current
state by reading "status". In practice it would rarely be read back,
because if the write succeeded, the state should be what was written.

> For that to work, you should probably hold tlb and ptc on an array of
> some sort, and pass those as data to their respective status files as
> the data field. If tlb and ptc are properly defined structures which
> can point back to smmu, then you have everything you need.

I also considered introducing a new structure like what you suggested
below, but I felt that the parent-child relationships are already in the
directory hierarchy, and I wanted to avoid the redundant data that comes
with introducing new structures. Instead of introducing a new structure,
those parent-child relationships are always obtained from the debugfs
directory
Re: [REGRESION] Suspend hangs with 3.6-rc1 on Lenovo T60 notebook
Hi,

On 08/15/2012 07:13 AM, Miklos Szeredi wrote:
> Suspend oopses in generic_ide_suspend() because dev_get_drvdata()
> returns NULL (dev->p->driver_data == NULL) and this function is not
> prepared for this.
>
> I bisected it to 0998d063 (device-core: Ensure drvdata = NULL when no
> driver is bound). Reverting it fixes suspend.

First of all, thanks for reporting and bisecting this. With that said,
I must say that this is very weird. The patch in question:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=0998d063

only makes dev->drvdata NULL in 2 cases:
1) The probe method of the driver fails
2) The driver has been detached from the device by calling one of:
   device_release_driver() or driver_detach()

Note that in both code paths dev->driver also gets set to NULL, and
other generic ide driver callbacks very much depend on that not being
NULL, i.e.:

static int generic_ide_remove(struct device *dev)
{
    ide_drive_t *drive = to_ide_device(dev);
    struct ide_driver *drv = to_ide_driver(dev->driver);

    if (drv->remove)
        drv->remove(drive);

    return 0;
}

Also, how can a driver's suspend callback get called if dev->driver is
NULL? That callback would normally be reached through dev->driver, so
something weird is going on here ...

I hope one of the ide guys can shed some light on this.

Regards,

Hans
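To make the failure mode in the report concrete, here is a small user-space sketch of a suspend-style callback that tolerates a NULL drvdata pointer instead of dereferencing it. All names (mock_device, mock_drive, mock_get_drvdata, mock_suspend) are hypothetical stand-ins, not kernel APIs, and this is not a fix anyone proposed in the thread; as noted above, the open question is why the callback is reached at all with no driver bound.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device: only the drvdata slot. */
struct mock_device {
    void *driver_data;
};

/* Hypothetical stand-in for the per-drive state a driver stores. */
struct mock_drive {
    int suspended;
};

/* Mirrors the idea of dev_get_drvdata(): may legitimately return NULL. */
static void *mock_get_drvdata(const struct mock_device *dev)
{
    assert(dev != NULL);
    return dev->driver_data;
}

/*
 * A suspend-style callback that checks for NULL drvdata before use.
 * Returns 0 on success, -ENODEV when no driver data is attached.
 */
static int mock_suspend(struct mock_device *dev)
{
    struct mock_drive *drive = mock_get_drvdata(dev);

    if (drive == NULL)
        return -ENODEV;     /* nothing bound: do not dereference */

    drive->suspended = 1;
    return 0;
}
```

Whether returning an error or silently succeeding is the right behaviour for a device with no driver data is a design decision this sketch does not settle.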
Re: [PATCH v2 1/2] drivers/mfd: Add realtek pcie card reader driver
On Tuesday 14 August 2012, wei_w...@realsil.com.cn wrote:
> +
> +void rtsx_pci_start_run(struct rtsx_pcr *pcr)
> +{
> +    /* If pci device removed, don't queue idle work any more */
> +    if (pcr->remove_pci)
> +        return;
> +
> +    if (pcr->state != PDEV_STAT_RUN) {
> +        pcr->state = PDEV_STAT_RUN;
> +        pcr->ops->enable_auto_blink(pcr);
> +    }
> +
> +    mod_timer(&pcr->idle_timer, jiffies + msecs_to_jiffies(200));
> +}
> +EXPORT_SYMBOL_GPL(rtsx_pci_start_run);

One more comment on the mod_timer/queue_work combination: I just saw
that Tejun Heo posted a series to introduce a new mod_delayed_work()
helper. Once that goes in, it would be best to start using it here.

    Arnd
[PATCH] extcon-max8997: remove usage of ret in max8997_muic_handle_charger_type_detach
We can return the error or success value directly in this function
without using ret, so remove the ret variable. This also saves a very
small amount (4 bytes) of stack space in this function.

Signed-off-by: Devendra Naga <develkernel412...@gmail.com>
---

compile tested.

 drivers/extcon/extcon-max8997.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/extcon/extcon-max8997.c b/drivers/extcon/extcon-max8997.c
index ef9090a..77b66b0 100644
--- a/drivers/extcon/extcon-max8997.c
+++ b/drivers/extcon/extcon-max8997.c
@@ -271,8 +271,6 @@ out:
 static int max8997_muic_handle_charger_type_detach(
                 struct max8997_muic_info *info)
 {
-    int ret = 0;
-
     switch (info->pre_charger_type) {
     case MAX8997_CHARGER_TYPE_USB:
         extcon_set_cable_state(info->edev, "USB", false);
@@ -290,11 +288,11 @@ static int max8997_muic_handle_charger_type_detach(
         extcon_set_cable_state(info->edev, "Fast-charger", false);
         break;
     default:
-        ret = -EINVAL;
+        return -EINVAL;
         break;
     }

-    return ret;
+    return 0;
 }

 static int max8997_muic_handle_charger_type(struct max8997_muic_info *info,
-- 
1.7.12.rc2
Re: [Nouveau] [bisected] nouveau: Failed to idle channel x after resume
On Mon, Aug 13, 2012 at 9:49 PM, Maxim Levitsky <maximlevit...@gmail.com> wrote:
> On Mon, 2012-08-13 at 18:22 +0200, Sven Joachim wrote:
>> On 2012-08-08 08:18 +0200, Sven Joachim wrote:
>>> On 2012-08-08 08:08 +0200, Ben Skeggs wrote:
>>>> On Wed, Aug 08, 2012 at 08:00:21AM +0200, Sven Joachim wrote:
>>>>> Not for me on my GeForce 8500 GT, and I still cannot suspend more
>>>>> than once, subsequent attempts fail:
>>>>>
>>>>> ,----
>>>>> | Aug 8 07:49:16 turtle kernel: [ 91.697068] nouveau W[ PGRAPH][:01:00.0][0x0200502d][880037be1d40] parent failed suspend, -16
>>>>> | Aug 8 07:49:16 turtle kernel: [ 91.697078] nouveau [ DRM][:01:00.0] resuming display...
>>>>> `----
>>>>
>>>> Interesting. Were there any messages prior to that?
>>>
>>> Nothing interesting:
>>>
>>> ,----
>>> | Aug 8 07:49:16 turtle kernel: [ 89.655362] nouveau [ DRM][:01:00.0] suspending fbcon...
>>> | Aug 8 07:49:16 turtle kernel: [ 89.655367] nouveau [ DRM][:01:00.0] suspending display...
>>> | Aug 8 07:49:16 turtle kernel: [ 89.696888] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
>>> | Aug 8 07:49:16 turtle kernel: [ 89.696909] nouveau [ DRM][:01:00.0] evicting buffers...
>>> | Aug 8 07:49:16 turtle kernel: [ 89.696913] nouveau [ DRM][:01:00.0] suspending client object trees...
>>> `----
>>>
>>>> I guess the fifo code detected a timeout when trying to save the
>>>> graphics context. I have other patches in my tree (I'll push them
>>>> soon, tied up with other work atm) that might help here.
>>>
>>> Thanks, I'll try them when they are available.
>>
>> With current nouveau master (drm/nouveau: fix find/replace bug in
>> license header) suspending works again, thanks! However, it is a bit
>> slow, taking between two and five seconds:
>>
>> ,----
>> | Aug 13 18:17:56 turtle kernel: [ 678.524814] PM: Syncing filesystems ... done.
>> | Aug 13 18:18:09 turtle kernel: [ 678.639202] Freezing user space processes ... (elapsed 0.01 seconds) done.
>> | Aug 13 18:18:09 turtle kernel: [ 678.649954] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
>> | Aug 13 18:18:09 turtle kernel: [ 678.663298] Suspending console(s) (use no_console_suspend to debug)
>> | Aug 13 18:18:09 turtle kernel: [ 678.680884] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>> | Aug 13 18:18:09 turtle kernel: [ 678.681000] sd 0:0:0:0: [sda] Stopping disk
>> | Aug 13 18:18:09 turtle kernel: [ 678.695141] parport_pc 00:07: disabled
>> | Aug 13 18:18:09 turtle kernel: [ 678.695204] serial 00:06: disabled
>> | Aug 13 18:18:09 turtle kernel: [ 678.695209] serial 00:06: wake-up capability disabled by ACPI
>> | Aug 13 18:18:09 turtle kernel: [ 678.695235] nouveau [ DRM][:01:00.0] suspending fbcon...
>> | Aug 13 18:18:09 turtle kernel: [ 678.695239] nouveau [ DRM][:01:00.0] suspending display...
>> | Aug 13 18:18:09 turtle kernel: [ 678.742111] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
>> | Aug 13 18:18:09 turtle kernel: [ 678.742189] nouveau [ DRM][:01:00.0] evicting buffers...
>> | Aug 13 18:18:09 turtle kernel: [ 682.357319] nouveau [ DRM][:01:00.0] suspending client object trees...
>> | Aug 13 18:18:09 turtle kernel: [ 683.526646] PM: suspend of devices complete after 4863.181 msecs
>> `----
>>
>> With the 3.4.8 kernel, suspending takes little more than one second.
>>
>> Cheers,
>>        Sven
>
> I confirm exactly the same thing.
> Here suspend takes more than 10 seconds:
>
> [ 2165.363878] nouveau [ DRM][:01:00.0] suspending fbcon...
> [ 2165.363885] nouveau [ DRM][:01:00.0] suspending display...
> [ 2165.475791] sd 0:0:0:0: [sda] Stopping disk
> [ 2166.396877] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
> [ 2166.396926] nouveau [ DRM][:01:00.0] evicting buffers...
> [ 2174.809084] nouveau [ DRM][:01:00.0] suspending client object trees...
> [ 2177.950222] nouveau :01:00.0: power state changed by ACPI to D3
>
> Best regards,
>       Maxim Levitsky
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

In my case suspend also takes longer than usual, on the order of 10
seconds.

@Ben: Have you been able to reproduce this?

-- 
Far away from the primal instinct, the song seems to fade away, the
river get wider between your thoughts and the things we do and say.
Re: [PATCH] mtd: kill MTD_NAND_VERIFY_WRITE
Hi Huang,

On Tue, 14 Aug 2012 22:38:45 -0400 Huang Shijie <shij...@gmail.com> wrote:
> diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
> index 588e989..0ca7257 100644
> --- a/drivers/mtd/nand/Kconfig
> +++ b/drivers/mtd/nand/Kconfig
> @@ -22,15 +22,6 @@ menuconfig MTD_NAND
>
>  if MTD_NAND
>
> -config MTD_NAND_VERIFY_WRITE
> -    bool "Verify NAND page writes"
> -    help
> -      This adds an extra check when data is written to the flash. The
> -      NAND flash device internally checks only bits transitioning
> -      from 1 to 0. There is a rare possibility that even though the
> -      device thinks the write was successful, a bit could have been
> -      flipped accidentally due to device wear or something else.
> -

There are some defconfig files which set CONFIG_MTD_NAND_VERIFY_WRITE.
I guess you should submit an accompanying patch that removes
CONFIG_MTD_NAND_VERIFY_WRITE from all defconfig files.

(Also, I trimmed the CC list for this specific discussion, as it seems
unrelated to all of the parties.)

Regards,
Shmulik
Re: [PATCH 2/2] iommu/tegra: smmu: Use __debugfs_create_dir
Hi,

On Wed, Aug 15, 2012 at 09:34:21AM +0300, Hiroshi Doyu wrote:
> > @@ -892,6 +909,164 @@ static struct iommu_ops smmu_iommu_ops = {
> >      .pgsize_bitmap = SMMU_IOMMU_PGSIZES,
> >  };
> >
> > +/* Should be in the order of enum */
> > +static const char * const smmu_debugfs_mc[] = { "mc", };
> > +static const char * const smmu_debugfs_cache[] = { "tlb", "ptc", };
> > +
> > +static ssize_t smmu_debugfs_stats_write(struct file *file,
> > +                                        const char __user *buffer,
> > +                                        size_t count, loff_t *pos)
> > +{
> > +    struct smmu_device *smmu;
> > +    struct dentry *dent;
> > +    int i, cache, mc;
> > +    enum {
> > +        _OFF = 0,
> > +        _ON,
> > +        _RESET,
> > +    };
> > +    const char * const command[] = {
> > +        [_OFF]  = "off",
> > +        [_ON]   = "on",
> > +        [_RESET]= "reset",
> > +    };
> > +    char str[] = "reset";
> > +    u32 val;
> > +    size_t offs;
> > +
> > +    count = min_t(size_t, count, sizeof(str));
> > +    if (copy_from_user(str, buffer, count))
> > +        return -EINVAL;
> > +
> > +    for (i = 0; i < ARRAY_SIZE(command); i++)
> > +        if (strncmp(str, command[i],
> > +                    strlen(command[i])) == 0)
> > +            break;
> > +
> > +    if (i == ARRAY_SIZE(command))
> > +        return -EINVAL;
> > +
> > +    dent = file->f_dentry;
> > +    cache = (int)dent->d_inode->i_private;
> >
> > cache you can figure out by the filename. In fact, it would be much
> > better to have tlb and ptc directories with a status filename which
> > you write ON/OFF/RESET or enable/disable/reset to trigger what you
> > need.
>
> Actually I also considered {ptc,tlb} directories, but I thought that
> those might be redundant, or would add one more level of nesting
> unnecessarily. The current usage is:
>
>   $ echo reset > /sys/kernel/debug/smmu/mc/{tlb,ptc}
>   $ echo on > /sys/kernel/debug/smmu/mc/{tlb,ptc}
>   $ echo off > /sys/kernel/debug/smmu/mc/{tlb,ptc}
>   $ cat /sys/kernel/debug/smmu/mc/{tlb,ptc}
>   hit:0014910c miss:00014d22
>
> The above format is:
>
>   hit:<HIT count><SPC>miss:<MISS count><SPC><CR+LF>

if you're just printing hit and miss count, wouldn't it be a bit more
human-friendly to print it in decimal rather than hex ? no strong
feelings against either way, just thought I'd mention it.

>   fscanf(fp, "hit:%lx miss:%lx", &hit, &miss);
>
> If {ptc,tlb} were directories, the usage would be:
>
>   $ echo reset > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
>   $ echo on > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
>   $ echo off > /sys/kernel/debug/smmu/mc/{tlb,ptc}/status
>   $ cat /sys/kernel/debug/smmu/mc/{tlb,ptc}/data
>   hit:0014910c miss:00014d22
>
> One advantage of directories could be that you can check the current
> state by reading "status". In practice it would rarely be read back,
> because if the write succeeded, the state should be what was written.

sure.

> > For that to work, you should probably hold tlb and ptc on an array
> > of some sort, and pass those as data to their respective status
> > files as the data field. If tlb and ptc are properly defined
> > structures which can point back to smmu, then you have everything
> > you need.
>
> I also considered introducing a new structure like what you suggested
> below, but I felt that the parent-child relationships are already in
> the directory hierarchy, and I wanted to avoid the redundant data that
> comes with introducing new structures. Instead of introducing a new
> structure, those parent-child relationships are always obtained from
> the debugfs directory hierarchy on demand. That was why I wanted to put
> data in the debugfs dir. With debugfs directories having private data,
> the connections between entities would be kept in the filesystem.

fair enough.

> I've already sent another version of the patch (v2, *1), which passes
> all necessary data to a file, put in a structure. This v2 patch may be
> a little bit similar to what Felipe suggested below.

I looked over that, but I'm not sure you should introduce that
smmu_debugfs_info structure. Look at what we do in
drivers/usb/dwc3/debugfs.c: we don't add any extra structures for
debugfs, we use what the driver already has (struct dwc3 only,
currently).

If we were to add debugfs support for each USB endpoint, I would pass
struct dwc3_ep as data for the files. See that I would still be able to
access struct dwc3, should I need it, because struct dwc3_ep knows which
struct dwc3 it belongs to.

That's what I meant when I suggested adding more structures: not
something for debugfs only, but something for the whole driver to use.
Just re-design the driver a little bit and you won't need to allocate
memory when creating debugfs directories, because the data you need is
already available.

-- 
balbi

signature.asc
Description: Digital signature
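The design suggested here (per-cache objects that belong to the driver and point back to their parent, handed to each file as its private data) can be sketched in user space as follows. Every name in it (mock_smmu, mock_cache, mock_cache_init, mock_status_write) is a hypothetical stand-in chosen for illustration; this is not the tegra-smmu or dwc3 code, and the "data" argument merely plays the role of the pointer a debugfs file would carry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical parent object, standing in for the driver's device struct. */
struct mock_smmu {
    const char *name;
    unsigned int stats_enabled;   /* one bit per cache */
};

enum mock_cache_id { MOCK_TLB, MOCK_PTC, MOCK_CACHE_NUM };

/* Per-cache object owned by the driver; it knows which smmu it belongs to. */
struct mock_cache {
    enum mock_cache_id id;
    struct mock_smmu *smmu;       /* back-pointer to the parent */
};

/* Fill the driver-owned array once; no debugfs-only allocation is needed. */
static void mock_cache_init(struct mock_cache caches[MOCK_CACHE_NUM],
                            struct mock_smmu *smmu)
{
    int i;

    assert(caches != NULL && smmu != NULL);
    for (i = 0; i < MOCK_CACHE_NUM; i++) {
        caches[i].id = (enum mock_cache_id)i;
        caches[i].smmu = smmu;
    }
}

/*
 * Stand-in for a "status" write handler. 'data' is the per-file private
 * pointer; the cache identity and the parent both come from it, with no
 * need to inspect file names or directory private data.
 */
static int mock_status_write(void *data, int on)
{
    struct mock_cache *cache = data;

    if (cache == NULL || cache->smmu == NULL)
        return -EINVAL;

    if (on)
        cache->smmu->stats_enabled |= 1u << cache->id;
    else
        cache->smmu->stats_enabled &= ~(1u << cache->id);
    return 0;
}
```

The trade-off discussed in the thread remains: this keeps the relationships in driver structures, whereas the alternative keeps them in the debugfs hierarchy and looks them up on demand.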
Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.
On Tue, 2012-08-14 at 13:50 -0700, Trevor Brandt wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index 3f42cd6..768dc76 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -27,6 +27,13 @@ config IRQ_WORK
>      bool
>      depends on HAVE_IRQ_WORK
>
> +config REALTIME_SCHED
> +    bool "Realtime Scheduler" if EXPERT
> +    default y
> +    help
> +      This option enables support for the realtime scheduler and the
> +      corresponding scheduling classes SCHED_FIFO and SCHED_RR.
> +
>  menu "General setup"
>
>  config EXPERIMENTAL

If you inverted that, it could be a proper default n new feature [1].

However, if weight loss is the goal, why not go whole hog, and create
sched/thin.c containing no lard... or just integrate an existing thin
scheduler as a config option? Whole body replacement is a very radical
diet, but somehow seems less so than chopping off fingers and toes.

-Mike
(that SMP could select to greatly simplify RT)
Re: [PATCH v3 1/2] lis3: add generic DT matching code
On 07-08-12 20:49, Daniel Mack wrote:
> :
> I fixed all these issues now and attached a v4.

Sorry for the late reply, I had read the v3 but didn't find time to send
comments. They are all addressed in v4.

For both [PATCH v4 1/2] and [PATCH v3 2/2], here is my:
Reviewed-by: Éric Piel <eric.p...@tremplin-utc.net>

Cheers,
Éric
Re: [patch net-next v2 00/15] net: introduce upper device lists and remove dev->master
Tue, Aug 14, 2012 at 10:32:53PM CEST, bhutchi...@solarflare.com wrote:
> On Tue, 2012-08-14 at 16:19 -0400, Andy Gospodarek wrote:
>> On Tue, Aug 14, 2012 at 05:05:33PM +0200, Jiri Pirko wrote:
>>> Hi all.
>>>
>>> Recent discussion around
>>> "[net-next] bonding: don't allow the master to become its slave"
>>> forced me to think about upper<->lower device connections.
>>>
>>> This patchset adds a possibility to record upper device linkage.
>>> All upper<->lower devices are converted to use this mechanism right
>>> after. That leads to dev->master removal because this info becomes
>>> redundant since "unique links" have the same value.
>>>
>>> After all changes, it is no longer possible to do:
>>> "bond->someotherdevice->samebond"
>>>
>>> Also I think that drivers like cxgb3, qlcnic, qeth would benefit
>>> from this in future by being able to get more appropriate info about
>>> l3 addresses.
>>>
>>> v1->v2:
>>> - s/unique/master/ better naming + stays closer to the history
>>> - fixed vlan err goto
>>> - original patch 15 (WARN_ON change) is squashed into the first patch
>>> - fixed netdev_unique_upper_dev_get_rcu() case of upper==NULL
>>
>> I just started to review v1 when v2 came out, but luckily the changes
>> were not so significant that I need to start all over.
>>
>> The first note is that I didn't like the use of the term 'upper' --
>> it seems like 'stacked' might be a better alternative as these are
>> stacked devices.
>
> When linking any two devices in a stack, one will be upper and the
> other lower. The lower device might itself be stacked on top of a
> further device, so 'stacked' is not a useful distinguishing adjective
> in variable names. It might be a useful term in the commit messages
> and kernel-doc, though.
>
>> One thing to note is that I don't see any clear changelog that states
>> the current goals for this. You have stated in several places that it
>> will no longer be possible to create bonds of bonds, but there are
>> probably a few more things it might be wise to intentionally outlaw.
>>
>> What about teams of teams? Or teams of bonds? Or bonds of teams?
>> Bonds of vlans?
> [...]
>
> It doesn't disallow bonds of bonds (unless I'm missing something). It
> disallows loops that involve any or all of those types of stacked
> devices.

Exactly. It is every upper driver's responsibility to check which
devices it allows to be added as lower.

> Ben.
>
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
Re: [patch 3/8] procfs: Add ability to plug in auxiliary fdinfo providers
On Wed, Aug 15, 2012 at 01:07:03AM +0100, Al Viro wrote:
> On Wed, Aug 15, 2012 at 02:21:47AM +0400, Cyrill Gorcunov wrote:
> > Hmm, in the very first versions I had been using one ->show method,
> > but then I thought that this does not correlate well with the
> > seq-file idea, where a show/next sequence is called for each record.
> > I'll update (this will for sure make the code simpler, and I'll have
> > to check for seq-file overflow after the seq_printf call so as not to
> > continue printing data for too long if the buffer is already out of
> > space).
> >
> > Al, I'll cook the whole series tomorrow and resend it for review.
> > Also, I guess the new show_fdinfo() member in file_operations should
> > be guarded with CONFIG_PROC_FS, right?
>
> I seriously doubt that it's worth bothering. If somebody cares, they
> can add making it conditional later.

That's what I've been testing, does it look good to you?
---
From: Cyrill Gorcunov <gorcu...@openvz.org>
Subject: procfs: Add ability to plug in auxiliary fdinfo providers

This patch brings the ability to print out auxiliary data associated
with a file in the procfs interface /proc/<pid>/fdinfo/<fd>.

In particular, further patches make eventfd, eventpoll, signalfd and
fsnotify print additional information complete enough to restore these
objects after checkpoint.

To simplify the code we add a show_fdinfo callback into
struct file_operations (as Al proposed).

Signed-off-by: Cyrill Gorcunov <gorcu...@openvz.org>
CC: Pavel Emelyanov <xe...@parallels.com>
CC: Al Viro <v...@zeniv.linux.org.uk>
CC: Alexey Dobriyan <adobri...@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: James Bottomley <jbottom...@parallels.com>
---
 fs/proc/fd.c       |   51 ---
 include/linux/fs.h |3 +++
 2 files changed, 39 insertions(+), 15 deletions(-)

Index: linux-2.6.git/fs/proc/fd.c
===
--- linux-2.6.git.orig/fs/proc/fd.c
+++ linux-2.6.git/fs/proc/fd.c
@@ -15,11 +15,11 @@
 #include "fd.h"

 struct proc_fdinfo {
-    loff_t  f_pos;
-    int     f_flags;
+    struct file *f_file;
+    int         f_flags;
 };

-static int fdinfo_open_helper(struct inode *inode, int *f_flags, struct path *path)
+static int fdinfo_open_helper(struct inode *inode, int *f_flags, struct file **f_file, struct path *path)
 {
     struct files_struct *files = NULL;
     struct task_struct *task;
@@ -49,6 +49,10 @@ static int fdinfo_open_helper(struct ino
                 *path = fd_file->f_path;
                 path_get(&fd_file->f_path);
             }
+            if (f_file) {
+                *f_file = fd_file;
+                get_file(fd_file);
+            }
             ret = 0;
         }
         spin_unlock(&files->file_lock);
@@ -61,28 +65,44 @@ static int fdinfo_open_helper(struct ino
 static int seq_show(struct seq_file *m, void *v)
 {
     struct proc_fdinfo *fdinfo = m->private;
-    seq_printf(m, "pos:\t%lli\nflags:\t0%o\n",
-               (long long)fdinfo->f_pos,
-               fdinfo->f_flags);
-    return 0;
+    int ret;
+
+    ret = seq_printf(m, "pos:\t%lli\nflags:\t0%o\n",
+                     (long long)fdinfo->f_file->f_pos,
+                     fdinfo->f_flags);
+
+    if (!ret && fdinfo->f_file->f_op->show_fdinfo)
+        ret = fdinfo->f_file->f_op->show_fdinfo(m, fdinfo->f_file);
+
+    return ret;
 }

 static int seq_fdinfo_open(struct inode *inode, struct file *file)
 {
-    struct proc_fdinfo *fdinfo = NULL;
-    int ret = -ENOENT;
+    struct proc_fdinfo *fdinfo;
+    struct seq_file *m;
+    int ret;

     fdinfo = kzalloc(sizeof(*fdinfo), GFP_KERNEL);
     if (!fdinfo)
         return -ENOMEM;

-    ret = fdinfo_open_helper(inode, &fdinfo->f_flags, NULL);
-    if (!ret) {
-        ret = single_open(file, seq_show, fdinfo);
-        if (!ret)
-            fdinfo = NULL;
+    ret = fdinfo_open_helper(inode, &fdinfo->f_flags, &fdinfo->f_file, NULL);
+    if (ret)
+        goto err_free;
+
+    ret = single_open(file, seq_show, fdinfo);
+    if (ret) {
+        put_filp(fdinfo->f_file);
+        goto err_free;
     }

+    m = file->private_data;
+    m->private = fdinfo;
+
+    return ret;
+
+err_free:
     kfree(fdinfo);
     return ret;
 }
@@ -92,6 +112,7 @@ static int seq_fdinfo_release(struct ino
     struct seq_file *m = file->private_data;
     struct proc_fdinfo *fdinfo = m->private;

+    put_filp(fdinfo->f_file);
     kfree(fdinfo);

     return single_release(inode, file);
@@ -173,7 +194,7 @@ static const struct dentry_operations ti

 static int proc_fd_link(struct dentry *dentry, struct path *path)
 {
-    return fdinfo_open_helper(dentry->d_inode, NULL, path);
+    return fdinfo_open_helper(dentry->d_inode, NULL, NULL, path);
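The shape of the change in seq_show() above, print the generic pos/flags lines and then call an optional per-file-type hook if one is set, can be sketched in user space like this. The types and names (mock_file, mock_file_ops, mock_show, counter_show) and the "extra:" field are hypothetical, and a stdio FILE stands in for the seq_file; nothing here is the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct mock_file;

/* Hypothetical ops table: show_fdinfo is optional and may be NULL. */
struct mock_file_ops {
    int (*show_fdinfo)(FILE *out, const struct mock_file *f);
};

/* Hypothetical open-file object with a little type-specific state. */
struct mock_file {
    long long pos;
    int flags;
    const struct mock_file_ops *ops;
    unsigned long long counter;
};

/*
 * Print the generic lines first, then hand over to the optional hook.
 * Returns 0 on success and -1 on an output error, so a failure in the
 * generic part skips the hook, as in the patch.
 */
static int mock_show(FILE *out, const struct mock_file *f)
{
    assert(out != NULL && f != NULL);

    if (fprintf(out, "pos:\t%lli\nflags:\t0%o\n",
                f->pos, (unsigned int)f->flags) < 0)
        return -1;

    if (f->ops != NULL && f->ops->show_fdinfo != NULL)
        return f->ops->show_fdinfo(out, f);

    return 0;
}

/* Example provider: a file type that exposes one extra line. */
static int counter_show(FILE *out, const struct mock_file *f)
{
    return fprintf(out, "extra:\t%llu\n", f->counter) < 0 ? -1 : 0;
}
```

File types that set no hook keep the old two-line output, which is what makes the callback cheap to add to an ops table that most users leave untouched.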
[no subject]
Please pull to get these two bug fixes.

Thanks!

The following changes since commit 1a9b4993b70fb1884716902774dc9025b457760d:

  Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus (2012-08-01 16:47:15 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master

for you to fetch changes up to 2856cc2e4d0852c3ddaae9dcb19cb9396512eb08:

  sparc64: Be less verbose during vmemmap population. (2012-08-15 00:37:29 -0700)

David S. Miller (1):
      sparc64: Be less verbose during vmemmap population.

Jiri Kosina (1):
      sparc64: do not clobber personality flags in sys_sparc64_personality()

 arch/sparc/kernel/sys_sparc_64.c | 10 +-
 arch/sparc/mm/init_64.c          | 28 +++-
 2 files changed, 28 insertions(+), 10 deletions(-)
Re: [patch net-next v2 01/15] net: introduce upper device lists
Wed, Aug 15, 2012 at 12:33:44AM CEST, bhutchi...@solarflare.com wrote: On Tue, 2012-08-14 at 17:05 +0200, Jiri Pirko wrote: This lists are supposed to serve for storing pointers to all upper devices. Eventually it will replace dev-master pointer which is used for bonding, bridge, team but it cannot be used for vlan, macvlan where there might be multiple upper present. In case the upper link is replacement for dev-master, it is marked with master flag. Something I found interesting is that the dev-master pointer and now netdev_master_upper_dev_get{,_rcu}() are hardly used by the stackled drivers that set the master. They also have to set an rx_handler on the lower device (which is itself mutually exclusive) which gets its own context pointer (rx_handler_data). Instead, the master pointer is mostly used by device drivers to find out about a bridge or bonding device above *their* devices. And that seems to work only for those specific device drivers, not e.g. openvswitch or team. I wonder if we could find a better way to encapsulate the things they want do do, in a later step (not holding up this change!). Yes. I was thinking about this as well. I believe that we should follow up with this. [...] +static int __netdev_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev, bool master) +{ + struct netdev_upper *upper; + + ASSERT_RTNL(); + + if (dev == upper_dev) + return -EBUSY; + /* +* To prevent loops, check if dev is not upper device to upper_dev. +*/ + if (__netdev_has_upper_dev(upper_dev, dev, true)) + return -EBUSY; [...] I think we will also need to limit the depth of the device stack so we don't run out of stack space here. __netif_receive() implements a kind of tail recursion whenever a packet is passed up, but __netdev_has_upper_dev() can't avoid doing real recursion (without the addition of a flag to net_device so it can mark its progress). You are probably right. I'm not sure how to handle this correctly though. 
Adding a hard limit might not be correct. The problem could also be resolved by adding another struct list_head to struct netdev_upper and using it inside __netdev_has_upper_dev(). But that does not seem right to me either (considering that walking through the tree might in future be done under _rcu). Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked.
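To make the stack-depth concern concrete, here is a minimal userspace sketch of the loop check done without recursion, using a bounded explicit stack. It is not the code from the patch: struct dev, MAX_UPPERS, MAX_STACK_DEPTH, has_upper_dev() and upper_dev_link() are hypothetical stand-ins, and the choice of -ELOOP for "stack too deep" is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_UPPERS 4       /* hypothetical fan-out limit for this model */
#define MAX_STACK_DEPTH 8  /* hypothetical cap on the device stack depth */

/* Userspace stand-in for a network device; not struct net_device. */
struct dev {
	const char *name;
	struct dev *upper[MAX_UPPERS];
	size_t n_upper;
};

/*
 * Walk the upper links of dev with an explicit, bounded stack.
 * Returns 1 if target is above dev, 0 if it is not, and -ELOOP if the
 * walk would need more than MAX_STACK_DEPTH levels.
 */
static int has_upper_dev(const struct dev *dev, const struct dev *target)
{
	const struct dev *node[MAX_STACK_DEPTH];
	size_t next[MAX_STACK_DEPTH];
	size_t depth = 0;

	node[0] = dev;
	next[0] = 0;

	for (;;) {
		const struct dev *cur = node[depth];

		if (next[depth] < cur->n_upper) {
			const struct dev *up = cur->upper[next[depth]++];

			if (up == target)
				return 1;
			if (depth + 1 == MAX_STACK_DEPTH)
				return -ELOOP; /* refuse to go deeper */
			depth++;
			node[depth] = up;
			next[depth] = 0;
		} else {
			if (depth == 0)
				return 0; /* every path explored */
			depth--;
		}
	}
}

/* Link upper_dev above dev unless that would create a loop. */
static int upper_dev_link(struct dev *dev, struct dev *upper_dev)
{
	int ret;

	if (dev == upper_dev)
		return -EBUSY;
	/* To prevent loops, refuse if dev is already above upper_dev. */
	ret = has_upper_dev(upper_dev, dev);
	if (ret < 0)
		return ret;
	if (ret)
		return -EBUSY;
	if (dev->n_upper == MAX_UPPERS)
		return -ENOSPC;
	dev->upper[dev->n_upper++] = upper_dev;
	return 0;
}
```

The fixed-size arrays trade generality for a known worst-case footprint, which is the property the discussion is after; whether a hard limit is acceptable is exactly the open question in the thread.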
Re: [PATCH V2 2/2] apple_gmux: Add support for newer hardware
On 08/14/2012 07:11 PM, Bernhard Froemel wrote: Need to play around further.. I think I solved the communication problems concerning byte writes to the gmux device. This: http://luna.vmars.tuwien.ac.at/~froemel/rmbp/patch-apple-gmux_v2.txt works reliably for me (50+ suspend/resume cycles, many brightness changes) *without* delays. I looked once more through Apple's original driver and noticed that their cmd_done waits until bit 1 is set (not cleared!) and also does the final read from GMUX_PORT_DPM_RADDR only if bit 1 is not set. Also it seems that in case of byte writes the old interface should be followed as well (Apple, why?!). Bernhard
[GIT] Sparc
From: David Miller da...@davemloft.net Date: Wed, 15 Aug 2012 00:44:57 -0700 (PDT) Please pull to get these two bug fixes. Thanks! Sorry, I botched the original Subject, fixed now. The following changes since commit 1a9b4993b70fb1884716902774dc9025b457760d: Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus (2012-08-01 16:47:15 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git master for you to fetch changes up to 2856cc2e4d0852c3ddaae9dcb19cb9396512eb08: sparc64: Be less verbose during vmemmap population. (2012-08-15 00:37:29 -0700) David S. Miller (1): sparc64: Be less verbose during vmemmap population. Jiri Kosina (1): sparc64: do not clobber personality flags in sys_sparc64_personality() arch/sparc/kernel/sys_sparc_64.c | 10 +- arch/sparc/mm/init_64.c | 28 +++- 2 files changed, 28 insertions(+), 10 deletions(-)
Re: [PATCH v2] SubmittingPatches: clarify SOB tag usage when evolving submissions
On Sun, Aug 12, 2012 at 4:49 PM, Rob Landley r...@landley.net wrote: The analogy I made was with a magazine editor, fighting off Sturgeon's law in the slush pile, cherry-picking a few submissions to polish up and include in the next issue of the magazine. In this context, a personalized rejection letter to a new author is actually encouragement, and the editor's only authority is veto power with bounceback negotiation. "I won't accept this, but if you change it like so then maybe..." Neat analogy! Developers wishing to contribute changes to the evolution of a second patch submission must supply their own Signed-off-by tag to the original authors and must submit their changes on a public mailing list or ensure that these submissions are recorded somewhere publicly. Should != must. Agreed. I believe must is good here. To date a few of these types of contributors have expressed different preferences for whether or not their own SOB tag should be used for a second code submission. Let's keep things simple and only require the contributor's SOB tag if so desired explicitly. It is not technically required if there already is a public record of their contribution somewhere. Heh. "technically required". As if there's a process separate from the people implementing it. It depends on the web of trust. It's all subjective but if we want to be safe we err on the side of caution, all in fluffy theory. In practice obviously it does not matter -- unless someone ends up fighting for something they really care about. Speaking of which, did anybody ever explicitly document the four-level developer - maintainer - lieutenant - architect thing, and how each level owes you a _response_? No, and in fact I actually think our Signed-off-by language could be strengthened if we had a bit more meaning for how valuable a Signed-off-by tag is for each of these. Right now, it's pooo. --- This v2 has Singed/Signed typo fixes.
Documentation/SubmittingPatches | 15 +++ 1 file changed, 15 insertions(+)
You realize this is a political document as much as technical, right?
No. I did not think about that actually. All I wanted to do when I wrote this patch is end a common type of disagreement I have seen over the years with different developers taking different positions on whether or not their Signed-off-by should be used if they sent a small patch to a not-yet-upstream driver. It's really quite simple so best is to document the simplest way for us to evolve IMHO. I really do not give a rat's ass about the political nature of this. I just want us to get work done faster and more efficiently without spending energy on stupid questions like this one.
Making those longer and more specific is seldom a good idea.
I agree. Whoever decides the worthiness of this will decide whether or not this merits integration. I don't give a shit, if not merged at least the patch is out there and we've talked about it. In this case though I think it would help us evolve faster.
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index c379a2a..3154565 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -366,6 +366,21 @@ and protect the submitter from complaints. Note that under no circumstances can you change the author's identity (the From header), as it is the one which appears in the changelog.
+If you are submitting a large change (for example a new driver) at times
+you may be asked to make quite a lot of modifications prior to getting
+your change accepted. At times you may even receive patches from developers
+who not only wish to tell you what you should change to get your changes
+upstream but actually send you patches. If those patches were made publicly
+and they do contain a Signed-off-by tag you are not expected to provide
I would add a comma: "tag," but for a patch that attempts to clarify, I don't find it very helpful.
+their own Signed-off-by tag on the second iteration of the patch so long
+as there is a public record somewhere that can be used to show the
+contributor had sent their changes with their own Signed-off-by tag.
Are you expecting another SCO, or is this just the standard bureaucratic "once a procedure is in place we must continue to elaborate it until it describes approved methods of breathing"? We should not have a document which claims a correct way kernel developers should wipe their asses.
No. But if we throw feces at each other, perhaps a document would help explain what is fair play.
The signed-off-by was a way of saying "I claim to be authorized to submit this code, so if you find out later it's plagiarized you can blame me." Having someone to blame makes lawyers happy, and we were being sued by a troll at the time.
True.
As long as the mechanism's there, additional whatevered-by lines provide an easy "who do I cc if I bisect a bug to this patch and want answers".
RE: [PATCH v7 0/8] Raid: enable talitos xor offload for improving performance
-----Original Message----- From: dan.j.willi...@gmail.com [mailto:dan.j.willi...@gmail.com] On Behalf Of Dan Williams Sent: Wednesday, August 15, 2012 4:02 AM To: Liu Qiang-B32616 Cc: dan.j.willi...@intel.com; vinod.k...@intel.com; a...@arndb.de; herb...@gondor.apana.org.au; gre...@linuxfoundation.org; linuxppc-d...@lists.ozlabs.org; linux-kernel@vger.kernel.org; linux-cry...@vger.kernel.org; Ira W. Snyder Subject: Re: [PATCH v7 0/8] Raid: enable talitos xor offload for improving performance On Tue, Aug 14, 2012 at 2:04 AM, Liu Qiang-B32616 b32...@freescale.com wrote: Hi Vinod, Would you like to apply this series (from patch 2/8 to 7/8) in your tree? The links are below: http://patchwork.ozlabs.org/patch/176023/ http://patchwork.ozlabs.org/patch/176024/ http://patchwork.ozlabs.org/patch/176025/ http://patchwork.ozlabs.org/patch/176026/ http://patchwork.ozlabs.org/patch/176027/ http://patchwork.ozlabs.org/patch/176028/ Hi, sorry for the recent silence, I've been transitioning and am now just catching up. I'll take a look and then it's fine for these to go through Vinod's tree. Hello Dan, Please review, this issue has been going on for many years. I hope we can fix it this time. Thanks. -- Dan
Re: yama_ptrace_access_check(): possible recursive locking detected
On Tue, 2012-08-14 at 22:56 -0700, Kees Cook wrote: So Oleg's suggestion of removing the locking around the reading of ->comm is wrong since it really does need the lock. There's tons of code reading comm without locking.. you're saying that all is broken?
Re: [PATCH 2/2] s390: Always use long for ssize_t to match size_t
On Sun, Aug 12, 2012 at 12:01:34PM +0200, Geert Uytterhoeven wrote: On s390x-linux-gcc, __SIZE_TYPE__ expands to long unsigned int for both 32-bit s390 and 64-bit s390x, as gcc-4.6.3-nolibc/s390x-linux/lib/gcc/s390x-linux/4.6.3/plugin/include/config/s390/linux.h has
#define SIZE_TYPE (TARGET_64BIT ? "long unsigned int" : "long unsigned int")
To match this, __kernel_size_t is always set to long unsigned int. But while __kernel_ssize_t is long on 64-bit s390x, it is int on 32-bit s390, causing compiler warnings like:
fs/quota/quota_tree.c:372:4: warning: format '%zd' expects argument of type 'signed size_t', but argument 4 has type 'ssize_t' [-Wformat]
To fix this, __kernel_ssize_t should be long, irrespective of word size. Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org
Applied. Thanks Geert!
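The warning above is about type identity rather than width: int and long can both be 32 bits wide and '%zd' still complains. As a rough userspace illustration (not kernel code), the sketch below uses C11 _Generic to assert at compile time that ssize_t is the signed counterpart of size_t. The macro name SIGNED_MATCHES_UNSIGNED and the helper format_count() are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Evaluates to 1 when the type of s is the signed counterpart of the
 * type of u (int/unsigned int, long/unsigned long, ...), else 0.
 * Hypothetical helper for this sketch; requires a C11 compiler.
 */
#define SIGNED_MATCHES_UNSIGNED(s, u) _Generic((u),			\
	unsigned int:       _Generic((s), int: 1, default: 0),		\
	unsigned long:      _Generic((s), long: 1, default: 0),	\
	unsigned long long: _Generic((s), long long: 1, default: 0),	\
	default: 0)

/* Fails to compile on a target where ssize_t and size_t are mismatched. */
_Static_assert(SIGNED_MATCHES_UNSIGNED((ssize_t)0, (size_t)0),
	       "ssize_t is not the signed counterpart of size_t");

/* Format a signed byte count with %zd; returns what snprintf() returns. */
static int format_count(char *buf, size_t len, ssize_t count)
{
	return snprintf(buf, len, "%zd", count);
}
```

With the mismatch described in the message (size_t as unsigned long, ssize_t as int), the static assertion in this sketch would be expected to fire even though both types have the same size.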
Re: [RFC 0/2] USB gadget - configfs
On Tue, Jul 10, 2012 at 10:54:44AM +0200, Andrzej Pietrasiewicz wrote: Dear Joel, Thank you for your review. @Sebastian, Alan, Felipe: Thank you, too. On Monday, July 02, 2012 11:09 AM Joel Becker wrote: snip As a prerequisite it adds an operation to configfs. The operation allows checking if it is ok to remove a pseudo directory corresponding to a configfs item/group. I NAK'd that patch because you should be using configfs_depend_item(). If you have trouble with that, let's talk. Now I see the configfs_depend_item() is the way to go. I am in doubt, though, so could you please throw some light on it? Here is why: As an example I did a quick-and-dirty port of f_mass_storage to the new, configfs-based approach. The business logic of this function is that once a lun is opened, it must not be changed (deleted, in particular) until it is closed. The moment the lun is opened is defined by a write to a configfs file attribute of a lun config item: +-/lunX | | | +-file | | | +-nofua | | | +-removable | | | +-ro So, the config item corresponding to the lun becomes depended on during the write file operation, the same with undepend. Can this be expressed with configfs_depend/undepend_item()? Your code in fs/configfs/dir.c contains a warning not to call the configfs_depend_item() from a configfs callback. In this case, is store_attribute a configfs callback? Hey Andrzej, I'm sorry it took me so long to write back. I wanted a chance to read and understand your code, so I could make some intelligent comments. But first, a small aside. Rather than filp_open() a filename within the kernel, with all of its attendant state problems, you should just take a file descriptor. You can then let userspace permissions and other things Just Work. See fs/ocfs2/heartbeat.c:o2hb_region_dev_write() for an example of o2hb doing exactly this. It takes a file descriptor and fgets() the filp. The process writing the fd can actually close the file right after; the heartbeat code has its reference. 
Let's continue with that example. Just like your code, when o2hb is given its fd, it starts up the heartbeat infrastructure. So not only does it hold a reference on the filp, it is starting threads and such. However, it also does not block deletion of the object. If you rmdir() the config_item, it will shut down the threads and drop the filp. This is analogous to your problem. What happens if you remove a heartbeat device underneath a running ocfs2 filesystem? Why, it must crash! We don't want that, so when the ocfs2 filesystem is mounted, it uses configfs_depend_item() to pin that heartbeat device. This is what I mean by pinning outside the configfs callback. Yes, attribute store is a callback. So what should you do? This is where my understanding of your setup logic fails me. At first I thought fsg_bind_function() was the right place, because it is where you expect the LUNs to already be configured. But it is, in turn, called underneath another configfs callback (ufg_gadget_grp_store_connect()). Can you help me understand the userspace steps that are used to set up a gadget? The way I read the code, there is some software in the gadget that sets up the LUN mappings; that is, the host has no idea lun01 is backed by a file named foo. So, if you had a gadget that just exposed a single LUN, it would have some userspace software at startup that sets fua=1, removable=0, ro=0, file=foo. At some future point, the host connects to the gadget. At this point, lun01 is connected to the host, and it had better not disappear. What part of the code reacts to the host connect? This is the open of the LUN where I think you should be locking out. Joel -- Only a life lived for others is a life worth while. 
-Albert Einstein http://www.jlbec.org/ jl...@evilplan.org
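The pinning pattern Joel describes (a user of an item takes a dependency outside the attribute-store callback, and removal is refused while any dependency is held) can be modelled in a few lines of userspace C. This is only a sketch of the idea: struct item and the item_*() helpers are hypothetical and do not mirror the signatures of configfs_depend_item() or configfs_undepend_item().

```c
#include <errno.h>

/* Userspace stand-in for a configurable object such as a LUN. */
struct item {
	const char *name;
	unsigned int pins;   /* number of users that depend on the item */
	int removed;         /* set once the item has been torn down */
};

/* Pin the item, e.g. when the host connects or a filesystem mounts. */
static int item_depend(struct item *it)
{
	if (it->removed)
		return -ENOENT;
	it->pins++;
	return 0;
}

/* Drop a pin taken earlier with item_depend(). */
static int item_undepend(struct item *it)
{
	if (it->pins == 0)
		return -EINVAL;
	it->pins--;
	return 0;
}

/* Model of rmdir on the item: refused while anyone depends on it. */
static int item_rmdir(struct item *it)
{
	if (it->pins > 0)
		return -EBUSY;
	it->removed = 1;
	return 0;
}
```

In this model the writer of the "file" attribute would not call item_depend() at all; the pin is taken at the later point where the item must not disappear, which is the open question at the end of the message.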
Re: [PATCH 6/7] HID: picoLCD: disable version check during probe
On Mon, 30 Jul 2012, Bruno Prémont wrote: Commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] introduced new locking around probe/remove functions that prevents any report/reply from hardware from reaching the driver until it has returned from probe. As such, the ask-reply way of checking the picoLCD firmware version during probe is bound to time out and let probe fail. Disabling the check lets the driver successfully probe again. Signed-off-by: Bruno Prémont bonb...@linux-vserver.org
---
drivers/hid/hid-picolcd_core.c | 8 1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/hid/hid-picolcd_core.c b/drivers/hid/hid-picolcd_core.c
index 2d7ef68..42d0791 100644
--- a/drivers/hid/hid-picolcd_core.c
+++ b/drivers/hid/hid-picolcd_core.c
@@ -478,13 +478,13 @@ static int picolcd_probe_lcd(struct hid_device *hdev, struct picolcd_data *data)
 {
 	int error;
-	error = picolcd_check_version(hdev);
+/*	error = picolcd_check_version(hdev);
 	if (error)
 		return error;
 	if (data->version[0] != 0 && data->version[1] != 3)
 		hid_info(hdev, "Device with untested firmware revision, please submit /sys/kernel/debug/hid/%s/rdesc for this device.\n",
-			 dev_name(&hdev->dev));
+			 dev_name(&hdev->dev)); */
Please just remove it altogether, I don't see a reason to keep the commented-out code in the in-tree driver. Once the locking mess is sorted out, we can re-introduce it again as necessary. Thanks. -- Jiri Kosina SUSE Labs
Re: [PATCH 15/21] net sched: Pass the skb into change so it can access NETLINK_CB
On Mon, 2012-08-13 at 13:18 -0700, Eric W. Biederman wrote: From: Eric W. Biederman ebied...@xmission.com cls_flow.c plays with uids and gids. Unless I misread that code it is possible for classifiers to depend on the specific uid and gid values. Therefore I need to know the user namespace of the netlink socket that is installing the packet classifiers. Pass in the rtnetlink skb so I can access the NETLINK_CB of the passed packet. In particular I want access to sk_user_ns(NETLINK_CB(in_skb).ssk). Pass in not the user namespace but the incoming rtnetlink skb into the classifier change routines as that is generally the more useful parameter. Cc: Jamal Hadi Salim j...@mojatatu.com Acked-by: Serge Hallyn serge.hal...@canonical.com Signed-off-by: Eric W. Biederman ebied...@xmission.com Acked-by: Jamal Hadi Salim j...@mojatatu.com cheers, jamal
Re: [tip:auto-latest 18/37] kernel/sched/core.c:6460:1: error: 'SD_PREFER_LOCAL' undeclared (first use in this function)
On Wed, 2012-08-15 at 08:57 +0800, Alex Shi wrote: Sorry for this mistake! The following is the fixing patch. Thanks!
Re: [PATCH v3 1/2] lis3: add generic DT matching code
On 15.08.2012 09:13, Éric Piel wrote: On 07-08-12 20:49, Daniel Mack wrote: I fixed all these issues now and attached a v4. Sorry for the late reply, I had read the v3 but didn't find time to send comments. They are all addressed in v4. For both [PATCH v4 1/2] and [PATCH v3 2/2], here is my: Reviewed-by: Éric Piel eric.p...@tremplin-utc.net Thanks! Daniel
Re: [discussion]sched: a rough proposal to enable power saving in scheduler
On Tue, 2012-08-14 at 15:35 +0800, Alex Shi wrote: Any comments on this rough proposal, especially on the assumptions? Let me read it first ;-)
Re: [PATCH] SubmittingPatches: clarify SOB tag usage when evolving submissions
I think it would be nice to have another tag for people who fix bugs in the original patch. The Reviewed-by tag implies approval of the whole patch and anyway reviewers don't normally comment unless they see a bug. Maybe something like:
Contributor: Your Name em...@address.com
So the tags for developers would be:
Signed-off-by: The patch went through you. Legal responsibility.
Acked-by: You know what you are talking about and approve.
Reviewed-by: You reviewed the patch and approve.
Contributor: You noticed or fixed a bug in the patch.
regards, dan carpenter
Re: [PATCH 0/7] HID: picoLCD updates
On Mon, 30 Jul 2012, Bruno Prémont wrote: Hi, This series updates picoLCD driver: - split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later) - drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago - fix various bugs issues - disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver] I have now applied the series to my 'picolcd' branch, except for 6/7, please see the comment I sent to it separately. Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g. Would be nice to have this sorted out before the next merge window indeed, so that it can go in together with the rest of the changes. [ 6383.521833] = [ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free [ 6383.530020] - [ 6383.530020] [ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080 [ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50 [ 6383.530020] [ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b [ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. 
[ 6383.530020] Redzone cef511d0: bb bb bb bb [ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a [ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2 [ 6383.530020] Call Trace: [ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130 [ 6383.530020] [c10bd415] object_err+0x35/0x40 [ 6383.530020] [c10be809] free_debug_processing+0x99/0x200 [ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0 [ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110 [ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd] [ 6383.530020] [c10bfbda] kfree+0xfa/0x110 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120 [ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0 [ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30 [ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd] [ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd] [ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd] [ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0 [ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80 [ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0 [ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90 [ 6383.530020] [c12821e5] driver_bind+0x75/0xd0 [ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90 [ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30 [ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0 [ 6383.530020] [c10c794c] vfs_write+0x9c/0x130 [ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160 [ 6383.530020] [c1114a40] ? 
sysfs_poll+0x90/0x90 [ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70 [ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26 So I am wondering whether the path this happens on is if (!test_bit(HID_OUT_RUNNING, usbhid-iofl)) { usbhid_restart_out_queue(usbhid); in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling. Could you please stick some printk() just before and after this usbhid_restart_out_queue() call, so that we know that it's this triggering it? -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.5-rc6 configfs BUG_ON
On Sun, Jul 15, 2012 at 07:58:31PM -0400, Dave Jones wrote: On Sun, Jul 15, 2012 at 01:58:46PM -0700, Joel Becker wrote:
Call Trace:
[811ec74f] d_kill+0xaf/0x120
[811ec8a2] dput+0xe2/0x1d0
[811dfaea] path_put+0x1a/0x30
[811d9705] vfs_fstatat+0x55/0x70
[811d973e] vfs_lstat+0x1e/0x20
[811d99ca] sys_newlstat+0x1a/0x40
[81696aed] system_call_fastpath+0x1a/0x1f
Code: f6 43 40 01 75 eb 48 8b 7b 50 e8 41 4d f6 ff 48 8b 3d c2 bf 7f 01 48 89 de e8 a2 52 f6 ff 4c 89 e7 e8 1a b5 f9 ff 5b 41 5c 5d c3 0f 0b be 9c 00 00 00 48 c7 c7 30 a2 9e 81 e8 42 2e e1 ff eb a7
RIP [812562fb] configfs_d_iput+0x8b/0xa0
RSP 88014325be08
---[ end trace 65e130035db36f7d ]---
58 if (sd) {
59 BUG_ON(sd->s_dentry != dentry);
Dave, What were you doing at the time? Is this ocfs2, dlm, target, or what? Joel
See http://codemonkey.org.uk/projects/trinity/ Running that with -c newlstat should trigger it. (load configfs module first, no further config necessary)
Cool tool. I presume that -c newlstat just randomly picks paths to newlstat. I think I see what's happening, but I can't see what gets the dirent into that state. Now to get a VM I can break. Joel -- The trouble with being punctual is that nobody's there to appreciate it. - Franklin P. Jones http://www.jlbec.org/ jl...@evilplan.org
Re: [PATCH] MIPS: fix module.c build for 32 bit
Rusty Russell ru...@rustcorp.com.au wrote: For a build fix Linus hasn't pulled the asm-generic cleanup patch yet - you missed the merge window, I think. Jonas detected the problem in linux-next. David
Re: [PATCH] HID: multitouch: Add support for eGalax 0x73f7
On Thu, 9 Aug 2012, Thierry Reding wrote: Signed-off-by: Thierry Reding thierry.red...@avionic-design.de --- drivers/hid/hid-ids.h| 1 + drivers/hid/hid-multitouch.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 5a91bf6..9614a65 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -270,6 +270,7 @@ #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_72FA0x72fa #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_73020x7302 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_73490x7349 +#define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_73F70x73f7 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_A0010xa001 #define USB_VENDOR_ID_ELECOM 0x056e diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 59c8b5c..115dca2 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -885,6 +885,9 @@ static const struct hid_device_id mt_devices[] = { USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_7349) }, { .driver_data = MT_CLS_EGALAX_SERIAL, MT_USB_DEVICE(USB_VENDOR_ID_DWAV, + USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_73F7) }, + { .driver_data = MT_CLS_EGALAX_SERIAL, + MT_USB_DEVICE(USB_VENDOR_ID_DWAV, USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_A001) }, /* Elo TouchSystems IntelliTouch Plus panel */ Applied, thanks Thierry. -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/1] HID:hid-multitouch: Add ELAN production request when resume.
On Tue, 14 Aug 2012, Scott Liu wrote: Add ELAN production request when resume It would be nice to have some more explanation in the changelog (some variation of the comment in mt_resume() just before usb_control_msg() is issued should be sufficient). Thanks. Signed-off-by: Scott Liu scott@emc.com.tw Suggested-off-by: Benjamin Tissoires benjamin.tissoi...@enac.fr --- drivers/hid/hid-multitouch.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 59c8b5c..e824c37 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -767,6 +767,32 @@ static int mt_reset_resume(struct hid_device *hdev) mt_set_input_mode(hdev); return 0; } + +static int mt_resume(struct hid_device *hdev) +{ + struct usb_interface *intf; + struct usb_host_interface *interface; + struct usb_device *dev; + + if (hdev-bus != BUS_USB) + return 0; + + intf = to_usb_interface(hdev-dev.parent); + interface = intf-cur_altsetting; + dev = hid_to_usb_dev(hdev); + + /* Some Elan legacy devices require SET_IDLE to be set on resume. + * It should be safe to send it to other devices too. + * Tested on 3M, Stantum, Cypress, Zytronic, eGalax, and Elan panels. */ + + usb_control_msg(dev, usb_sndctrlpipe(dev, 0), + HID_REQ_SET_IDLE, + USB_TYPE_CLASS | USB_RECIP_INTERFACE, + 0, interface-desc.bInterfaceNumber, + NULL, 0, USB_CTRL_SET_TIMEOUT); + + return 0; +} #endif static void mt_remove(struct hid_device *hdev) @@ -1092,6 +1118,7 @@ static struct hid_driver mt_driver = { .event = mt_event, #ifdef CONFIG_PM .reset_resume = mt_reset_resume, + .resume = mt_resume, #endif }; -- 1.7.9.5 -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] crypto: twofish - add x86_64/avx assembler implementation
Quoting Johannes Goetzfried johannes.goetzfr...@informatik.stud.uni-erlangen.de: This patch adds an x86_64/avx assembler implementation of the Twofish block cipher. The implementation processes eight blocks in parallel (two 4 block chunk AVX operations). The table-lookups are done in general-purpose registers. For small blocksizes the 3way-parallel functions from the twofish-x86_64-3way module are called. A good performance increase is provided for blocksizes greater than or equal to 128B. Patch has been tested with tcrypt and automated filesystem tests. Tcrypt benchmark results: Intel Core i5-2500 CPU (fam:6, model:42, step:7) I started thinking about the performance on AMD Bulldozer. vmovq/vmovd/vpextr*/vpinsr* between FPU and general-purpose registers on AMD CPUs is a lot slower (latencies from 8 to 12 cycles) than on Intel Sandy Bridge (where these instructions have a latency of 1 to 2). See: http://www.agner.org/optimize/instruction_tables.pdf It would be really good if the implementation could be tested on an AMD CPU to determine whether it causes a performance regression. However, I don't have access to a machine with such a CPU. -Jussi
Re: [PATCH v2] iio: adc: add new lp8788 adc driver
On 08/10/2012 09:06 AM, Kim, Milo wrote:
> [...]
> +	switch (mask) {
> +	case IIO_CHAN_INFO_RAW:
> +		*val = result;
> +		return IIO_VAL_INT;
> +	case IIO_CHAN_INFO_SCALE:
> +		*val = adc_const[id] * ((result * 1000 + 500) / 1000);

This looks wrong. The IIO_CHAN_INFO_SCALE attribute is the factor by which IIO_CHAN_INFO_RAW needs to be multiplied to get the value in the proper unit, which is specified in the IIO ABI spec (e.g. millivolts for voltages). What you return here seems to be the IIO_CHAN_INFO_PROCESSED attribute, which is basically raw * scale.

> +		*val2 = 0;
> +		return IIO_VAL_INT_PLUS_MICRO;
> +	default:
> +		break;
> +	}
> +
> +err:
> +	return -EINVAL;
> +}
> +
> [...]
> +}
> +
> +static struct iio_chan_spec lp8788_adc_channels[] = {

const

> +	[LPADC_VBATT_5P5] = LP8788_CHAN(VBATT_5P5, IIO_VOLTAGE),
> +	[LPADC_VIN_CHG] = LP8788_CHAN(VIN_CHG, IIO_VOLTAGE),
> +	[LPADC_IBATT] = LP8788_CHAN(IBATT, IIO_CURRENT),
> +	[LPADC_IC_TEMP] = LP8788_CHAN(IC_TEMP, IIO_TEMP),
> +	[LPADC_VBATT_6P0] = LP8788_CHAN(VBATT_6P0, IIO_VOLTAGE),
> +	[LPADC_VBATT_5P0] = LP8788_CHAN(VBATT_5P0, IIO_VOLTAGE),
> +	[LPADC_ADC1] = LP8788_CHAN(ADC1, IIO_VOLTAGE),
> +	[LPADC_ADC2] = LP8788_CHAN(ADC2, IIO_VOLTAGE),
> +	[LPADC_VDD] = LP8788_CHAN(VDD, IIO_VOLTAGE),
> +	[LPADC_VCOIN] = LP8788_CHAN(VCOIN, IIO_VOLTAGE),
> +	[LPADC_VDD_LDO] = LP8788_CHAN(VDD_LDO, IIO_VOLTAGE),
> +	[LPADC_ADC3] = LP8788_CHAN(ADC3, IIO_VOLTAGE),
> +	[LPADC_ADC4] = LP8788_CHAN(ADC4, IIO_VOLTAGE),
> +};
> +
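To illustrate the raw/scale/processed relationship the review describes, here is a minimal userspace sketch. It is not driver code: the function name `iio_processed_micro` and its parameters are hypothetical, and it only assumes that a scale reported as IIO_VAL_INT_PLUS_MICRO means `val + val2 / 1000000`.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the IIO API.
 *
 * scale_int and scale_micro mirror the (val, val2) pair a driver would
 * report for IIO_CHAN_INFO_SCALE with IIO_VAL_INT_PLUS_MICRO, i.e.
 * scale = scale_int + scale_micro / 1000000.
 *
 * Returns raw * scale expressed in micro-units, which is what a
 * "processed" value corresponds to.
 */
static int64_t iio_processed_micro(int raw, int scale_int, int scale_micro)
{
	int64_t scale = (int64_t)scale_int * 1000000 + scale_micro;

	return (int64_t)raw * scale;
}
```

With this split, the SCALE attribute stays a constant per channel and user space does the multiplication; a driver that multiplies by the raw result itself is effectively reporting the processed value instead.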
Re: [PATCH v7 1/4] mm: introduce compaction and migration for virtio ballooned pages
On Tue, Aug 14, 2012 at 05:00:49PM -0300, Rafael Aquini wrote:
> On Tue, Aug 14, 2012 at 10:35:25PM +0300, Michael S. Tsirkin wrote:
> > > > > +/* __isolate_lru_page() counterpart for a ballooned page */
> > > > > +bool isolate_balloon_page(struct page *page)
> > > > > +{
> > > > > +	if (WARN_ON(!movable_balloon_page(page)))
> > > >
> > > > Looks like this actually can happen if the page is leaked between previous movable_balloon_page and here.
> > > >
> > > > > +		return false;
> > >
> > > Yes, it surely can happen, and it does not harm to catch it here, print a warn and return.
> >
> > If it is legal, why warn? For that matter why test here at all?
>
> As this is a public symbol, and despite the usage we introduce is sane, the warn was placed as an insurance policy to let us know about any insane attempt to use the procedure in the future. That was due to a nice review nitpick, actually.
>
> Even though the code already had a test to properly avoid this race you mention, I thought that sustaining the warn was a good thing. As I told you, despite real, I've never got (un)lucky enough to stumble across that race window while testing the patch.
>
> If your concern is about being too much verbose on logging, under certain conditions, perhaps we can change that test to a WARN_ON_ONCE() ?
>
> Mel, what are your thoughts here?

I viewed it as being defensive programming. VM_BUG_ON would be less useful as it can be compiled out. If the race can be routinely hit then multiple warnings is instructive in itself. I have no strong feelings about this though. I see little harm in making the check but in light of this conversation add a short comment explaining that the check should be redundant.

-- 
Mel Gorman
SUSE Labs
Re: [PATCH] mtd: kill MTD_NAND_VERIFY_WRITE
On 2012-08-15 15:06, Shmulik Ladkani wrote:
> Hi Huang,
>
> On Tue, 14 Aug 2012 22:38:45 -0400 Huang Shijie <shij...@gmail.com> wrote:
> > diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
> > index 588e989..0ca7257 100644
> > --- a/drivers/mtd/nand/Kconfig
> > +++ b/drivers/mtd/nand/Kconfig
> > @@ -22,15 +22,6 @@ menuconfig MTD_NAND
> >  if MTD_NAND
> >
> > -config MTD_NAND_VERIFY_WRITE
> > -	bool "Verify NAND page writes"
> > -	help
> > -	  This adds an extra check when data is written to the flash. The
> > -	  NAND flash device internally checks only bits transitioning
> > -	  from 1 to 0. There is a rare possibility that even though the
> > -	  device thinks the write was successful, a bit could have been
> > -	  flipped accidentally due to device wear or something else.
> > -
>
> There are some defconfig files which set CONFIG_MTD_NAND_VERIFY_WRITE. I guess you should submit an accompanying patch that removes CONFIG_MTD_NAND_VERIFY_WRITE from all defconfig files.

Thanks a lot. I will send out a separate patch to fix it.

Huang Shijie

> (also, trimmed the CC list for this specific discussion, seems unrelated to all of the parties)
>
> Regards,
> Shmulik

__
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
RE: [PATCH 02/16] user_ns: use new hashtable implementation
> Yes hash_32 seems reasonable for the uid hash. With those long hash chains I wouldn't like to be on a machine with 10,000 processes with each with a different uid, and a process calling setuid in the fast path.
>
> The uid hash that we are playing with is one that I sort of wish that the hash table could grow in size, so that we could scale up better.

Since uids are likely to be allocated in dense blocks, maybe an unhashed multi-level lookup scheme might be appropriate.

Index an array with the low 8 (say) bits of the uid. Each item can be either:
1) NULL = free entry.
2) a pointer to a uid structure (check uid value).
3) a pointer to an array to index with the next 8 bits.
(2) and (3) can be differentiated by the low address bit.

I think that is updateable with cmpxchg.

Clearly this is a bad algorithm if uids are all multiples of 2^24, but that is true of any hash function.

David
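The multi-level scheme described above can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: the names `uid_entry`, `uid_table`, `uid_lookup` and `uid_insert` are hypothetical, the fan-out of 256 follows the "low 8 bits" suggestion, and the sketch is single-threaded, so the cmpxchg-based update mentioned in the message is left out.

```c
#include <stdint.h>
#include <stdlib.h>

#define UID_FANOUT 256
#define TABLE_TAG  ((uintptr_t)1)	/* low address bit set: slot points to a child table */

struct uid_entry {
	uint32_t uid;
};

/* Each slot is 0 (free), an untagged uid_entry pointer, or a tagged uid_table pointer. */
struct uid_table {
	uintptr_t slot[UID_FANOUT];
};

static struct uid_entry *uid_lookup(const struct uid_table *t, uint32_t uid)
{
	uint32_t key = uid;

	for (int level = 0; level < 4; level++, key >>= 8) {
		uintptr_t s = t->slot[key & 0xff];

		if (s == 0)
			return NULL;			/* case 1: free entry */
		if (!(s & TABLE_TAG)) {			/* case 2: entry, check the uid */
			struct uid_entry *e = (struct uid_entry *)s;

			return e->uid == uid ? e : NULL;
		}
		t = (const struct uid_table *)(s & ~TABLE_TAG);	/* case 3: descend */
	}
	return NULL;
}

/* Returns 0 on success, -1 if the uid already exists or allocation fails. */
static int uid_insert(struct uid_table *t, struct uid_entry *e)
{
	uint32_t key = e->uid;

	for (int level = 0; level < 4; level++, key >>= 8) {
		uintptr_t *slot = &t->slot[key & 0xff];

		if (*slot == 0) {
			*slot = (uintptr_t)e;
			return 0;
		}
		if (!(*slot & TABLE_TAG)) {
			struct uid_entry *old = (struct uid_entry *)*slot;
			struct uid_table *child;

			/* At level 3 all 32 bits are consumed, so a clash means same uid. */
			if (old->uid == e->uid || level == 3)
				return -1;

			/* Push the existing entry one level down, then keep descending. */
			child = calloc(1, sizeof(*child));
			if (!child)
				return -1;
			child->slot[(old->uid >> (8 * (level + 1))) & 0xff] = (uintptr_t)old;
			*slot = (uintptr_t)child | TABLE_TAG;
		}
		t = (struct uid_table *)(*slot & ~TABLE_TAG);
	}
	return -1;
}
```

Densely allocated uids mostly share their upper bytes, so they end up in a small number of second-level tables, which is the case the message is optimising for. A concurrent version would need to publish each slot change atomically, which is where cmpxchg would come in.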
Re: [PATCH] drivers/iio/adc/at91_adc.c: use devm_ functions
On 08/14/2012 10:32 PM, Jonathan Cameron wrote: Lars-Peter, Are you happy with this updated version? Can't immediately find any response from you to it. I think it is ok, you can add my Reviewed-by: Lars-Peter Clausen l...@metafoo.de. One minor nitpick though. Jonathan From: Julia Lawall julia.law...@lip6.fr The various devm_ functions allocate memory that is released when a driver detaches. This patch uses these functions for data that is allocated in the probe function of a platform device and is only freed in the remove function. The call to platform_get_resource(pdev, IORESOURCE_MEM, 0) is moved coser to the call to devm_request_and_ioremap, which is th first use of the result of platform_get_resource. This does not use devm_request_irq to ensure that free_irq is executed before its idev argument is freed. Signed-off-by: Julia Lawall julia.law...@lip6.fr --- drivers/iio/adc/at91_adc.c | 41 - 1 file changed, 8 insertions(+), 33 deletions(-) diff --git a/drivers/iio/adc/at91_adc.c b/drivers/iio/adc/at91_adc.c index f61780a..3506e3d 100644 --- a/drivers/iio/adc/at91_adc.c +++ b/drivers/iio/adc/at91_adc.c @@ -545,13 +545,6 @@ static int __devinit at91_adc_probe(struct platform_device *pdev) goto error_free_device; } -res = platform_get_resource(pdev, IORESOURCE_MEM, 0); -if (!res) { -dev_err(pdev-dev, No resource defined\n); -ret = -ENXIO; -goto error_ret; -} - platform_set_drvdata(pdev, idev); idev-dev.parent = pdev-dev; @@ -566,18 +559,13 @@ static int __devinit at91_adc_probe(struct platform_device *pdev) goto error_free_device; } -if (!request_mem_region(res-start, resource_size(res), -AT91 adc registers)) { -dev_err(pdev-dev, Resources are unavailable.\n); -ret = -EBUSY; -goto error_free_device; -} +res = platform_get_resource(pdev, IORESOURCE_MEM, 0); -st-reg_base = ioremap(res-start, resource_size(res)); +st-reg_base = devm_request_and_ioremap(pdev-dev, res); if (!st-reg_base) { dev_err(pdev-dev, Failed to map registers.\n); devm_request_and_ioremap will 
already print an error message on its own if something goes wrong. So strictly speaking this one is redundant, but I don't think it is necessary to do a resend just for this; maybe you can remove the extra dev_err when you apply the patch.
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote:
> On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote:
> > What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight.
>
> I liked your suggestion on barriers, as well.

I have not thought about this as deeply as I should, but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page->mapping.

-- 
Mel Gorman
SUSE Labs
Re: [ 10/65] ARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+
Hi Ben,

On Wed, Aug 15, 2012 at 05:29:26AM +0100, Ben Hutchings wrote:
> On Mon, 2012-08-13 at 15:13 -0700, Greg Kroah-Hartman wrote:
> > From: Greg KH <gre...@linuxfoundation.org>
> >
> > 3.4-stable review patch. If anyone has any objections, please let me know.
> >
> > --
> >
> > From: Will Deacon <will.dea...@arm.com>
> >
> > commit a76d7bd96d65fa5119adba97e1b58d95f2e78829 upstream.
> >
> > The open-coded mutex implementation for ARMv6+ cores suffers from a severe lack of barriers, so in the uncontended case we don't actually protect any accesses performed during the critical section.
> >
> > Furthermore, the code is largely a duplication of the ARMv6+ atomic_dec code but optimised to remove a branch instruction, as the mutex fastpath was previously inlined. Now that this is executed out-of-line, we can reuse the atomic access code for the locking (in fact, we use the xchg code as this produces shorter critical sections).
> >
> > This patch uses the generic xchg based implementation for mutexes on ARMv6+, which introduces barriers to the lock/unlock operations and also has the benefit of removing a fair amount of inline assembly code.
> [...]
>
> I understand that a further fix is needed on top of this http://article.gmane.org/gmane.linux.ports.arm.kernel/181693 but it's not in Linus's tree yet. Is it better to apply this on its own or to wait for the complete fix?

The additional patch should also be CC'd to stable and is sitting in -tip somewhere I believe, so it shouldn't be long before it does hit mainline.

Without this patch there's a memory-ordering bug (which we seem to have hit once in 5 years). With the patch there's a mutex lockup issue on SMP systems that I can provoke with enough hackbenching, so you may want to hold off for now.

Will
Re: [PATCH v2 07/11] mm: Allocate kernel pages to the right memcg
On 08/14/2012 07:16 PM, Mel Gorman wrote: On Thu, Aug 09, 2012 at 05:01:15PM +0400, Glauber Costa wrote: When a process tries to allocate a page with the __GFP_KMEMCG flag, the page allocator will call the corresponding memcg functions to validate the allocation. Tasks in the root memcg can always proceed. To avoid adding markers to the page - and a kmem flag that would necessarily follow, as much as doing page_cgroup lookups for no reason, As you already guessed, doing a page_cgroup in the page allocator free path would be a no-go. Specifically yes, but in general, you will be able to observe that I am taking all the possible measures to make sure existing paths are disturbed as little as possible. Thanks for your review here diff --git a/mm/page_alloc.c b/mm/page_alloc.c index b956cec..da341dc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2532,6 +2532,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, struct page *page = NULL; int migratetype = allocflags_to_migratetype(gfp_mask); unsigned int cpuset_mems_cookie; +void *handle = NULL; gfp_mask = gfp_allowed_mask; @@ -2543,6 +2544,13 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, return NULL; /* + * Will only have any effect when __GFP_KMEMCG is set. + * This is verified in the (always inline) callee + */ +if (!memcg_kmem_new_page(gfp_mask, handle, order)) memcg_kmem_new_page takes a void * parameter already but here you are passing in a void **. This probably happens to work because you do this struct mem_cgroup **handle = (struct mem_cgroup **)_handle; but that appears to defeat the purpose of having an opaque type as a handle. You have to treat it different then passing it into the commit function because it expects a void *. The motivation for an opaque type is completely unclear to me and how it is managed with a mix of void * and void ** is very confusing. okay. The opaque exists because I am doing speculative charging. 
I believe it to be a better and less complicated approach then letting a page appear and then charging it. Besides being consistent with the rest of memcg, it won't create unnecessary disturbance in the page allocator when the allocation is to fail. Now, tasks can move between memcgs, so we can't rely on grabbing it from current in commit_page, so we pass it around as a handle. Also, even if the task could not move, we already got it once from the task, and that is not for free. Better save it. Aside from the handle needed, the cost is more or less the same compared to doing it in one pass. All we do by using speculative charging is to split the cost in two, and doing it from two places. We'd have to charge + update page_cgroup anyway. As for the type, do you think using struct mem_cgroup would be less confusing? On a similar note I spotted #define memcg_kmem_on 1 . That is also different just for the sake of it. The convension is to do something like this /* This helps us to avoid #ifdef CONFIG_NUMA */ #ifdef CONFIG_NUMA #define NUMA_BUILD 1 #else #define NUMA_BUILD 0 #endif For simple defines, yes. But a later patch will turn this into a static branch test. memcg_kmem_on will be always 0 when compile-disabled, but when enable will expand to static_branch(...). memcg_kmem_on was difficult to guess based on its name. I thought initially that it would only be active if a memcg existed or at least something like mem_cgroup_disabled() but it's actually enabled if CONFIG_MEMCG_KMEM is set. For now. And I thought that adding the static branch in this patch would only confuse matters. The placeholder is there, but it is later patched to the final thing. With that explained, if you want me to change it to something else, I can do it. Should I ? I also find it *very* strange to have a function named as if it is an allocation-style function when it in fact it's looking up a mem_cgroup and charging it (and uncharging it in the error path if necessary). 
If it was called memcg_kmem_newpage_charge I might have found it a little better. I don't feel strongly about names in general. I can change it. Will update to memcg_kmem_newpage_charge() and memcg_kmem_page_uncharge(). This whole operation also looks very expensive (cgroup lookups, RCU locks taken etc) but I guess you're willing to take that cost in the same of isolating containers from each other. However, I strongly suggest that this overhead is measured in advance. It should not stop the series being merged as such but it should be understood because if the cost is high then this feature will be avoided like the plague. I am skeptical that distributions would enable this by default, at least not without support for cgroup_disable=kmem Enabling this feature will bring you nothing, therefore, no (or little) overhead. Nothing of this will be patched in until the first memcg gets kmem limited. The
Re: ext4fs error ext4_mb_generate_buddy:741:group 16, 8160 clusters in bitmap, 4064 in gd (with repro)
On Thu, 9 Aug 2012, Theodore Ts'o wrote:

> Date: Thu, 9 Aug 2012 13:06:40 -0400
> From: Theodore Ts'o <ty...@mit.edu>
> To: Lukas Czerner <lczer...@redhat.com>
> Cc: Paolo Bonzini <pbonz...@redhat.com>, Linux Kernel mailing List <linux-kernel@vger.kernel.org>, linux-e...@vger.kernel.org
> Subject: Re: ext4fs error ext4_mb_generate_buddy:741:group 16, 8160 clusters in bitmap, 4064 in gd (with repro)
>
> On Thu, Aug 09, 2012 at 12:00:09PM +0200, Paolo Bonzini wrote:
> > Here is how to reproduce it. It happens during fstrim. I found other occurrences of the error in the mailing list, but they were not related to trim so they may be something different.
> >
> >   modprobe scsi_debug dev_size_mb=256 lbpws=1
> >   dd if=/dev/zero of=/dev/sdb bs=1M
> >   fdisk /dev/sdb
> >     create a new partition accepting all defaults
> >   fdisk -lu /dev/sdb|tail -1
> >     should show: /dev/sdb1 57 524285 262114+ 83 Linux
> >   mkfs.ext4 /dev/sdb1
> >   mkdir test
> >   mount /dev/sdb1 test
> >   fstrim ./test
>
> I can confirm that this accurately reproduces file system corruption using a 3.5 kernel. It looks like some block allocation bitmap blocks is getting trimmed when it shouldn't have been.
>
> Lukas, can you take a look at this?
>
> - Ted

Hi Ted,

sorry for the delay, I've just got back from my vacation. I'll take a look at it.

Thanks!
-Lukas
Re: [PATCH] drivers/iio/adc/at91_adc.c: use devm_ functions
> devm_request_and_ioremap will already print a error messages on it's own if something goes wrong. So strictly speaking this one is redundant, but I don't think it is necessary to do a resend just for this, maybe you can remove the extra dev_err when you apply the patch.

Thanks for pointing that out. I will get rid of the messages in the future. That seems easier than figuring out how to adapt the message to the new function.

julia
[PATCH v4 1/1] HID:hid-multitouch: Add ELAN production request when resume.
Add ELAN production request when resume.

Some Elan legacy devices require SET_IDLE to be set on resume. It should be safe to send it to other devices too. Tested on 3M, Stantum, Cypress, Zytronic, eGalax, and Elan panels.

Signed-off-by: Scott Liu <scott@emc.com.tw>
Suggested-off-by: Benjamin Tissoires <benjamin.tissoi...@enac.fr>
---
 drivers/hid/hid-multitouch.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
index 59c8b5c..e824c37 100644
--- a/drivers/hid/hid-multitouch.c
+++ b/drivers/hid/hid-multitouch.c
@@ -767,6 +767,32 @@ static int mt_reset_resume(struct hid_device *hdev)
 	mt_set_input_mode(hdev);
 	return 0;
 }
+
+static int mt_resume(struct hid_device *hdev)
+{
+	struct usb_interface *intf;
+	struct usb_host_interface *interface;
+	struct usb_device *dev;
+
+	if (hdev->bus != BUS_USB)
+		return 0;
+
+	intf = to_usb_interface(hdev->dev.parent);
+	interface = intf->cur_altsetting;
+	dev = hid_to_usb_dev(hdev);
+
+	/* Some Elan legacy devices require SET_IDLE to be set on resume.
+	 * It should be safe to send it to other devices too.
+	 * Tested on 3M, Stantum, Cypress, Zytronic, eGalax, and Elan panels. */
+
+	usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
+			HID_REQ_SET_IDLE,
+			USB_TYPE_CLASS | USB_RECIP_INTERFACE,
+			0, interface->desc.bInterfaceNumber,
+			NULL, 0, USB_CTRL_SET_TIMEOUT);
+
+	return 0;
+}
 #endif
 
 static void mt_remove(struct hid_device *hdev)
@@ -1092,6 +1118,7 @@ static struct hid_driver mt_driver = {
 	.event = mt_event,
 #ifdef CONFIG_PM
 	.reset_resume = mt_reset_resume,
+	.resume = mt_resume,
 #endif
 };
-- 
1.7.9.5
Re: [PATCH v2 06/11] memcg: kmem controller infrastructure
On 08/14/2012 10:58 PM, Greg Thelen wrote: On Mon, Aug 13 2012, Glauber Costa wrote: + WARN_ON(mem_cgroup_is_root(memcg)); + size = (1 order) PAGE_SHIFT; + memcg_uncharge_kmem(memcg, size); + mem_cgroup_put(memcg); Why do we need ref-counting here ? kmem res_counter cannot work as reference ? This is of course the pair of the mem_cgroup_get() you commented on earlier. If we need one, we need the other. If we don't need one, we don't need the other =) The guarantee we're trying to give here is that the memcg structure will stay around while there are dangling charges to kmem, that we decided not to move (remember: moving it for the stack is simple, for the slab is very complicated and ill-defined, and I believe it is better to treat all kmem equally here) By keeping memcg structures hanging around until the last referring kmem page is uncharged do such zombie memcg each consume a css_id and thus put pressure on the 64k css_id space? I imagine in pathological cases this would prevent creation of new cgroups until these zombies are dereferenced. Yes, but although this patch makes it more likely, it doesn't introduce that. If the tasks, for instance, grab a reference to the cgroup dentry in the filesystem (like their CWD, etc), they will also keep the cgroup around. Is there any way to see how much kmem such zombie memcg are consuming? I think we could find these with for_each_mem_cgroup_tree(root_mem_cgroup). Yes, just need an interface for that. But I think it is something that can be addressed orthogonaly to this work, in a separate patch, not as some fundamental limitation. Basically, I'm wanting to know where kernel memory has been allocated. For live memcg, an admin can cat memory.kmem.usage_in_bytes. But for zombie memcg, I'm not sure how to get this info. It looks like the root_mem_cgroup memory.kmem.usage_in_bytes is not hierarchically charged. Not sure what you mean by not being hierarchically charged. It should be, when use_hierarchy = 1. 
As a matter of fact, I just tested it, and I do see kmem being charged all the way to the root cgroup when hierarchy is used. (We just can't limit it there.)
Re: [PATCH v2 07/11] mm: Allocate kernel pages to the right memcg
On Thu 09-08-12 17:01:15, Glauber Costa wrote:
[...]
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b956cec..da341dc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2532,6 +2532,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>  	struct page *page = NULL;
>  	int migratetype = allocflags_to_migratetype(gfp_mask);
>  	unsigned int cpuset_mems_cookie;
> +	void *handle = NULL;
>
>  	gfp_mask &= gfp_allowed_mask;
>
> @@ -2543,6 +2544,13 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>  		return NULL;
>
>  	/*
> +	 * Will only have any effect when __GFP_KMEMCG is set.
> +	 * This is verified in the (always inline) callee
> +	 */
> +	if (!memcg_kmem_new_page(gfp_mask, &handle, order))
> +		return NULL;

When the previous patch introduced this function I thought the handle obfuscation was there to prevent struct mem_cgroup from spreading inside the page allocator, but memcg_kmem_commit_page uses the type directly. So why that obfuscation? Even handle as a name sounds unnecessarily confusing. I would go with struct mem_cgroup **memcgp or even return the pointer on success or NULL otherwise.

[...]
> +EXPORT_SYMBOL(__free_accounted_pages);

Why exported?

Btw. this is called from call_rcu context but it itself calls call_rcu down the chain in mem_cgroup_put. Is it safe?

[...]
> +EXPORT_SYMBOL(free_accounted_pages);

here again

-- 
Michal Hocko
SUSE Labs
[patch 0/8] procfs, fdinfo updated
Hi guys,

here is an updated series. As discussed with Al, the fdinfo helper is now provided via file_operations. I've also dropped the CONFIG_CHECKPOINT_RESTORE wrap from inside the particular subsystems, so this new feature will be available by default.

I've tested the whole series but additional review would be appreciated.

Please tell me what you think.

Cyrill
[patch 6/8] fs, eventfd: Add procfs fdinfo helper
This allows us to print out the raw counter value. The /proc/pid/fdinfo/fd output is

 | pos:	0
 | flags:	04002
 | eventfd-count:               5a

Signed-off-by: Cyrill Gorcunov <gorcu...@openvz.org>
CC: Pavel Emelyanov <xe...@parallels.com>
CC: Al Viro <v...@zeniv.linux.org.uk>
CC: Alexey Dobriyan <adobri...@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: James Bottomley <jbottom...@parallels.com>
---
 fs/eventfd.c | 20
 1 file changed, 20 insertions(+)

Index: linux-2.6.git/fs/eventfd.c
===================================================================
--- linux-2.6.git.orig/fs/eventfd.c
+++ linux-2.6.git/fs/eventfd.c
@@ -19,6 +19,8 @@
 #include <linux/export.h>
 #include <linux/kref.h>
 #include <linux/eventfd.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
 
 struct eventfd_ctx {
 	struct kref kref;
@@ -284,7 +286,25 @@ static ssize_t eventfd_write(struct file
 	return res;
 }
 
+#ifdef CONFIG_PROC_FS
+static int eventfd_show_fdinfo(struct seq_file *m, struct file *f)
+{
+	struct eventfd_ctx *ctx = f->private_data;
+	int ret;
+
+	spin_lock_irq(&ctx->wqh.lock);
+	ret = seq_printf(m, "eventfd-count: %16llx\n",
+			 (unsigned long long)ctx->count);
+	spin_unlock_irq(&ctx->wqh.lock);
+
+	return ret;
+}
+#endif
+
 static const struct file_operations eventfd_fops = {
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= eventfd_show_fdinfo,
+#endif
 	.release	= eventfd_release,
 	.poll		= eventfd_poll,
 	.read		= eventfd_read,
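For readers who want to consume this output from user space, the following is a minimal sketch of a parser for the `eventfd-count:` line. It assumes only the format shown in the changelog above (a hexadecimal value after the key, possibly padded with spaces); the function name `parse_eventfd_count` is hypothetical, and the caller is expected to have already read the contents of /proc/pid/fdinfo/fd into a NUL-terminated buffer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "eventfd-count:" in the text of an fdinfo
 * file and parse the hexadecimal counter that follows it.
 *
 * Returns 0 and stores the value in *count on success, -1 if the key
 * is missing or no hex digits follow it.
 */
static int parse_eventfd_count(const char *fdinfo, unsigned long long *count)
{
	static const char key[] = "eventfd-count:";
	const char *p = strstr(fdinfo, key);
	char *end;

	if (!p)
		return -1;

	p += sizeof(key) - 1;			/* skip past the key itself */
	*count = strtoull(p, &end, 16);		/* strtoull skips the padding spaces */

	return end == p ? -1 : 0;
}
```

Since the counter is printed in hex, the sample value `5a` from the changelog corresponds to a count of 90.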
[patch 5/8] fs, notify: Add procfs fdinfo helper v3
This allow us to print out fsnotify details such as watchee inode, device, mask and file handle. For example for inotify objects the output is | pos: 0 | flags: 0200 | inotify wd:3 ino: 9e7e sdev: 800013 mask: 800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle: 7e9e640d1b6d | inotify wd:2 ino: a111 sdev: 800013 mask: 800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle: 11a120542153 | inotify wd:1 ino:6b149 sdev: 800013 mask: 800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle: 49b1060023552153 For fanotify it is like | pos: 0 | flags: 02 | fanotify ino:68f71 sdev: 800013 mask:1 ignored_mask: 4000 fhandle-bytes:8 fhandle-type:1 f_handle: 718f0600b9f42053 | fanotify mnt_id: 13 mask:1 ignored_mask: 4000 To minimize impact on general fsnotify code the new functionality is gathered in fs/notify/fdinfo.c file. v2: - append missing colons to terms v3: - continue from pervious position in list on -next Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org CC: Pavel Emelyanov xe...@parallels.com CC: Al Viro v...@zeniv.linux.org.uk CC: Alexey Dobriyan adobri...@gmail.com CC: Andrew Morton a...@linux-foundation.org CC: James Bottomley jbottom...@parallels.com --- fs/notify/Makefile |2 fs/notify/fanotify/fanotify_user.c |4 fs/notify/fdinfo.c | 167 + fs/notify/fdinfo.h | 22 fs/notify/inotify/inotify_user.c |4 5 files changed, 198 insertions(+), 1 deletion(-) Index: linux-2.6.git/fs/notify/Makefile === --- linux-2.6.git.orig/fs/notify/Makefile +++ linux-2.6.git/fs/notify/Makefile @@ -1,5 +1,5 @@ obj-$(CONFIG_FSNOTIFY) += fsnotify.o notification.o group.o inode_mark.o \ - mark.o vfsmount_mark.o + mark.o vfsmount_mark.o fdinfo.o obj-y += dnotify/ obj-y += inotify/ Index: linux-2.6.git/fs/notify/fanotify/fanotify_user.c === --- linux-2.6.git.orig/fs/notify/fanotify/fanotify_user.c +++ linux-2.6.git/fs/notify/fanotify/fanotify_user.c @@ -17,6 +17,7 @@ #include asm/ioctls.h #include ../../mount.h +#include ../fdinfo.h #define 
FANOTIFY_DEFAULT_MAX_EVENTS16384 #define FANOTIFY_DEFAULT_MAX_MARKS 8192 @@ -446,6 +447,9 @@ static long fanotify_ioctl(struct file * } static const struct file_operations fanotify_fops = { +#ifdef CONFIG_PROC_FS + .show_fdinfo= fanotify_show_fdinfo, +#endif .poll = fanotify_poll, .read = fanotify_read, .write = fanotify_write, Index: linux-2.6.git/fs/notify/fdinfo.c === --- /dev/null +++ linux-2.6.git/fs/notify/fdinfo.c @@ -0,0 +1,167 @@ +#include linux/file.h +#include linux/fs.h +#include linux/fsnotify_backend.h +#include linux/idr.h +#include linux/init.h +#include linux/inotify.h +#include linux/kernel.h +#include linux/namei.h +#include linux/sched.h +#include linux/types.h +#include linux/seq_file.h +#include linux/exportfs.h +#include linux/proc_fs.h + +#include inotify/inotify.h +#include ../fs/mount.h + +struct inode_file_handle { + struct file_handle h; + struct fid fid; +} __packed; + +#if defined(CONFIG_PROC_FS) + +#if defined(CONFIG_INOTIFY_USER) || defined(CONFIG_FANOTIFY) + +#ifdef CONFIG_EXPORTFS +static int inotify_encode_target(struct inode *inode, struct inode_file_handle *fhandle) +{ + int ret, size; + + size = sizeof(fhandle-fid) 2; + ret = export_encode_inode_fh(inode, fhandle-fid, size); + BUG_ON(ret != FILEID_INO32_GEN); + + fhandle-h.handle_type = FILEID_INO32_GEN; + fhandle-h.handle_bytes = size * sizeof(u32); + + return 0; +} +#else +static int inotify_encode_target(struct inode *inode, struct inode_file_handle *fhandle) +{ + fhandle-h.handle_type = FILEID_ROOT; + fhandle-h.handle_bytes = 0; + return 0; +} +#endif /* CONFIG_EXPORTFS */ + +static int show_fdinfo(struct seq_file *m, struct file *f, + int (*show)(struct seq_file *m, struct fsnotify_mark *mark)) +{ + struct fsnotify_group *group = f-private_data; + struct fsnotify_mark *mark; + int ret = 0; + + spin_lock(group-mark_lock); + list_for_each_entry(mark, group-marks_list, g_list) { + ret = show(m, mark); + if (ret) + break; + } + spin_unlock(group-mark_lock); + return ret; +} + 
+#ifdef CONFIG_INOTIFY_USER + +static int
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Wed, Aug 15, 2012 at 10:05:28AM +0100, Mel Gorman wrote:
> On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote:
> > On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote:
> > > What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight.
> >
> > I liked your suggestion on barriers, as well.
>
> I have not thought about this as deeply as I should but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page->mapping.

To clarify, are you concerned about the cost of rcu_read_lock for non-balloon pages?

> --
> Mel Gorman
> SUSE Labs
[patch 8/8] fdinfo: Show sigmask for signalfd fd v2
Signed-off-by: Pavel Emelyanov xe...@parallels.com
Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
---
 fs/proc/array.c         |    2 +-
 fs/signalfd.c           |   26 ++
 include/linux/proc_fs.h |    3 +++
 3 files changed, 30 insertions(+), 1 deletion(-)

Index: linux-2.6.git/fs/proc/array.c
===
--- linux-2.6.git.orig/fs/proc/array.c
+++ linux-2.6.git/fs/proc/array.c
@@ -220,7 +220,7 @@ static inline void task_state(struct seq
 	seq_putc(m, '\n');
 }
 
-static void render_sigset_t(struct seq_file *m, const char *header,
+void render_sigset_t(struct seq_file *m, const char *header,
 				sigset_t *set)
 {
 	int i;
Index: linux-2.6.git/fs/signalfd.c
===
--- linux-2.6.git.orig/fs/signalfd.c
+++ linux-2.6.git/fs/signalfd.c
@@ -29,6 +29,7 @@
 #include <linux/anon_inodes.h>
 #include <linux/signalfd.h>
 #include <linux/syscalls.h>
+#include <linux/proc_fs.h>
 
 void signalfd_cleanup(struct sighand_struct *sighand)
 {
@@ -46,6 +47,7 @@ void signalfd_cleanup(struct sighand_str
 }
 
 struct signalfd_ctx {
+	seqcount_t cnt;
 	sigset_t sigmask;
 };
 
@@ -227,7 +229,28 @@ static ssize_t signalfd_read(struct file
 	return total ? total: ret;
 }
 
+#ifdef CONFIG_PROC_FS
+static int signalfd_show_fdinfo(struct seq_file *m, struct file *f)
+{
+	struct signalfd_ctx *ctx = f->private_data;
+	sigset_t sigmask;
+	unsigned seq;
+
+	do {
+		seq = read_seqcount_begin(&ctx->cnt);
+		sigmask = ctx->sigmask;
+	} while (read_seqcount_retry(&ctx->cnt, seq));
+
+	signotset(&sigmask);
+	render_sigset_t(m, "sigmask:\t", &sigmask);
+	return 0;
+}
+#endif
+
 static const struct file_operations signalfd_fops = {
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= signalfd_show_fdinfo,
+#endif
 	.release	= signalfd_release,
 	.poll		= signalfd_poll,
 	.read		= signalfd_read,
@@ -259,6 +282,7 @@ SYSCALL_DEFINE4(signalfd4, int, ufd, sig
 			return -ENOMEM;
 
 		ctx->sigmask = sigmask;
+		seqcount_init(&ctx->cnt);
 
 		/*
 		 * When we call this, the initialization must be complete, since
@@ -279,7 +303,9 @@ SYSCALL_DEFINE4(signalfd4, int, ufd, sig
 			return -EINVAL;
 		}
 		spin_lock_irq(&current->sighand->siglock);
+		write_seqcount_begin(&ctx->cnt);
 		ctx->sigmask = sigmask;
+		write_seqcount_end(&ctx->cnt);
 		spin_unlock_irq(&current->sighand->siglock);
 
 		wake_up(&current->sighand->signalfd_wqh);
Index: linux-2.6.git/include/linux/proc_fs.h
===
--- linux-2.6.git.orig/include/linux/proc_fs.h
+++ linux-2.6.git/include/linux/proc_fs.h
@@ -290,4 +290,7 @@ static inline struct net *PDE_NET(struct
 	return pde->parent->data;
 }
 
+#include <asm/signal.h>
+
+void render_sigset_t(struct seq_file *m, const char *header, sigset_t *set);
 #endif /* _LINUX_PROC_FS_H */
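The read side of the patch above retries until it sees a stable sequence number. A rough userspace analogue of that retry loop, using C11 atomics, might look like the following. It is a sketch under assumptions, not the kernel's seqcount implementation: struct seq_mask and both helpers are hypothetical names, the payload is a two-word stand-in for sigset_t, and default sequentially consistent atomics are used for simplicity where the kernel uses lighter barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for seqcount_t plus its protected data. */
struct seq_mask {
	atomic_uint seq;		/* odd while a write is in progress */
	_Atomic uint32_t word[2];	/* two-word payload standing in for sigset_t */
};

/* Writers must already be serialized (the patch holds siglock for this). */
static void seq_mask_write(struct seq_mask *s, uint32_t lo, uint32_t hi)
{
	atomic_fetch_add(&s->seq, 1);	/* now odd: readers will retry */
	atomic_store(&s->word[0], lo);
	atomic_store(&s->word[1], hi);
	atomic_fetch_add(&s->seq, 1);	/* even again: payload is stable */
}

/* Lock-free reader: loops until both words come from the same write. */
static void seq_mask_read(struct seq_mask *s, uint32_t *lo, uint32_t *hi)
{
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		*lo = atomic_load(&s->word[0]);
		*hi = atomic_load(&s->word[1]);
		/* Retry if a write was in progress or finished meanwhile. */
	} while ((start & 1) || atomic_load(&s->seq) != start);
}
```

The design choice mirrors the patch: the writer pays two counter increments, and the fdinfo reader never blocks the signalfd path, it only retries.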
[patch 3/8] procfs: Add ability to plug in auxiliary fdinfo providers
This patch brings ability to print out auxiliary data associated with
file in procfs interface /proc/pid/fdinfo/fd.

In particular further patches make eventfd, eventpoll, signalfd and
fsnotify to print additional information complete enough to restore
these objects after checkpoint.

To simplify the code we add show_fdinfo callback inside
struct file_operations (as Al and Pavel proposed).

Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
CC: Pavel Emelyanov xe...@parallels.com
CC: Al Viro v...@zeniv.linux.org.uk
CC: Alexey Dobriyan adobri...@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: James Bottomley jbottom...@parallels.com
---
 fs/proc/fd.c       |   51 ---
 include/linux/fs.h |    3 +++
 2 files changed, 39 insertions(+), 15 deletions(-)

Index: linux-2.6.git/fs/proc/fd.c
===
--- linux-2.6.git.orig/fs/proc/fd.c
+++ linux-2.6.git/fs/proc/fd.c
@@ -15,11 +15,11 @@
 #include "fd.h"
 
 struct proc_fdinfo {
-	loff_t	f_pos;
-	int	f_flags;
+	struct file	*f_file;
+	int		f_flags;
 };
 
-static int fdinfo_open_helper(struct inode *inode, int *f_flags, struct path *path)
+static int fdinfo_open_helper(struct inode *inode, int *f_flags, struct file **f_file, struct path *path)
 {
 	struct files_struct *files = NULL;
 	struct task_struct *task;
@@ -49,6 +49,10 @@ static int fdinfo_open_helper(struct ino
 				*path = fd_file->f_path;
 				path_get(&fd_file->f_path);
 			}
+			if (f_file) {
+				*f_file = fd_file;
+				get_file(fd_file);
+			}
 			ret = 0;
 		}
 		spin_unlock(&files->file_lock);
@@ -61,28 +65,44 @@ static int fdinfo_open_helper(struct ino
 static int seq_show(struct seq_file *m, void *v)
 {
 	struct proc_fdinfo *fdinfo = m->private;
-	seq_printf(m, "pos:\t%lli\nflags:\t0%o\n",
-		   (long long)fdinfo->f_pos,
-		   fdinfo->f_flags);
-	return 0;
+	int ret;
+
+	ret = seq_printf(m, "pos:\t%lli\nflags:\t0%o\n",
+			 (long long)fdinfo->f_file->f_pos,
+			 fdinfo->f_flags);
+
+	if (!ret && fdinfo->f_file->f_op->show_fdinfo)
+		ret = fdinfo->f_file->f_op->show_fdinfo(m, fdinfo->f_file);
+
+	return ret;
 }
 
 static int seq_fdinfo_open(struct inode *inode, struct file *file)
 {
-	struct proc_fdinfo *fdinfo = NULL;
-	int ret = -ENOENT;
+	struct proc_fdinfo *fdinfo;
+	struct seq_file *m;
+	int ret;
 
 	fdinfo = kzalloc(sizeof(*fdinfo), GFP_KERNEL);
 	if (!fdinfo)
 		return -ENOMEM;
 
-	ret = fdinfo_open_helper(inode, &fdinfo->f_flags, NULL);
-	if (!ret) {
-		ret = single_open(file, seq_show, fdinfo);
-		if (!ret)
-			fdinfo = NULL;
+	ret = fdinfo_open_helper(inode, &fdinfo->f_flags, &fdinfo->f_file, NULL);
+	if (ret)
+		goto err_free;
+
+	ret = single_open(file, seq_show, fdinfo);
+	if (ret) {
+		put_filp(fdinfo->f_file);
+		goto err_free;
 	}
 
+	m = file->private_data;
+	m->private = fdinfo;
+
+	return ret;
+
+err_free:
 	kfree(fdinfo);
 	return ret;
 }
@@ -92,6 +112,7 @@ static int seq_fdinfo_release(struct ino
 	struct seq_file *m = file->private_data;
 	struct proc_fdinfo *fdinfo = m->private;
 
+	put_filp(fdinfo->f_file);
 	kfree(fdinfo);
 
 	return single_release(inode, file);
@@ -173,7 +194,7 @@ static const struct dentry_operations ti
 
 static int proc_fd_link(struct dentry *dentry, struct path *path)
 {
-	return fdinfo_open_helper(dentry->d_inode, NULL, path);
+	return fdinfo_open_helper(dentry->d_inode, NULL, NULL, path);
 }
 
 static struct dentry *
Index: linux-2.6.git/include/linux/fs.h
===
--- linux-2.6.git.orig/include/linux/fs.h
+++ linux-2.6.git/include/linux/fs.h
@@ -1775,6 +1775,8 @@ struct block_device_operations;
 #define HAVE_COMPAT_IOCTL 1
 #define HAVE_UNLOCKED_IOCTL 1
 
+struct seq_file;
+
 struct file_operations {
 	struct module *owner;
 	loff_t (*llseek) (struct file *, loff_t, int);
@@ -1803,6 +1805,7 @@ struct file_operations {
 	int (*setlease)(struct file *, long, struct file_lock **);
 	long (*fallocate)(struct file *file, int mode, loff_t offset,
 			  loff_t len);
+	int (*show_fdinfo)(struct seq_file *m, struct file *f);
 };
 
 struct inode_operations {
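The shape of the change above is an optional callback in an operations table: the generic code always prints the common fields, then defers to the provider only if one is registered. A small userspace sketch of that pattern follows. All fake_* names and the counter provider are hypothetical, and a stdio FILE stands in for struct seq_file; this is not the procfs code itself.

```c
#include <stdio.h>
#include <stddef.h>

struct fake_file;

/* Hypothetical miniature of an ops table with one optional hook. */
struct fake_file_ops {
	int (*show_fdinfo)(FILE *out, const struct fake_file *f);
};

struct fake_file {
	long long pos;
	unsigned int flags;
	const struct fake_file_ops *ops;
	void *private_data;
};

/* Prints the common fields, then calls the provider if one is set. */
static int fake_fdinfo_show(FILE *out, const struct fake_file *f)
{
	if (fprintf(out, "pos:\t%lli\nflags:\t0%o\n", f->pos, f->flags) < 0)
		return -1;
	if (f->ops && f->ops->show_fdinfo)
		return f->ops->show_fdinfo(out, f);
	return 0;
}

/* Example provider: emits one extra line taken from its private data. */
static int counter_show_fdinfo(FILE *out, const struct fake_file *f)
{
	const unsigned long long *count = f->private_data;

	return fprintf(out, "count:\t%llu\n", *count) < 0 ? -1 : 0;
}
```

Files whose ops table leaves the hook NULL keep the old two-line output, which is why existing users of fdinfo are unaffected by the extension.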
[patch 2/8] procfs: Convert /proc/pid/fdinfo/ handling routines to seq-file
This patch converts /proc/pid/fdinfo/ handling routines to seq-file which
is needed to extend seq operations and plug in auxiliary fdinfo providers
from subsystems like eventfd/eventpoll/fsnotify.

Note that proc_fd_link no longer calls proc_fd_info, simply because
proc_fd_info is converted to seq_fdinfo_open (which is seq-file open()
prototype).

Also, to eliminate code duplication (and Pavel's concerns) the
fdinfo_open_helper function is introduced, which is used in both
seq_fdinfo_open and proc_fd_link.

Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
Acked-by: Pavel Emelyanov xe...@parallels.com
CC: Al Viro v...@zeniv.linux.org.uk
CC: Alexey Dobriyan adobri...@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: James Bottomley jbottom...@parallels.com
---
 fs/proc/fd.c |  123 +++
 1 file changed, 75 insertions(+), 48 deletions(-)

Index: linux-2.6.git/fs/proc/fd.c
===
--- linux-2.6.git.orig/fs/proc/fd.c
+++ linux-2.6.git/fs/proc/fd.c
@@ -6,61 +6,104 @@
 #include <linux/namei.h>
 #include <linux/pid.h>
 #include <linux/security.h>
+#include <linux/file.h>
+#include <linux/seq_file.h>
 
 #include <linux/proc_fs.h>
 
 #include "internal.h"
 #include "fd.h"
 
-#define PROC_FDINFO_MAX 64
+struct proc_fdinfo {
+	loff_t	f_pos;
+	int	f_flags;
+};
 
-static int proc_fd_info(struct inode *inode, struct path *path, char *info)
+static int fdinfo_open_helper(struct inode *inode, int *f_flags, struct path *path)
 {
-	struct task_struct *task = get_proc_task(inode);
 	struct files_struct *files = NULL;
-	int fd = proc_fd(inode);
-	struct file *file;
+	struct task_struct *task;
+	int ret = -ENOENT;
 
+	task = get_proc_task(inode);
 	if (task) {
 		files = get_files_struct(task);
 		put_task_struct(task);
 	}
+
 	if (files) {
-		/*
-		 * We are not taking a ref to the file structure, so we must
-		 * hold ->file_lock.
-		 */
-		spin_lock(&files->file_lock);
-		file = fcheck_files(files, fd);
-		if (file) {
-			unsigned int f_flags;
-			struct fdtable *fdt;
-
-			fdt = files_fdtable(files);
-			f_flags = file->f_flags & ~O_CLOEXEC;
-			if (close_on_exec(fd, fdt))
-				f_flags |= O_CLOEXEC;
+		int fd = proc_fd(inode);
+		struct file *fd_file;
 
+		spin_lock(&files->file_lock);
+		fd_file = fcheck_files(files, fd);
+		if (fd_file) {
+			if (f_flags) {
+				struct fdtable *fdt = files_fdtable(files);
+
+				*f_flags = fd_file->f_flags & ~O_CLOEXEC;
+				if (close_on_exec(fd, fdt))
+					*f_flags |= O_CLOEXEC;
+			}
 			if (path) {
-				*path = file->f_path;
-				path_get(&file->f_path);
+				*path = fd_file->f_path;
+				path_get(&fd_file->f_path);
 			}
-			if (info)
-				snprintf(info, PROC_FDINFO_MAX,
-					 "pos:\t%lli\n"
-					 "flags:\t0%o\n",
-					 (long long) file->f_pos,
-					 f_flags);
-			spin_unlock(&files->file_lock);
-			put_files_struct(files);
-			return 0;
+			ret = 0;
 		}
 		spin_unlock(&files->file_lock);
 		put_files_struct(files);
 	}
-	return -ENOENT;
+
+	return ret;
 }
 
+static int seq_show(struct seq_file *m, void *v)
+{
+	struct proc_fdinfo *fdinfo = m->private;
+	seq_printf(m, "pos:\t%lli\nflags:\t0%o\n",
+		   (long long)fdinfo->f_pos,
+		   fdinfo->f_flags);
+	return 0;
+}
+
+static int seq_fdinfo_open(struct inode *inode, struct file *file)
+{
+	struct proc_fdinfo *fdinfo = NULL;
+	int ret = -ENOENT;
+
+	fdinfo = kzalloc(sizeof(*fdinfo), GFP_KERNEL);
+	if (!fdinfo)
+		return -ENOMEM;
+
+	ret = fdinfo_open_helper(inode, &fdinfo->f_flags, NULL);
+	if (!ret) {
+		ret = single_open(file, seq_show, fdinfo);
+		if (!ret)
+			fdinfo = NULL;
+	}
+
+	kfree(fdinfo);
+	return ret;
+}
+
+static int seq_fdinfo_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *m = file->private_data;
+	struct proc_fdinfo *fdinfo = m->private;
+
+	kfree(fdinfo);
[patch 7/8] fs, epoll: Add procfs fdinfo helper v2
This allows us to print out eventpoll target file descriptor,
events and data, the /proc/pid/fdinfo/fd consists of

 | pos:	0
 | flags:	02
 | tfd:        5 events:       1d data:

This feature is CONFIG_CHECKPOINT_RESTORE only.

Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
CC: Pavel Emelyanov xe...@parallels.com
CC: Al Viro v...@zeniv.linux.org.uk
CC: Alexey Dobriyan adobri...@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: James Bottomley jbottom...@parallels.com
CC: Matthew Helsley matt.hels...@gmail.com
---
 fs/eventpoll.c |   28
 1 file changed, 28 insertions(+)

Index: linux-2.6.git/fs/eventpoll.c
===
--- linux-2.6.git.orig/fs/eventpoll.c
+++ linux-2.6.git/fs/eventpoll.c
@@ -38,6 +38,8 @@
 #include <asm/io.h>
 #include <asm/mman.h>
 #include <linux/atomic.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
 
 /*
  * LOCKING:
@@ -783,8 +785,34 @@ static unsigned int ep_eventpoll_poll(st
 	return pollflags != -1 ? pollflags : 0;
 }
 
+#ifdef CONFIG_PROC_FS
+static int ep_show_fdinfo(struct seq_file *m, struct file *f)
+{
+	struct eventpoll *ep = f->private_data;
+	struct rb_node *rbp;
+	int ret;
+
+	mutex_lock(&ep->mtx);
+	for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp)) {
+		struct epitem *epi = rb_entry(rbp, struct epitem, rbn);
+
+		ret = seq_printf(m, "tfd: %8d events: %8x data: %16llx\n",
+				 epi->ffd.fd, epi->event.events,
+				 (long long)epi->event.data);
+		if (ret)
+			break;
+	}
+	mutex_unlock(&ep->mtx);
+
+	return ret;
+}
+#endif
+
 /* File callbacks that implement the eventpoll file behaviour */
 static const struct file_operations eventpoll_fops = {
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= ep_show_fdinfo,
+#endif
 	.release	= ep_eventpoll_release,
 	.poll		= ep_eventpoll_poll,
 	.llseek		= noop_llseek,
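On the consumer side, a checkpoint tool would read these lines back. A minimal sketch of parsing one such record, assuming the "tfd: ... events: ... data: ..." layout printed by the seq_printf() call above, could look like this. The struct and function names are hypothetical and nothing here is part of the patch.

```c
#include <stdio.h>

/* Hypothetical holder for one parsed "tfd:" record. */
struct epoll_fdinfo_entry {
	int tfd;			/* target file descriptor */
	unsigned int events;		/* event mask, printed in hex */
	unsigned long long data;	/* user data, printed in hex */
};

/* Returns 0 on success, -1 if the line is not a complete "tfd:" record. */
static int parse_epoll_fdinfo_line(const char *line,
				   struct epoll_fdinfo_entry *e)
{
	int n = sscanf(line, "tfd: %d events: %x data: %llx",
		       &e->tfd, &e->events, &e->data);

	return n == 3 ? 0 : -1;
}
```

Because whitespace in a scanf format matches any run of blanks, the parser does not depend on the exact field widths used by the kernel side.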
[patch 1/8] procfs: Move /proc/pid/fd[info] handling code to fd.[ch]
This patch prepares the ground for further extension of /proc/pid/fd[info] handling code by moving fdinfo handling code into fs/proc/fd.c. I think such move makes both fs/proc/base.c and fs/proc/fd.c easier to read. Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org Acked-by: Pavel Emelyanov xe...@parallels.com CC: Al Viro v...@zeniv.linux.org.uk CC: Alexey Dobriyan adobri...@gmail.com CC: Andrew Morton a...@linux-foundation.org CC: James Bottomley jbottom...@parallels.com --- fs/proc/Makefile |2 fs/proc/base.c | 388 - fs/proc/fd.c | 351 +++ fs/proc/fd.h | 14 + fs/proc/internal.h | 48 ++ 5 files changed, 416 insertions(+), 387 deletions(-) Index: linux-2.6.git/fs/proc/Makefile === --- linux-2.6.git.orig/fs/proc/Makefile +++ linux-2.6.git/fs/proc/Makefile @@ -8,7 +8,7 @@ proc-y := nommu.o task_nommu.o proc-$(CONFIG_MMU) := mmu.o task_mmu.o proc-y += inode.o root.o base.o generic.o array.o \ - proc_tty.o + proc_tty.o fd.o proc-y += cmdline.o proc-y += consoles.o proc-y += cpuinfo.o Index: linux-2.6.git/fs/proc/base.c === --- linux-2.6.git.orig/fs/proc/base.c +++ linux-2.6.git/fs/proc/base.c @@ -90,6 +90,7 @@ #endif #include trace/events/oom.h #include internal.h +#include fd.h /* NOTE: * Implementing inode permission operations in /proc is almost @@ -136,8 +137,6 @@ struct pid_entry { NULL, proc_single_file_operations, \ { .proc_show = show } ) -static int proc_fd_permission(struct inode *inode, int mask); - /* * Count the number of hardlinks for the pid_entry table, excluding the . * and .. links. 
@@ -1492,7 +1491,7 @@ out: return error; } -static const struct inode_operations proc_pid_link_inode_operations = { +const struct inode_operations proc_pid_link_inode_operations = { .readlink = proc_pid_readlink, .follow_link= proc_pid_follow_link, .setattr= proc_setattr, @@ -1501,21 +1500,6 @@ static const struct inode_operations pro /* building an inode */ -static int task_dumpable(struct task_struct *task) -{ - int dumpable = 0; - struct mm_struct *mm; - - task_lock(task); - mm = task-mm; - if (mm) - dumpable = get_dumpable(mm); - task_unlock(task); - if(dumpable == 1) - return 1; - return 0; -} - struct inode *proc_pid_make_inode(struct super_block * sb, struct task_struct *task) { struct inode * inode; @@ -1641,15 +1625,6 @@ int pid_revalidate(struct dentry *dentry return 0; } -static int pid_delete_dentry(const struct dentry * dentry) -{ - /* Is the task we represent dead? -* If so, then don't put the dentry on the lru list, -* kill it immediately. -*/ - return !proc_pid(dentry-d_inode)-tasks[PIDTYPE_PID].first; -} - const struct dentry_operations pid_dentry_operations = { .d_revalidate = pid_revalidate, @@ -1712,289 +1687,6 @@ end_instantiate: return filldir(dirent, name, len, filp-f_pos, ino, type); } -static unsigned name_to_int(struct dentry *dentry) -{ - const char *name = dentry-d_name.name; - int len = dentry-d_name.len; - unsigned n = 0; - - if (len 1 *name == '0') - goto out; - while (len-- 0) { - unsigned c = *name++ - '0'; - if (c 9) - goto out; - if (n = (~0U-9)/10) - goto out; - n *= 10; - n += c; - } - return n; -out: - return ~0U; -} - -#define PROC_FDINFO_MAX 64 - -static int proc_fd_info(struct inode *inode, struct path *path, char *info) -{ - struct task_struct *task = get_proc_task(inode); - struct files_struct *files = NULL; - struct file *file; - int fd = proc_fd(inode); - - if (task) { - files = get_files_struct(task); - put_task_struct(task); - } - if (files) { - /* -* We are not taking a ref to the file structure, so we must -* hold 
-file_lock. -*/ - spin_lock(files-file_lock); - file = fcheck_files(files, fd); - if (file) { - unsigned int f_flags; - struct fdtable *fdt; - - fdt = files_fdtable(files); - f_flags = file-f_flags ~O_CLOEXEC; - if (close_on_exec(fd, fdt)) - f_flags |= O_CLOEXEC; - - if (path) { - *path = file-f_path; -
Mmap on SSD (directly mapping the device vs. mapping a file)
Hi, folks!

As you can see from the subject, I have been experimenting a little with
mmap recently. I've written a small B+tree library which uses mmap to
store the tree in a file or on the whole device (that is, it is also
possible to map the raw device, e.g. /dev/sdb). I call msync after every
successful change to the tree.

The next thing I did was to use this for a small benchmark of different
storage devices (ramdisk, HDD, and a very fast flash card directly
attached to the PCIe bus). I noticed that in almost all cases the
file-mapped version was a little slower than mapping the device directly
without a filesystem. But when I tried it on an SSD device the
file-mapped version was an order of magnitude faster.

I also tried secure erase and ran the benchmarks many times and in many
configurations, but I got the same results. The last thing I tried was
the different queue schedulers, which changed nothing.

In one of the posts from January I read that there is a performance bug
when reading directly from the raw SSD device, but I didn't find any
other comment confirming this.

For the benchmarks I used a current Ubuntu with a 3.2.16 kernel (from
kernel.org).

Is this behavior normal, or did I miss something?
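For reference, the access pattern described in this message (store through a shared mapping, then msync after every change) can be sketched as below. This is an assumption-laden illustration, not the poster's library: mapped_store is a made-up helper, and the ftruncate() call only makes sense for a regular file, so a raw block device such as /dev/sdb would need that step skipped.

```c
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Maps `len` bytes of `fd`, copies `src` into the mapping and flushes it
 * with a synchronous msync(). Returns 0 on success, -1 on error.
 * Assumes `fd` refers to a regular file opened read-write.
 */
static int mapped_store(int fd, const void *src, size_t len)
{
	void *map;
	int rc;

	/* Make sure the file is large enough to back the mapping. */
	if (ftruncate(fd, (off_t)len) != 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	memcpy(map, src, len);
	rc = msync(map, len, MS_SYNC);	/* blocks until written back */

	if (munmap(map, len) != 0)
		rc = -1;
	return rc;
}
```

With MS_SYNC on every update, each change waits for writeback, so the benchmark numbers are likely dominated by how the underlying path (filesystem versus raw block device) handles small synchronous writes.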
[patch 4/8] fs, exportfs: Add export_encode_inode_fh helper
To provide fsnotify object inodes being watched without
binding to alphabetical path we need to encode them with
exportfs help. This patch adds a helper which operates
with plain inodes directly.

Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
Acked-by: Pavel Emelyanov xe...@parallels.com
CC: Al Viro v...@zeniv.linux.org.uk
CC: Alexey Dobriyan adobri...@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: James Bottomley jbottom...@parallels.com
---
 fs/exportfs/expfs.c      |   19 +++
 include/linux/exportfs.h |    2 ++
 2 files changed, 21 insertions(+)

Index: linux-2.6.git/fs/exportfs/expfs.c
===
--- linux-2.6.git.orig/fs/exportfs/expfs.c
+++ linux-2.6.git/fs/exportfs/expfs.c
@@ -302,6 +302,25 @@ out:
 	return error;
 }
 
+int export_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len)
+{
+	int len = *max_len;
+	int type = FILEID_INO32_GEN;
+
+	if (len < 2) {
+		*max_len = 2;
+		return 255;
+	}
+
+	len = 2;
+	fid->i32.ino = inode->i_ino;
+	fid->i32.gen = inode->i_generation;
+	*max_len = len;
+
+	return type;
+}
+EXPORT_SYMBOL_GPL(export_encode_inode_fh);
+
 /**
  * export_encode_fh - default export_operations->encode_fh function
  * @inode:   the object to encode
Index: linux-2.6.git/include/linux/exportfs.h
===
--- linux-2.6.git.orig/include/linux/exportfs.h
+++ linux-2.6.git/include/linux/exportfs.h
@@ -177,6 +177,8 @@ struct export_operations {
 	int (*commit_metadata)(struct inode *inode);
 };
 
+extern int export_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len);
+
 extern int exportfs_encode_fh(struct dentry *dentry, struct fid *fid,
 	int *max_len, int connectable);
 extern struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
Re: [PATCH] crypto: twofish - add x86_64/avx assembler implementation
On Wed, Aug 15, 2012 at 11:42:16AM +0300, Jussi Kivilinna wrote:
> I started thinking about the performance on AMD Bulldozer.
> vmovq/vmovd/vpextr*/vpinsr* between FPU and general purpose registers
> on AMD CPU is a lot slower (latencies from 8 to 12 cycles) than on
> Intel sandy-bridge (where instructions have latency of 1 to 2). See:
> http://www.agner.org/optimize/instruction_tables.pdf
>
> It would be really good, if implementation could be tested on AMD CPU
> to determine, if it causes performance regression. However I don't
> have access to machine with such CPU.

But I do. :)

And if you tell me exactly how to run the tests and on what kernel, I'll
try to do so.

HTH.

-- 
Regards/Gruss,
Boris.
[PATCH] kconfig: remove CONFIG_MTD_NAND_VERIFY_WRITE
Just as Artem suggested: Both UBI and JFFS2 are able to read verify what they wrote already. There are also MTD tests which do this verification. So I think there is no reason to keep this in the NAND layer, let alone wasting RAM in the driver to support this feature. So kill MTD_NAND_VERIFY_WRITE entirely. Please see the patch: http://lists.infradead.org/pipermail/linux-mtd/2012-August/043189.html This patch removes the CONFIG_MTD_NAND_VERIFY_WRITE in the defconfigs. Signed-off-by: Huang Shijie b32...@freescale.com --- arch/arm/configs/bcmring_defconfig |1 - arch/arm/configs/cam60_defconfig|1 - arch/arm/configs/corgi_defconfig|1 - arch/arm/configs/ep93xx_defconfig |1 - arch/arm/configs/mini2440_defconfig |1 - arch/arm/configs/mv78xx0_defconfig |1 - arch/arm/configs/nhk8815_defconfig |1 - arch/arm/configs/orion5x_defconfig |1 - arch/arm/configs/pxa3xx_defconfig |1 - arch/arm/configs/spitz_defconfig|1 - arch/blackfin/configs/BF561-ACVILON_defconfig |1 - arch/mips/configs/rb532_defconfig |1 - arch/powerpc/configs/83xx/mpc8313_rdb_defconfig |1 - arch/powerpc/configs/83xx/mpc8315_rdb_defconfig |1 - arch/powerpc/configs/mpc83xx_defconfig |1 - 15 files changed, 0 insertions(+), 15 deletions(-) diff --git a/arch/arm/configs/bcmring_defconfig b/arch/arm/configs/bcmring_defconfig index 9e6a8fe..6c389d9 100644 --- a/arch/arm/configs/bcmring_defconfig +++ b/arch/arm/configs/bcmring_defconfig @@ -44,7 +44,6 @@ CONFIG_MTD_CFI_ADV_OPTIONS=y CONFIG_MTD_CFI_GEOMETRY=y # CONFIG_MTD_CFI_I2 is not set CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_BCM_UMI=y CONFIG_MTD_NAND_BCM_UMI_HWCS=y # CONFIG_MISC_DEVICES is not set diff --git a/arch/arm/configs/cam60_defconfig b/arch/arm/configs/cam60_defconfig index cedc92e..1457971 100644 --- a/arch/arm/configs/cam60_defconfig +++ b/arch/arm/configs/cam60_defconfig @@ -49,7 +49,6 @@ CONFIG_MTD_COMPLEX_MAPPINGS=y CONFIG_MTD_PLATRAM=m CONFIG_MTD_DATAFLASH=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y 
CONFIG_MTD_NAND_ATMEL=y CONFIG_BLK_DEV_LOOP=y CONFIG_BLK_DEV_RAM=y diff --git a/arch/arm/configs/corgi_defconfig b/arch/arm/configs/corgi_defconfig index e53c475..4b8a25d 100644 --- a/arch/arm/configs/corgi_defconfig +++ b/arch/arm/configs/corgi_defconfig @@ -97,7 +97,6 @@ CONFIG_MTD_BLOCK=y CONFIG_MTD_ROM=y CONFIG_MTD_COMPLEX_MAPPINGS=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_SHARPSL=y CONFIG_BLK_DEV_LOOP=y CONFIG_IDE=y diff --git a/arch/arm/configs/ep93xx_defconfig b/arch/arm/configs/ep93xx_defconfig index 8e97b2f..806005a 100644 --- a/arch/arm/configs/ep93xx_defconfig +++ b/arch/arm/configs/ep93xx_defconfig @@ -61,7 +61,6 @@ CONFIG_MTD_CFI_STAA=y CONFIG_MTD_ROM=y CONFIG_MTD_PHYSMAP=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_BLK_DEV_NBD=y CONFIG_EEPROM_LEGACY=y CONFIG_SCSI=y diff --git a/arch/arm/configs/mini2440_defconfig b/arch/arm/configs/mini2440_defconfig index 082175c..00630e6 100644 --- a/arch/arm/configs/mini2440_defconfig +++ b/arch/arm/configs/mini2440_defconfig @@ -102,7 +102,6 @@ CONFIG_MTD_CFI_STAA=y CONFIG_MTD_RAM=y CONFIG_MTD_ROM=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_S3C2410=y CONFIG_MTD_NAND_PLATFORM=y CONFIG_MTD_LPDDR=y diff --git a/arch/arm/configs/mv78xx0_defconfig b/arch/arm/configs/mv78xx0_defconfig index 7305ebd..1f08219 100644 --- a/arch/arm/configs/mv78xx0_defconfig +++ b/arch/arm/configs/mv78xx0_defconfig @@ -49,7 +49,6 @@ CONFIG_MTD_CFI_INTELEXT=y CONFIG_MTD_CFI_AMDSTD=y CONFIG_MTD_PHYSMAP=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_ORION=y CONFIG_BLK_DEV_LOOP=y # CONFIG_SCSI_PROC_FS is not set diff --git a/arch/arm/configs/nhk8815_defconfig b/arch/arm/configs/nhk8815_defconfig index bf123c5..240b25e 100644 --- a/arch/arm/configs/nhk8815_defconfig +++ b/arch/arm/configs/nhk8815_defconfig @@ -57,7 +57,6 @@ CONFIG_MTD_CHAR=y CONFIG_MTD_BLOCK=y CONFIG_MTD_NAND=y CONFIG_MTD_NAND_ECC_SMC=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_NOMADIK=y 
CONFIG_MTD_ONENAND=y CONFIG_MTD_ONENAND_VERIFY_WRITE=y diff --git a/arch/arm/configs/orion5x_defconfig b/arch/arm/configs/orion5x_defconfig index a288d70..cd5e6ba 100644 --- a/arch/arm/configs/orion5x_defconfig +++ b/arch/arm/configs/orion5x_defconfig @@ -72,7 +72,6 @@ CONFIG_MTD_CFI_INTELEXT=y CONFIG_MTD_CFI_AMDSTD=y CONFIG_MTD_PHYSMAP=y CONFIG_MTD_NAND=y -CONFIG_MTD_NAND_VERIFY_WRITE=y CONFIG_MTD_NAND_PLATFORM=y CONFIG_MTD_NAND_ORION=y CONFIG_BLK_DEV_LOOP=y diff --git a/arch/arm/configs/pxa3xx_defconfig b/arch/arm/configs/pxa3xx_defconfig index 1677a06..60e3138 100644 --- a/arch/arm/configs/pxa3xx_defconfig +++ b/arch/arm/configs/pxa3xx_defconfig @@ -36,7 +36,6 @@ CONFIG_MTD_CONCAT=y CONFIG_MTD_CHAR=y
[PATCH] video:uvesafb: reduce the double check
uvesafb_open already checks par->vbe_state_size, so there is no need
to check it again in uvesafb_vbe_state_save. This patch just removes
a few lines of code.

Signed-off-by: Wang YanQing udkni...@gmail.com
---
 drivers/video/uvesafb.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/video/uvesafb.c b/drivers/video/uvesafb.c
index 2f8f82d..b064a3e 100644
--- a/drivers/video/uvesafb.c
+++ b/drivers/video/uvesafb.c
@@ -357,9 +357,6 @@ static u8 *uvesafb_vbe_state_save(struct uvesafb_par *par)
 	u8 *state;
 	int err;
 
-	if (!par->vbe_state_size)
-		return NULL;
-
 	state = kmalloc(par->vbe_state_size, GFP_KERNEL);
 	if (!state)
 		return ERR_PTR(-ENOMEM);
-- 
1.7.11.1.116.g8228a23
Re: [PATCH 7/8] kbuild: move W=... stuff to Kbuild.arch
On Wed, 2012-06-06 at 17:35 +0200, Sam Ravnborg wrote:
> On Wed, Jun 06, 2012 at 01:18:47PM +0300, Artem Bityutskiy wrote:
> > On Sat, 2012-05-05 at 10:18 +0200, Sam Ravnborg wrote:
> > > Prevent that we evaluate cc-option multiple times for the same
> > > option by moving the definitions to Kbuild.arch.
> > > The file is included once only, thus gcc is not invoked once per
> > > directory.
> > >
> > > Another side-effect of this patch is that KCFLAGS are appended last
> > > to the list of options. This allows us to better control the
> > > options.
> > > Artem Bityutskiy dedeki...@gmail.com noticed this.
> > >
> > > Signed-off-by: Sam Ravnborg s...@ravnborg.org
> > > Cc: Artem Bityutskiy dedeki...@gmail.com
> >
> > Hi, what happened to this patch? I was fixing the real issue I am
> > encountering and I thought it'd be taken instead of my original patch.
>
> We decided to move this to next merge release because it was not
> added to kbuild thus not enough exposure in -next.
>
> I am planning to resend the series at around -rc2 time.

Hi Sam, what happened to this patch-set? At least KCFLAGS patches I am
waiting for are still not merged.

-- 
Best Regards,
Artem Bityutskiy
Re: [PATCH v2 04/11] kmem accounting basic infrastructure
> > We always account to both user and kernel resource_counters. This
> > effectively means that an independent kernel limit is in place when the
> > limit is set to a lower value than the user memory. An equal or higher
> > value means that the user limit will always hit first, meaning that kmem
> > is effectively unlimited.
>
> Well, it contributes to the user limit so it is not unlimited. It just
> falls under a different limit and it tends to contribute less.

You are right, but this is just wording. I will update it, but what I
really mean here is that an independent limit is not imposed on kmem.

> This can be quite confusing. I am still not sure whether we should mix
> the two things together. If somebody wants to limit the kernel memory he
> has to touch the other limit anyway. Do you have a strong reason to mix
> the user and kernel counters?

This is funny, because the first opposition I found to this work was
"Why would anyone want to limit it separately?" =p

It seems that a quite common use case is to have a container with a
unified view of memory that it can use the way he likes, be it with
kernel memory, or user memory. I believe those people would be happy to
just silently account kernel memory to user memory, or at the most have
a switch to enable it.

What gets clear from this back and forth, is that there are people
interested in both use cases.

> My impression was that kernel allocation should simply fail while user
> allocations might reclaim as well. Why should we reclaim just because of
> the kernel allocation (which is unreclaimable from hard limit reclaim
> point of view)?

That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not all kernel memory
is unreclaimable. We can shrink the slabs, for instance. Ying Han
claims she has patches for that already...

> I also think that the whole thing would get much simpler if those two
> are split. Anyway if this is really a must then this should be
> documented here.

Well, documentation can't hurt.

> This doesn't check for the hierarchy so kmem_accounted might not be in
> sync with its parents. mem_cgroup_create (below) needs to copy
> kmem_accounted down from the parent and the above needs to check if this
> is a similar dance like mem_cgroup_oom_control_write.

I don't see why we have to.

I believe in a A/B/C hierarchy, C should be perfectly able to set a
different limit than its parents. Note that this is not a boolean.

Also, right now, C can become completely unlimited (by not setting a
limit) and this is, indeed, not the desired behavior.

A later patch will change kmem_accounted to a bitfield, and we'll use
one of the bits to signal that we should account kmem because our parent
is limited.
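The accounting rule being debated (kernel memory is charged to both the kmem counter and the combined counter, so whichever limit is lower hits first) can be illustrated with a toy model. Everything below is hypothetical and much simpler than the kernel's res_counter code: struct counter, struct memcg_sketch and the charge helpers are names invented for this sketch, and there is no reclaim or hierarchy.

```c
#include <stdbool.h>

/* Hypothetical miniature of a resource counter; not the kernel's struct. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

struct memcg_sketch {
	struct counter res;	/* user + kernel memory */
	struct counter kmem;	/* kernel memory only */
};

static bool counter_charge(struct counter *c, unsigned long n)
{
	if (c->usage + n > c->limit)
		return false;
	c->usage += n;
	return true;
}

/* Kernel allocations are charged to both counters, or to neither. */
static bool charge_kmem(struct memcg_sketch *m, unsigned long n)
{
	if (!counter_charge(&m->kmem, n))
		return false;
	if (!counter_charge(&m->res, n)) {
		m->kmem.usage -= n;	/* roll back the first charge */
		return false;
	}
	return true;
}

/* User allocations only touch the combined counter. */
static bool charge_user(struct memcg_sketch *m, unsigned long n)
{
	return counter_charge(&m->res, n);
}
```

In this model a kmem limit below the combined limit acts as an independent cap on kernel memory, while a kmem limit at or above it never triggers on its own, which is the behavior the first quoted paragraph describes.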
Re: [PATCH v4 1/1] HID:hid-multitouch: Add ELAN production request when resume.
On Wed, 15 Aug 2012, Scott Liu wrote: Add ELAN production request when resume. Some Elan legacy devices require SET_IDLE to be set on resume. It should be safe to send it to other devices too. Tested on 3M, Stantum, Cypress, Zytronic, eGalax, and Elan panels. Signed-off-by: Scott Liu scott@emc.com.tw Suggested-off-by: Benjamin Tissoires benjamin.tissoi...@enac.fr --- drivers/hid/hid-multitouch.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 59c8b5c..e824c37 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -767,6 +767,32 @@ static int mt_reset_resume(struct hid_device *hdev) mt_set_input_mode(hdev); return 0; } + +static int mt_resume(struct hid_device *hdev) +{ + struct usb_interface *intf; + struct usb_host_interface *interface; + struct usb_device *dev; + + if (hdev-bus != BUS_USB) + return 0; + + intf = to_usb_interface(hdev-dev.parent); + interface = intf-cur_altsetting; + dev = hid_to_usb_dev(hdev); + + /* Some Elan legacy devices require SET_IDLE to be set on resume. + * It should be safe to send it to other devices too. + * Tested on 3M, Stantum, Cypress, Zytronic, eGalax, and Elan panels. */ + + usb_control_msg(dev, usb_sndctrlpipe(dev, 0), + HID_REQ_SET_IDLE, + USB_TYPE_CLASS | USB_RECIP_INTERFACE, + 0, interface-desc.bInterfaceNumber, + NULL, 0, USB_CTRL_SET_TIMEOUT); + + return 0; +} #endif static void mt_remove(struct hid_device *hdev) @@ -1092,6 +1118,7 @@ static struct hid_driver mt_driver = { .event = mt_event, #ifdef CONFIG_PM .reset_resume = mt_reset_resume, + .resume = mt_resume, #endif }; I am now queuing this one in my tree. If it later turns out that some devices actually don't like this request (which one would hope is very unlinkely to happen), we'll have to make it device specific. 
Thanks, -- Jiri Kosina SUSE Labs
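For readers who want to see what the request in mt_resume() looks like on the wire, here is a minimal standalone sketch of the SETUP packet it corresponds to. The constant values follow the USB 2.0 and HID 1.11 specifications; the struct and the helper make_set_idle() are hypothetical names invented for this illustration and are not part of the patch or of any kernel API.

```c
#include <stdint.h>

/* Values from the USB 2.0 and HID 1.11 specifications. */
#define USB_DIR_OUT          0x00
#define USB_TYPE_CLASS       (0x01 << 5)
#define USB_RECIP_INTERFACE  0x01
#define HID_REQ_SET_IDLE     0x0A

/* Hypothetical helper type: the 8-byte SETUP packet of a control transfer. */
struct setup_packet {
	uint8_t  bmRequestType;
	uint8_t  bRequest;
	uint16_t wValue;
	uint16_t wIndex;
	uint16_t wLength;
};

/* Build SET_IDLE(duration = 0, report ID = 0) for the given interface. */
struct setup_packet make_set_idle(uint8_t interface_number)
{
	struct setup_packet p;

	p.bmRequestType = USB_DIR_OUT | USB_TYPE_CLASS | USB_RECIP_INTERFACE;
	p.bRequest = HID_REQ_SET_IDLE;
	p.wValue = 0;   /* upper byte: idle duration, lower byte: report ID */
	p.wIndex = interface_number;
	p.wLength = 0;  /* no data stage */
	return p;
}
```

The fields map one-to-one onto the arguments passed to usb_control_msg() in the patch: request, request type, value 0, the interface number as index, and an empty data buffer.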
Re: [PATCH 0/7] HID: picoLCD updates
Hi Jiri,

On Wed, 15 August 2012 Jiri Kosina jkos...@suse.cz wrote:
On Mon, 30 Jul 2012, Bruno Prémont wrote:

Hi, This series updates picoLCD driver:
- split the driver functions into separate files which get included depending on Kconfig selection (implementation for CIR using RC_CORE will follow later)
- drop private framebuffer refcounting in favor of refcounting added to fb_info some time ago
- fix various bugs & issues
- disabled firmware version checking in probe() as it does not work anymore since commit 4ea5454203d991ec85264f64f89ca8855fce69b0 [HID: Fix race condition between driver core and ll-driver]

I have now applied the series to my 'picolcd' branch, except for 6/7, please see the comment I sent to it separately.

Will respin that one soon

Note: I still get weird behavior on quick unbind/bind sequences issued via sysfs (CONFIG_SMP=n system) that are triggered by framebuffer support and apparently more specifically fb_defio part of it. Unfortunately I'm out of ideas as to how to track down the problem which shows either as SLAB corruption (detected with SLUB debugging, e.g.

Would be nice to have this sorted out before the next merge window indeed, so that it can go in together with the rest of the changes.

[ 6383.521833] =
[ 6383.530020] BUG kmalloc-64 (Not tainted): Object already free
[ 6383.530020] -
[ 6383.530020]
[ 6383.530020] INFO: Slab 0xdde0ea20 objects=51 used=40 fp=0xcef516e0 flags=0x4080
[ 6383.530020] INFO: Object 0xcef51190 @offset=400 fp=0xcef51f50
[ 6383.530020]
[ 6383.530020] Bytes b4 cef51180: cc cc cc cc d0 12 f5 ce 5a 5a 5a 5a 5a 5a 5a 5a
[ 6383.530020] Object cef51190: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[ 6383.530020] Object cef511a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[ 6383.530020] Object cef511b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[ 6383.530020] Object cef511c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk.
[ 6383.530020] Redzone cef511d0: bb bb bb bb
[ 6383.530020] Padding cef511d8: 5a 5a 5a 5a 5a 5a 5a 5a
[ 6383.530020] Pid: 1922, comm: bash Not tainted 3.5.0-jupiter-3-g8d858b1-dirty #2
[ 6383.530020] Call Trace:
[ 6383.530020] [c10bd3cc] print_trailer+0x11c/0x130
[ 6383.530020] [c10bd415] object_err+0x35/0x40
[ 6383.530020] [c10be809] free_debug_processing+0x99/0x200
[ 6383.530020] [c10bf77e] __slab_free+0x2e/0x280
[ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120
[ 6383.530020] [c1322870] ? __usbhid_submit_report+0xc0/0x3c0
[ 6383.530020] [c10bfbda] ? kfree+0xfa/0x110
[ 6383.530020] [de932aa4] ? picolcd_debug_out_report+0x8c4/0x8e0 [hid_picolcd]
[ 6383.530020] [c10bfbda] kfree+0xfa/0x110
[ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120
[ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120
[ 6383.530020] [c1322284] ? hid_submit_out+0xa4/0x120
[ 6383.530020] [c1322284] hid_submit_out+0xa4/0x120
[ 6383.530020] [c1322908] __usbhid_submit_report+0x158/0x3c0
[ 6383.530020] [c1322c2b] usbhid_submit_report+0x1b/0x30
[ 6383.530020] [de930789] picolcd_fb_reset+0xb9/0x180 [hid_picolcd]
[ 6383.530020] [de930f1d] picolcd_init_framebuffer+0x20d/0x2e0 [hid_picolcd]
[ 6383.530020] [de92fb9c] picolcd_probe+0x3cc/0x580 [hid_picolcd]
[ 6383.530020] [c1319147] hid_device_probe+0x67/0xf0
[ 6383.530020] [c1282f97] ? driver_sysfs_add+0x57/0x80
[ 6383.530020] [c128329d] driver_probe_device+0xbd/0x1c0
[ 6383.530020] [c1318a1b] ? hid_match_device+0x7b/0x90
[ 6383.530020] [c12821e5] driver_bind+0x75/0xd0
[ 6383.530020] [c1282170] ? driver_unbind+0x90/0x90
[ 6383.530020] [c12818b7] drv_attr_store+0x27/0x30
[ 6383.530020] [c1114aec] sysfs_write_file+0xac/0xf0
[ 6383.530020] [c10c794c] vfs_write+0x9c/0x130
[ 6383.530020] [c10d4a1f] ? sys_dup3+0x11f/0x160
[ 6383.530020] [c1114a40] ? sysfs_poll+0x90/0x90
[ 6383.530020] [c10c7bbd] sys_write+0x3d/0x70
[ 6383.530020] [c13f2557] sysenter_do_call+0x12/0x26

So I am wondering whether the path this happens on is

	if (!test_bit(HID_OUT_RUNNING, &usbhid->iofl)) {
		usbhid_restart_out_queue(usbhid);

in __usbhid_submit_report(). It would then indicate perhaps some race with iofl handling.

Huh, that specific test_bit hunk I can't find in __usbhid_submit_report, is that 3.6 material? I'm running my tests against 3.5... The nearest I have is:

	if (!test_bit(HID_OUT_RUNNING, &usbhid->iofl))
		if (!irq_out_pump_restart(hid))
[PATCH] act_mirred: do not drop packets when fails to mirror it
We drop the packet unconditionally when we fail to mirror it. This is not intended in some cases. Consider a kvm guest: we may mirror the traffic of the bridge to a tap device used by a VM. When the kernel fails to mirror the packet, for example when qemu crashes or stops polling the tap, it is hard for the management software to detect such a condition and clean up the mirroring beforehand. This would lead to all packets to the bridge being dropped and break the network of other virtual machines.

To solve the issue, the patch does not drop packets when the kernel fails to mirror them, and only drops the redirected packets.

Signed-off-by: Jason Wang jasow...@redhat.com
---
 net/sched/act_mirred.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index fe81cc1..3682951 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -198,15 +198,12 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 	err = dev_queue_xmit(skb2);
 out:
-	if (err) {
+	if (err)
 		m->tcf_qstats.overlimits++;
-		/* should we be asking for packet to be dropped?
-		 * may make sense for redirect case only
-		 */
+	if (err && m->tcf_action == TC_ACT_STOLEN)
 		retval = TC_ACT_SHOT;
-	} else {
+	else
 		retval = m->tcf_action;
-	}
 	spin_unlock(&m->tcf_lock);
 	return retval;
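The decision made at the "out:" label after this patch can be restated as a small pure function, which makes the three cases easier to check by hand. This is only a sketch: mirred_verdict() and its parameter names are hypothetical, and the TC_ACT_* values are copied from the kernel's pkt_cls.h so that the example compiles on its own.

```c
/* Verdict values as defined in the kernel's pkt_cls.h. */
enum { TC_ACT_OK = 0, TC_ACT_SHOT = 2, TC_ACT_PIPE = 3, TC_ACT_STOLEN = 4 };

/*
 * Hypothetical helper restating the "out:" block of tcf_mirred() after the
 * patch: err is the result of transmitting the clone, configured_action
 * stands in for m->tcf_action.
 */
int mirred_verdict(int err, int configured_action)
{
	/* Redirect: the original packet was handed over, so a failure loses it. */
	if (err && configured_action == TC_ACT_STOLEN)
		return TC_ACT_SHOT;

	/* Mirror failure, or success: the original continues as configured. */
	return configured_action;
}
```

With this shape, a failed mirror configured with TC_ACT_PIPE still lets the original packet continue down the pipeline, which is the behaviour the changelog asks for.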
Re: [PATCH v2 06/11] memcg: kmem controller infrastructure
+ * memcg_kmem_new_page: verify if a new kmem allocation is allowed.
+ * @gfp: the gfp allocation flags.
+ * @handle: a pointer to the memcg this was charged against.
+ * @order: allocation order.
+ *
+ * returns true if the memcg where the current task belongs can hold this
+ * allocation.
+ *
+ * We return true automatically if this allocation is not to be accounted to
+ * any memcg.
+ */
+static __always_inline bool
+memcg_kmem_new_page(gfp_t gfp, void *handle, int order)
+{
+	if (!memcg_kmem_on)
+		return true;
+	if (!(gfp & __GFP_KMEMCG) || (gfp & __GFP_NOFAIL))

OK, I see the point behind __GFP_NOFAIL but it would deserve a comment or a mention in the changelog.

documentation can't hurt! Just added.

[...]
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 54e93de..e9824c1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
[...]
+EXPORT_SYMBOL(__memcg_kmem_new_page);

Why is this exported?

It shouldn't be. Removed.

+
+void __memcg_kmem_commit_page(struct page *page, void *handle, int order)
+{
+	struct page_cgroup *pc;
+	struct mem_cgroup *memcg = handle;
+
+	if (!memcg)
+		return;
+
+	WARN_ON(mem_cgroup_is_root(memcg));
+	/* The page allocation must have failed. Revert */
+	if (!page) {
+		size_t size = PAGE_SIZE << order;
+
+		memcg_uncharge_kmem(memcg, size);
+		mem_cgroup_put(memcg);
+		return;
+	}
+
+	pc = lookup_page_cgroup(page);
+	lock_page_cgroup(pc);
+	pc->mem_cgroup = memcg;
+	SetPageCgroupUsed(pc);

Don't we need a write barrier before assigning memcg? Same as __mem_cgroup_commit_charge. This tests the Used bit always from within lock_page_cgroup so it should be safe but I am not 100% sure about the rest of the code.

Well, I don't see the reason, precisely because we'll always grab it from within the locked region. That should ensure all the necessary serialization.

+#ifdef CONFIG_MEMCG_KMEM
+int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, s64 delta)
+{
+	struct res_counter *fail_res;
+	struct mem_cgroup *_memcg;
+	int ret;
+	bool may_oom;
+	bool nofail = false;
+
+	may_oom = (gfp & __GFP_WAIT) && (gfp & __GFP_FS) &&
+	    !(gfp & __GFP_NORETRY);

This deserves a comment.

can't hurt!! =)

+
+	ret = 0;
+
+	if (!memcg)
+		return ret;
+
+	_memcg = memcg;
+	ret = __mem_cgroup_try_charge(NULL, gfp, delta / PAGE_SIZE,
+	    &_memcg, may_oom);

This is really dangerous because atomic allocation which seem to be possible could result in deadlocks because of the reclaim.

Can you elaborate on how this would happen?

Also, as I have mentioned in the other email in this thread. Why should we reclaim just because of kernel allocation when we are not reclaiming any of it because shrink_slab is ignored in the memcg reclaim.

Don't get too distracted by the fact that shrink_slab is ignored. It is temporary, and while this being ignored now leads to suboptimal behavior, it will 1st, only affect its users, and 2nd, not be disastrous. I see this as more or less on par with the soft limit reclaim problem we had. It is not ideal, but it already provided functionality

+
+	if (ret == -EINTR) {
+		nofail = true;
+		/*
+		 * __mem_cgroup_try_charge() chosed to bypass to root due to
+		 * OOM kill or fatal signal. Since our only options are to
+		 * either fail the allocation or charge it to this cgroup, do
+		 * it as a temporary condition. But we can't fail. From a
+		 * kmem/slab perspective, the cache has already been selected,
+		 * by mem_cgroup_get_kmem_cache(), so it is too late to change
+		 * our minds
+		 */
+		res_counter_charge_nofail(&memcg->res, delta, &fail_res);
+		if (do_swap_account)
+			res_counter_charge_nofail(&memcg->memsw, delta,
+			    &fail_res);

Hmmm, this is kind of ugly but I guess unavoidable with the current implementation. Oh well...

Oh well...
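Since the may_oom expression was flagged as deserving a comment, a standalone restatement may help: the charge is allowed to trigger the OOM path only when the caller may sleep, may enter the filesystem, and has not asked to fail fast. The flag names and bit positions below are placeholders chosen for this sketch, not the kernel's gfp.h values, and charge_may_oom() is a hypothetical helper.

```c
#include <stdbool.h>

/* Illustrative flag bits only; the real values live in the kernel's gfp.h. */
#define SKETCH_GFP_WAIT     (1u << 0)  /* caller may sleep */
#define SKETCH_GFP_FS       (1u << 1)  /* caller may call into the filesystem */
#define SKETCH_GFP_NORETRY  (1u << 2)  /* caller prefers failure over retrying */

/* Hypothetical helper restating the may_oom expression from the patch. */
bool charge_may_oom(unsigned int gfp)
{
	return (gfp & SKETCH_GFP_WAIT) &&
	       (gfp & SKETCH_GFP_FS) &&
	       !(gfp & SKETCH_GFP_NORETRY);
}
```

An atomic-style request (no WAIT bit) therefore never reaches the OOM path in this sketch, which is the property the reviewer's deadlock question is probing.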
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Wed, Aug 15, 2012 at 12:25:28PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:05:28AM +0100, Mel Gorman wrote: On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote: On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote: What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight. I liked your suggestion on barriers, as well. I have not thought about this as deeply as I should but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page->mapping. To clarify, are you concerned about cost of rcu_read_lock for non balloon pages? Not as such, but given the choice between introducing RCU locking and rechecking page->mapping under a spinlock I would choose the latter as it is more straight-forward. -- Mel Gorman SUSE Labs
Re: [Qemu-devel] [PATCH v8] kvm: notify host when the guest is panicked
On Tue, Aug 14, 2012 at 02:35:34PM -0500, Anthony Liguori wrote: Do you consider allowing support for Windows as overengineering? I don't think there is a way to hook BSOD on Windows so attempting to engineer something that works with Windows seems odd, no? Yan says in another email that it is possible to register a bugcheck callback. -- Gleb.
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Wed, Aug 15, 2012 at 10:48:39AM +0100, Mel Gorman wrote: On Wed, Aug 15, 2012 at 12:25:28PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:05:28AM +0100, Mel Gorman wrote: On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote: On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote: What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight. I liked your suggestion on barriers, as well. I have not thought about this as deeply as I should but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page->mapping. To clarify, are you concerned about cost of rcu_read_lock for non balloon pages? Not as such, but given the choice between introducing RCU locking and rechecking page->mapping under a spinlock I would choose the latter as it is more straight-forward.

OK but checking it how? page->mapping == balloon_mapping does not scale to multiple balloons, so I hoped we can switch to page->mapping->flags & BALLOON_MAPPING or some such, but this means we dereference it outside the lock ... We will also need to add some API to set/clear mapping so that driver does not need to poke in mm internals, but that's easy.

-- Mel Gorman SUSE Labs
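To make the two ideas in this exchange concrete (recheck the mapping with a lock held, and identify balloon pages by a flag rather than by pointer equality), here is a userspace sketch. It assumes one global lock shared by every writer of page->mapping, which deliberately sidesteps the open question in the thread about per-balloon locks. The struct layouts, BALLOON_MAPPING_FLAG, and all function names are hypothetical stand-ins, not kernel definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BALLOON_MAPPING_FLAG 0x1u  /* hypothetical flag bit */

struct mapping { unsigned int flags; };
struct page { struct mapping *mapping; };

/* Hypothetical stand-in for pages_lock; assumed global in this sketch. */
static atomic_flag pages_lock = ATOMIC_FLAG_INIT;

static void lock_pages(void)
{
	while (atomic_flag_test_and_set_explicit(&pages_lock,
						 memory_order_acquire))
		;  /* spin until the flag was previously clear */
}

static void unlock_pages(void)
{
	atomic_flag_clear_explicit(&pages_lock, memory_order_release);
}

/* Re-read page->mapping with the lock held, then classify by flag. */
bool page_is_balloon(struct page *page)
{
	bool ret;

	lock_pages();
	ret = page->mapping != NULL &&
	      (page->mapping->flags & BALLOON_MAPPING_FLAG) != 0;
	unlock_pages();
	return ret;
}

/* Clearing the mapping under the same lock, as a driver-side setter might. */
void page_clear_balloon(struct page *page)
{
	lock_pages();
	page->mapping = NULL;
	unlock_pages();
}
```

Because the flag test works for any mapping that carries the bit, it would cover several balloons at once; whether a single lock is acceptable is exactly what the thread leaves undecided.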
RE: [NEW DRIVER V2 5/7] DA9058 GPIO driver
-----Original Message-----
From: Linus Walleij [mailto:linus.wall...@linaro.org]
Sent: 13 August 2012 14:10
To: Opensource [Anthony Olech]
Cc: Grant Likely; Linus Walleij; Mark Brown; LKML; David Dajun Chen; Samuel Ortiz; Lee Jones
Subject: Re: [NEW DRIVER V2 5/7] DA9058 GPIO driver

Hi Anthony, sorry for delayed reply...

On Sun, Aug 5, 2012 at 10:43 PM, Anthony Olech anthony.olech.opensou...@diasemi.com wrote:

This is the GPIO component driver of the Dialog DA9058 PMIC. This driver is just one component of the whole DA9058 PMIC driver. It depends on the core DA9058 MFD driver.

OK

+config GPIO_DA9058
+	tristate "Dialog DA9058 GPIO"
+	depends on MFD_DA9058

select IRQ_DOMAIN, you're going to want to use it...

diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 0f55662..209224a 100644
(...)
+#include <linux/module.h>
+#include <linux/fs.h>

Really?

+#include <linux/uaccess.h>

Really?

+#include <linux/platform_device.h>
+#include <linux/gpio.h>
+#include <linux/syscalls.h>

Really?

+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/mfd/core.h>
+#include <linux/regmap.h>

If you're using regmap you better select it in Kconfig too, but it appears you don't. You should be using regmap in the main MFD driver in this case (I haven't looked at it though.)

This header set just looks like it was copied from some other file and never really proofread, so please go over it in detail.

for some reason against 2.6.2x some of those were required, I just totally forgot to prune out the unwanted ones when rebasing forwards to 3.5. Good that you spotted it! Sorry I will try to prune the includes in the future.

+#include <linux/mfd/da9058/version.h>
+#include <linux/mfd/da9058/registers.h>
+#include <linux/mfd/da9058/core.h>
+#include <linux/mfd/da9058/gpio.h>
+#include <linux/mfd/da9058/irq.h>
+#include <linux/mfd/da9058/pdata.h>

Samuel will have to comment on this organization of headers, it seems a little much. Do you really need all of them?

One of them should have been stripped out by my submit script, but as for the others you must bear in mind that the DA9058 PMIC is a multifunction device, and thus some header files are common and some are specific to various component drivers. The very reason that you picked up on non-relevant include files surely has implications on the structure of the header files for the DA9058, in particular struct's and define's that only apply to one component driver should be in separate header files.

+struct da9058_gpio {
+	struct da9058 *da9058;
+	struct platform_device *pdev;
+	struct gpio_chip gp;
+	struct mutex lock;
+	u8 inp_config;
+	u8 out_config;
+};
+
+static struct da9058_gpio *gpio_chip_to_da9058_gpio(struct gpio_chip *chip)
+{
+	return container_of(chip, struct da9058_gpio, gp);
+}

static inline, or a #define, but the compiler will probably optimize-inline it anyway.

The compiler should optimize it to in-line, but I will change it anyway.

+static int da9058_gpio_get(struct gpio_chip *gc, unsigned offset)
+{
+	struct da9058_gpio *gpio = gpio_chip_to_da9058_gpio(gc);
+	struct da9058 *da9058 = gpio->da9058;
+	unsigned int gpio_level;
+	int ret;
+
+	if (offset > 1)
+		return -EINVAL;

So there are two GPIO pins, 0 and 1? That seems odd, but OK.

That is a feature of the hardware. I believe that calling them 0 and 1 is the correct thing to do. Correct me if they should have been called 1 and 2, or something else.

+	if (offset) {

So this is for GPIO 1

Yes, it seemed the obvious thing to do.

+		u8 value_bits = value ? 0x80 : 0x00;

These value_bits are just confusing. Just delete this and use the direct value below.

Will do. It was done for diagnostics that have since been stripped out.

+
+		gpio->out_config &= ~0x80;

A better way of writing &= ~0x80; is &= 0x7F

+		gpio->out_config |= value_bits;

gpio->out_config = value ? 0x80 : 0x00;

So, less confusing.

see HANDLING NIBBLES below

+	if (!(gpio_cntrl & 0x20))
+		goto exit;

Please insert a comment explaining what this bit is doing and why you're just exiting if it's not set. I don't understand one thing.

I have explained why in the driver source in the next submission attempt

Maybe this would be better if you didn't use so many magic values, what about:

#include <linux/bitops.h>
#define FOO_FLAG BIT(3) /* This is a flag for foo */

+
+	gpio_cntrl &= ~0xF0;

A better way to write &= ~0xF0 is to write &= 0x0F;

If you don't #define the constants this way of negating numbers just gets confusing. So this is OK:

foo &= ~FOO_FLAG;
foo |= set ? FOO_FLAG : 0;

This is just hard to read:

foo &= ~0x55;
foo |= set ? 0x55 : 0;

And is better off

foo &= 0xAA;
foo |=
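The review's point about named bits versus magic numbers can be tried out in a standalone sketch. Both functions below perform the same clear-then-set update on an 8-bit register value: one through a named constant, one through the literal complement. BIT() is redefined locally so the example compiles outside the kernel, and SKETCH_OUT_LEVEL is a hypothetical name for the bit that the patch writes as the bare constant 0x80.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's BIT() helper; redefined so the sketch stands alone. */
#define BIT(n) (1u << (n))

/* Hypothetical name for the bit written as the bare constant 0x80 in the patch. */
#define SKETCH_OUT_LEVEL BIT(7)

/* Clear the flag, then set it again only if requested. */
uint8_t update_flag(uint8_t reg, bool set)
{
	reg &= (uint8_t)~SKETCH_OUT_LEVEL;
	reg |= set ? SKETCH_OUT_LEVEL : 0;
	return reg;
}

/* The other spelling from the review: mask with the literal complement 0x7F. */
uint8_t update_flag_literal(uint8_t reg, bool set)
{
	reg &= 0x7F;
	reg |= set ? 0x80 : 0x00;
	return reg;
}
```

For an 8-bit value the two spellings are interchangeable; the named version mainly saves the reader from working out which bit 0x7F leaves cleared.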
Re: [PATCH 7/8] kbuild: move W=... stuff to Kbuild.arch
On Wed, Aug 15, 2012 at 12:41:23PM +0300, Artem Bityutskiy wrote: On Wed, 2012-06-06 at 17:35 +0200, Sam Ravnborg wrote: On Wed, Jun 06, 2012 at 01:18:47PM +0300, Artem Bityutskiy wrote: On Sat, 2012-05-05 at 10:18 +0200, Sam Ravnborg wrote: Prevent that we evaluate cc-option multiple times for the same option by moving the definitions to Kbuild.arch. The file is included once only, thus gcc is not invoked once per directory. Another side-effect of this patch is that KCFLAGS are appended last to the list of options. This allows us to better control the options. Artem Bityutskiy dedeki...@gmail.com noticed this. Signed-off-by: Sam Ravnborg s...@ravnborg.org Cc: Artem Bityutskiy dedeki...@gmail.com Hi, what happened to this patch? I was fixing the real issue I am encountering and I thought it'd be taken instead of my original patch. We decided to move this to next merge release because it was not added to kbuild thus not enough exposure in -next. I am planning to resend the series at around -rc2 time. Hi Sam, what happened to this patch-set? At least KCFLAGS patches I am waiting for are still not merged. Vacation and then I have not yet gotten back to them. Will do soon - thanks for the reminder! Sam
[perf] make clean problematic bashism
Dear perf maintainers, I attempted to compile perf 3.5.1 without worrying about installing dependencies first. The resulting error messages were quite helpful, and led me to install a bunch of development libraries and flex. Unfortunately, after installing flex the build still failed, even after make clean. The reason for this was a bunch of generated empty flex files in util/ that were not removed by make clean. They are intended to be erased, since the Makefile executes rm -f util/*-{bison,flex}* however, this command does not remove the files. I guess because {,} alternatives are only special in bash but the makefile is run with some other shell? I got perf to compile now, but thought you would be interested to know about this little problem. With kind regards, Wouter Koolen PS: as a side note: GNU make has the .DELETE_ON_ERROR: special target, which removes the target file when its generating command fails. This would have prevented my problem and sounds like a good idea in general. Maybe perf could make use of this feature when on GNU make?
Re: [perf] make clean problematic bashism
On Wed, 2012-08-15 at 11:52 +0200, Wouter M. Koolen wrote: Dear perf maintainers, I attempted to compile perf 3.5.1 without worrying about installing dependencies first. The resulting error messages were quite helpful, and led me to install a bunch of development libraries and flex. Unfortunately, after installing flex the build still failed, even after make clean. The reason for this was a bunch of generated empty flex files in util/ that were not removed by make clean. They are intended to be erased, since the Makefile executes rm -f util/*-{bison,flex}* however, this command does not remove the files. I guess because {,} alternatives are only special in bash but the makefile is run with some other shell? ISTR us getting a number of such patches, did we miss a site, acme? I got perf to compile now, but thought you would be interested to know about this little problem. With kind regards, Wouter Koolen PS: as a side note: GNU make has the .DELETE_ON_ERROR: special target, which removes the target file when its generating command fails. This would have prevented my problem and sounds like a good idea in general. Maybe perf could make use of this feature when on GNU make? I don't think we build with anything but gnu make, mind sending a patch implementing your suggestion?
Re: [PATCH] perf: Let O= makes handle relative paths
On Mon, Aug 13, 2012 at 03:02:49PM -0300, Arnaldo Carvalho de Melo wrote: [acme@sandy linux]$ rm -rf ../build/perf [acme@sandy linux]$ make -j8 -C tools/perf/ LIBUNWIND_DIR=/opt/libunwind O=/home/acme/git/build/perf install /bin/sh: line 0: cd: /home/acme/git/build/perf: No such file or directory make: Entering directory `/home/git/linux/tools/perf' GEN perf-archive GEN /home/git/linux/tools/perf/python/perf.so make[1]: Entering directory `/home/git/linux/tools/lib/traceevent' make[2]: warning: jobserver unavailable: using -j1. Add `+' to parent make rule. * new build flags or cross compiler CC /home/git/linux/tools/perf/perf.o CC /home/git/linux/tools/perf/builtin-annotate.o CC /home/git/linux/tools/perf/builtin-bench.o CC /home/git/linux/tools/perf/bench/sched-messaging.o CC /home/git/linux/tools/perf/bench/sched-pipe.o I.e. it should stop if the O= provided directory is not there. Why stop? Don't we want to make the directory instead and continue building in there? -- Regards/Gruss, Boris. Advanced Micro Devices GmbH Einsteinring 24, 85609 Dornach GM: Alberto Bozzo Reg: Dornach, Landkreis Muenchen HRB Nr. 43632 WEEE Registernr: 129 19551
Re: [PATCH 1/2] mtd: lpc32xx_slc: Adjust to pl08x DMA interface changes
On Thu, 2012-07-12 at 14:22 +0200, Roland Stigge wrote: This patch adjusts the LPC32xx SLC NAND driver to the new pl08x DMA interface, fixing the compile error resulting from changed pl08x structures. Signed-off-by: Roland Stigge sti...@antcom.de This patch breaks compilation: ERROR: pl08x_filter_id [drivers/mtd/nand/lpc32xx_slc.ko] undefined! Please, send a fix. The defconfig I used is attached. -- Best Regards, Artem Bityutskiy CONFIG_EXPERIMENTAL=y CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_BLK_DEV_INITRD=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL_SYSCALL=y CONFIG_EMBEDDED=y CONFIG_SLAB=y CONFIG_JUMP_LABEL=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_BLK_DEV_BSG is not set CONFIG_PARTITION_ADVANCED=y CONFIG_ARCH_LPC32XX=y CONFIG_PREEMPT=y CONFIG_AEABI=y CONFIG_ZBOOT_ROM_TEXT=0x0 CONFIG_ZBOOT_ROM_BSS=0x0 CONFIG_ARM_APPENDED_DTB=y CONFIG_ARM_ATAG_DTB_COMPAT=y CONFIG_CMDLINE=console=ttyS0,115200n81 root=/dev/ram0 CONFIG_CPU_IDLE=y CONFIG_FPE_NWFPE=y CONFIG_VFP=y # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set CONFIG_BINFMT_AOUT=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_PNP=y CONFIG_IP_PNP_DHCP=y CONFIG_IP_PNP_BOOTP=y # CONFIG_INET_XFRM_MODE_TRANSPORT is not set # CONFIG_INET_XFRM_MODE_TUNNEL is not set # CONFIG_INET_XFRM_MODE_BEET is not set # CONFIG_INET_LRO is not set # CONFIG_INET_DIAG is not set CONFIG_IPV6=y CONFIG_IPV6_PRIVACY=y # CONFIG_WIRELESS is not set CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug # CONFIG_FW_LOADER is not set CONFIG_MTD=m CONFIG_MTD_TESTS=m CONFIG_MTD_REDBOOT_PARTS=m CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=1 CONFIG_MTD_AFS_PARTS=m CONFIG_MTD_AR7_PARTS=m CONFIG_MTD_CHAR=m CONFIG_MTD_BLOCK=m CONFIG_FTL=m CONFIG_NFTL=m CONFIG_NFTL_RW=y CONFIG_INFTL=m CONFIG_RFD_FTL=m CONFIG_SSFDC=m CONFIG_SM_FTL=m CONFIG_MTD_OOPS=m CONFIG_MTD_SWAP=m CONFIG_MTD_CFI=m 
CONFIG_MTD_JEDECPROBE=m CONFIG_MTD_CFI_INTELEXT=m CONFIG_MTD_CFI_AMDSTD=m CONFIG_MTD_CFI_STAA=m CONFIG_MTD_ROM=m CONFIG_MTD_ABSENT=m CONFIG_MTD_COMPLEX_MAPPINGS=y CONFIG_MTD_PHYSMAP=m CONFIG_MTD_PHYSMAP_COMPAT=y CONFIG_MTD_PHYSMAP_OF=m CONFIG_MTD_IMPA7=m CONFIG_MTD_GPIO_ADDR=m CONFIG_MTD_PLATRAM=m CONFIG_MTD_LATCH_ADDR=m CONFIG_MTD_DATAFLASH=m CONFIG_MTD_DATAFLASH_WRITE_VERIFY=y CONFIG_MTD_DATAFLASH_OTP=y CONFIG_MTD_M25P80=m # CONFIG_M25PXX_USE_FAST_READ is not set CONFIG_MTD_SST25L=m CONFIG_MTD_SLRAM=m CONFIG_MTD_PHRAM=m CONFIG_MTD_MTDRAM=m CONFIG_MTD_BLOCK2MTD=m CONFIG_MTD_DOC2000=m CONFIG_MTD_DOC2001=m CONFIG_MTD_DOC2001PLUS=m CONFIG_MTD_DOCG3=m CONFIG_MTD_DOCPROBE_ADVANCED=y CONFIG_MTD_NAND_ECC_SMC=y CONFIG_MTD_NAND=m CONFIG_MTD_NAND_MUSEUM_IDS=y CONFIG_MTD_NAND_GPIO=m CONFIG_MTD_NAND_DISKONCHIP=m CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADVANCED=y CONFIG_MTD_NAND_DISKONCHIP_BBTWRITE=y CONFIG_MTD_NAND_DOCG4=m CONFIG_MTD_NAND_SLC_LPC32XX=m CONFIG_MTD_NAND_MLC_LPC32XX=m CONFIG_MTD_NAND_NANDSIM=m CONFIG_MTD_NAND_PLATFORM=m CONFIG_MTD_ALAUDA=m CONFIG_MTD_ONENAND=m CONFIG_MTD_ONENAND_VERIFY_WRITE=y CONFIG_MTD_ONENAND_GENERIC=m CONFIG_MTD_ONENAND_OTP=y CONFIG_MTD_ONENAND_2X_PROGRAM=y CONFIG_MTD_ONENAND_SIM=m CONFIG_MTD_LPDDR=m CONFIG_MTD_UBI=m CONFIG_MTD_UBI_GLUEBI=m CONFIG_BLK_DEV_LOOP=y CONFIG_BLK_DEV_CRYPTOLOOP=y CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=1 CONFIG_BLK_DEV_RAM_SIZE=16384 CONFIG_EEPROM_AT25=y CONFIG_SCSI=y CONFIG_BLK_DEV_SD=y CONFIG_NETDEVICES=y CONFIG_MII=y # CONFIG_NET_VENDOR_BROADCOM is not set # CONFIG_NET_VENDOR_CHELSIO is not set # CONFIG_NET_VENDOR_CIRRUS is not set # CONFIG_NET_VENDOR_FARADAY is not set # CONFIG_NET_VENDOR_INTEL is not set # CONFIG_NET_VENDOR_MARVELL is not set # CONFIG_NET_VENDOR_MICREL is not set # CONFIG_NET_VENDOR_MICROCHIP is not set # CONFIG_NET_VENDOR_NATSEMI is not set CONFIG_LPC_ENET=y # CONFIG_NET_VENDOR_SEEQ is not set # CONFIG_NET_VENDOR_SMSC is not set # CONFIG_NET_VENDOR_STMICRO is not set CONFIG_SMSC_PHY=y # 
CONFIG_WLAN is not set # CONFIG_INPUT_MOUSEDEV_PSAUX is not set CONFIG_INPUT_MOUSEDEV_SCREEN_X=240 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=320 CONFIG_INPUT_EVDEV=y # CONFIG_INPUT_MOUSE is not set CONFIG_INPUT_TOUCHSCREEN=y CONFIG_TOUCHSCREEN_LPC32XX=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y # CONFIG_HW_RANDOM is not set CONFIG_I2C=y CONFIG_I2C_CHARDEV=y CONFIG_I2C_PNX=y CONFIG_SPI=y CONFIG_SPI_PL022=y CONFIG_GPIO_SYSFS=y # CONFIG_HWMON is not set CONFIG_WATCHDOG=y CONFIG_PNX4008_WATCHDOG=y CONFIG_FB=y CONFIG_FB_ARMCLCD=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y CONFIG_LOGO=y # CONFIG_LOGO_LINUX_MONO is not set # CONFIG_LOGO_LINUX_VGA16 is not set CONFIG_SOUND=y CONFIG_SND=y CONFIG_SND_SEQUENCER=y CONFIG_SND_MIXER_OSS=y CONFIG_SND_PCM_OSS=y CONFIG_SND_SEQUENCER_OSS=y # CONFIG_SND_SUPPORT_OLD_API is not set # CONFIG_SND_VERBOSE_PROCFS is not set CONFIG_SND_DEBUG=y CONFIG_SND_DEBUG_VERBOSE=y # CONFIG_SND_DRIVERS is not
Re: [perf] make clean problematic bashism
On 08/15/2012 12:26 PM, Peter Zijlstra wrote: On Wed, 2012-08-15 at 11:52 +0200, Wouter M. Koolen wrote: Dear perf maintainers, I attempted to compile perf 3.5.1 without worrying about installing dependencies first. The resulting error messages were quite helpful, and led me to install a bunch of development libraries and flex. Unfortunately, after installing flex the build still failed, even after make clean. The reason for this was a bunch of generated empty flex files in util/ that were not removed by make clean. They are intended to be erased, since the Makefile executes rm -f util/*-{bison,flex}* however, this command does not remove the files. I guess because {,} alternatives are only special in bash but the makefile is run with some other shell? ISTR us getting a number of such patches, did we miss a site, acme? I got perf to compile now, but thought you would be interested to know about this little problem. With kind regards, Wouter Koolen PS: as a side note: GNU make has the .DELETE_ON_ERROR: special target, which removes the target file when its generating command fails. This would have prevented my problem and sounds like a good idea in general. Maybe perf could make use of this feature when on GNU make? I don't think we build with anything but gnu make, mind sending a patch implementing your suggestion?

Hi Peter, Some more information: my system has /bin/sh set to dash. I remember a question about this during Debian installation. I guess Ubuntu does something similar viz. https://lkml.org/lkml/2012/5/4/90 Patch attached :) With kind regards, Wouter

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 0eee64c..29b2373 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -1,3 +1,5 @@
+.DELETE_ON_ERROR:
+
 include ../scripts/Makefile.include

 # The default target of this Makefile is...
Re: [PATCH 0/4] promote zcache from staging
On Fri, Aug 10, 2012 at 01:14:01PM -0500, Seth Jennings wrote: On 08/09/2012 03:20 PM, Dan Magenheimer wrote I also wonder if you have anything else unusual in your test setup, such as a fast swap disk (mine is a partition on the same rotating disk as source and target of the kernel build, the default install for a RHEL6 system)? I'm using a normal SATA HDD with two partitions, one for swap and the other an ext3 filesystem with the kernel source. Or have you disabled cleancache? Yes, I _did_ disable cleancache. I could see where having cleancache enabled could explain the difference in results. Why did you disable the cleancache? Having both (cleancache to compress fs data) and frontswap (to compress swap data) is the goal - while you turned one of its sources off.
Re: [PATCH 0/7] zram/zsmalloc promotion
On Tue, Aug 14, 2012 at 12:39:25PM -0500, Seth Jennings wrote: On 08/14/2012 12:36 AM, Nitin Gupta wrote: On 08/13/2012 07:35 PM, Greg Kroah-Hartman wrote: On Wed, Aug 08, 2012 at 03:12:13PM +0900, Minchan Kim wrote: This patchset promotes zram/zsmalloc from staging. Both are very clean and zram is used by many embedded product for a long time. [1-3] are patches not merged into linux-next yet but needed it as base for [4-5] which promotes zsmalloc. Greg, if you merged [1-3] already, skip them. I've applied 1-3 and now 4, but that's it, I can't apply the rest without getting acks from the -mm maintainers, sorry. Please work with them to get those acks, and then I will be glad to apply the rest (after you resend them of course...) On a second thought, I think zsmalloc should stay in drivers/block/zram since zram is now the only user of zsmalloc since zcache and ramster are moving to another allocator. The removal of zsmalloc from zcache has not been agreed upon yet. nods Dan _suggested_ removing zsmalloc as the persistent allocator for zcache in favor of zbud to solve flaws in zcache. However, zbud has large deficiencies. A zero-filled 4k page will compress with LZO to 103 bytes. zbud can only store two compressed pages in each memory pool page, resulting in 95% fragmentation (i.e. 95% of the memory pool page goes unused). While this might not be a typical case, it is the worst case and absolutely does happen. zbud's design also effectively limits the useful page compression to 50%. If pages are compressed beyond that, the added space savings is lost in memory pool fragmentation. For example, if two pages compress to 30% of their original size, those two pages take up 60% of the zbud memory pool page, and 40% is lost to fragmentation because zbud can't store anything in the remaining space. To say it another way, for every two page cache pages that cleancache stores in zcache, zbud _must_ allocate a memory pool page, regardless of how well those pages compress. 
This reduces the efficiency of the page cache reclaim mechanism by half. I have posted some work (zsmalloc shrinker interface, user registered alloc/free functions for the zsmalloc memory pool) that begins to make zsmalloc a suitable replacement for zbud, but that work was put on hold until the path out of staging was established. I'm hoping to continue this work once the code is in mainline. While zbud has deficiencies, it doesn't prevent zcache from having value as I have already demonstrated. However, replacing zsmalloc with zbud would be a step backward for the reasons mentioned above.
What would be nice is having only one engine instead of two, and I believe that is what you and Dan are aiming at. Dan is looking at it from the perspective of re-engineering zcache to use an LRU for keeping track of pages and pushing those to the compression engine, and redoing the zbud engine a bit (I think, let me double-check the git tree he pointed out).
I do not support the removal of zsmalloc from zcache. As such, I think the zsmalloc code should remain independent.
Seth
-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
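Seth's fragmentation figures can be checked with a few lines of arithmetic. This is a minimal sketch, not code from zcache or zbud: the function name is hypothetical, and it assumes a 4096-byte pool page holding exactly two compressed pages, as in his description.

```python
PAGE_SIZE = 4096  # bytes in one memory pool page (the 4k page from the example)


def zbud_unused_fraction(compressed_a, compressed_b, page_size=PAGE_SIZE):
    """Fraction of a pool page left unused when it holds exactly two
    compressed pages and nothing else can be stored in the remainder."""
    used = compressed_a + compressed_b
    if used > page_size:
        raise ValueError("two compressed pages do not fit in one pool page")
    return (page_size - used) / page_size


# Worst case from the thread: zero-filled pages that LZO compresses to 103 bytes.
zero_filled = zbud_unused_fraction(103, 103)

# Two pages that each compress to 30% of their original size.
thirty_percent = zbud_unused_fraction(0.3 * PAGE_SIZE, 0.3 * PAGE_SIZE)

print(f"zero-filled pair: {zero_filled:.1%} of the pool page unused")
print(f"30% pair: {thirty_percent:.1%} of the pool page unused")
```

Under these assumptions the first case comes out at roughly 95% unused and the second at roughly 40%, matching the numbers quoted in the message.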
Re: [PATCH v5 00/12] KVM: introduce readonly memslot
On 08/14/2012 06:51 PM, Marcelo Tosatti wrote: Userspace may want to modify the ROM (for example, when programming a flash device). It is also possible to map an hva range rw through one slot and ro through another. Right, can do that with multiple userspace maps to the same anonymous memory region (see other email). Yes it's possible. It requires that we move all memory allocation to be fd based, since userspace can't predict what memory will be dual-mapped (at least if emulated hardware allows this). Is this a reasonable requirement? Do ksm/thp/autonuma work with this? -- error compiling committee.c: too many arguments to function
Re: [PATCH v2 06/11] memcg: kmem controller infrastructure
On 08/15/2012 01:42 PM, Glauber Costa wrote:
Also, as I have mentioned in the other email in this thread: why should we reclaim just because of kernel allocation when we are not reclaiming any of it, because shrink_slab is ignored in the memcg reclaim?
Don't get too distracted by the fact that shrink_slab is ignored. It is temporary, and while ignoring it now leads to suboptimal behavior, it will, first, only affect its users and, second, not be disastrous. I see this as more or less on par with the soft limit reclaim problem we had. It is not ideal, but it already provided functionality
Okay, I sent the e-mail before finishing it... duh
What I meant in this last sentence is that the situation while the memcg-aware shrinkers haven't landed in the kernel is more or less the same (obviously not exactly) as with the soft reclaim work. It is an evolutionary approach that provides some functionality that is not yet perfect but already solves lots of problems for people willing to live with its temporary drawbacks.
Re: [PATCH 7/7] perf gtk/browser: Use hist_period_print functions
On Mon, 6 Aug 2012, Namhyung Kim wrote: Now we can support color using pango markup with this change. Cc: Pekka Enberg penb...@kernel.org Signed-off-by: Namhyung Kim namhy...@kernel.org Awesome! Acked-by: Pekka Enberg penb...@kernel.org
Re: [net PATCH v3 1/3] net: netprio: fix files lock and remove useless d_path bits
On Tue, Aug 14, 2012 at 03:34:24PM -0700, John Fastabend wrote:
Add lock to prevent a race with a file closing and also remove useless and ugly sscanf code. The extra code was never needed and the case it supposedly protected against is in fact handled correctly by sock_from_file as pointed out by Al Viro.
CC: Neil Horman nhor...@tuxdriver.com
Reported-by: Al Viro v...@zeniv.linux.org.uk
Signed-off-by: John Fastabend john.r.fastab...@intel.com
---
 net/core/netprio_cgroup.c | 22 ++++------------------
 1 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index ed0c043..f65dba3 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -277,12 +277,6 @@ out_free_devname:
 void net_prio_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
-	char *tmp = kzalloc(sizeof(char) * PATH_MAX, GFP_KERNEL);
-
-	if (!tmp) {
-		pr_warn("Unable to attach cgrp due to alloc failure!\n");
-		return;
-	}
 
 	cgroup_taskset_for_each(p, cgrp, tset) {
 		unsigned int fd;
@@ -296,32 +290,24 @@ void net_prio_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 			continue;
 		}
 
-		rcu_read_lock();
+		spin_lock(&files->file_lock);
 		fdt = files_fdtable(files);
 		for (fd = 0; fd < fdt->max_fds; fd++) {
-			char *path;
 			struct file *file;
 			struct socket *sock;
-			unsigned long s;
-			int rv, err = 0;
+			int err;
 
 			file = fcheck_files(files, fd);
 			if (!file)
 				continue;
 
-			path = d_path(&file->f_path, tmp, PAGE_SIZE);
-			rv = sscanf(path, "socket:[%lu]", &s);
-			if (rv <= 0)
-				continue;
-
 			sock = sock_from_file(file, &err);
-			if (!err)
+			if (sock)
 				sock_update_netprioidx(sock->sk, p);
 		}
-		rcu_read_unlock();
+		spin_unlock(&files->file_lock);
 		task_unlock(p);
 	}
-	kfree(tmp);
 }
 
 static struct cftype ss_files[] = {

Acked-by: Neil Horman nhor...@tuxdriver.com
It looks good to me. Al, could you please lend your review here too?
Re: [net PATCH v3 2/3] net: netprio: fd passed in SCM_RIGHTS datagram not set correctly
On Tue, Aug 14, 2012 at 03:34:30PM -0700, John Fastabend wrote:
A socket fd passed in a SCM_RIGHTS datagram was not getting updated with the new tasks cgrp prioidx. This leaves IO on the socket tagged with the old tasks priority. To fix this add a check in the scm recvmsg path to update the sock cgrp prioidx with the new tasks value. Thanks to Al Viro for catching this.
CC: Neil Horman nhor...@tuxdriver.com
Reported-by: Al Viro v...@zeniv.linux.org.uk
Signed-off-by: John Fastabend john.r.fastab...@intel.com
---
 net/core/scm.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/core/scm.c b/net/core/scm.c
index 8f6ccfd..040cebe 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -265,6 +265,7 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
 	for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i<fdmax;
 	     i++, cmfptr++)
 	{
+		struct socket *sock;
 		int new_fd;
 		err = security_file_receive(fp[i]);
 		if (err)
@@ -281,6 +282,9 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
 		}
 		/* Bump the usage count and install the file. */
 		get_file(fp[i]);
+		sock = sock_from_file(fp[i], &err);
+		if (sock)
+			sock_update_netprioidx(sock->sk, current);
 		fd_install(new_fd, fp[i]);
 	}
 

Acked-by: Neil Horman nhor...@tuxdriver.com
Re: [net PATCH v3 3/3] net: netprio: fix cgrp create and write priomap race
On Tue, Aug 14, 2012 at 03:34:35PM -0700, John Fastabend wrote:
A race exists where creating cgroups and also updating the priomap may result in losing a priomap update. This is because priomap writers are not protected by rtnl_lock. Move priority writer into rtnl_lock()/rtnl_unlock().
CC: Neil Horman nhor...@tuxdriver.com
Reported-by: Al Viro v...@zeniv.linux.org.uk
Signed-off-by: John Fastabend john.r.fastab...@intel.com
---
 net/core/netprio_cgroup.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index f65dba3..c75e3f9 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -101,12 +101,10 @@ static int write_update_netdev_table(struct net_device *dev)
 	u32 max_len;
 	struct netprio_map *map;
 
-	rtnl_lock();
 	max_len = atomic_read(&max_prioidx) + 1;
 	map = rtnl_dereference(dev->priomap);
 	if (!map || map->priomap_len < max_len)
 		ret = extend_netdev_table(dev, max_len);
-	rtnl_unlock();
 
 	return ret;
 }
@@ -256,17 +254,17 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 	if (!dev)
 		goto out_free_devname;
 
+	rtnl_lock();
 	ret = write_update_netdev_table(dev);
 	if (ret < 0)
 		goto out_put_dev;
 
-	rcu_read_lock();
-	map = rcu_dereference(dev->priomap);
+	map = rtnl_dereference(dev->priomap);
 	if (map)
 		map->priomap[prioidx] = priority;
-	rcu_read_unlock();
 
 out_put_dev:
+	rtnl_unlock();
 	dev_put(dev);
 
 out_free_devname:

Acked-by: Neil Horman nhor...@tuxdriver.com
Re: [PATCH 0/2] Align MIPS swapper_pg_dir for faster code.
On Tue, Aug 14, 2012 at 11:07:59AM -0700, David Daney wrote:
From: David Daney david.da...@cavium.com
The MIPS swapper_pg_dir needs 64K alignment for faster TLB refills in kernel mappings. There are two parts to the patch set:
1) Modify generic vmlinux.lds.h to allow architectures to place additional sections at the start of .bss. This allows alignment constraints to be met with minimal holes added for padding. Putting this in common code should reduce the risk of future changes to the linker scripts not being propagated to MIPS (or any other architecture that needs something like this).
2) Align the MIPS swapper_pg_dir.
Since the initial use of the code is for MIPS, perhaps both parts could be merged by Ralf's tree (after collecting any Acked-bys).
Looks good to me but will wait a bit longer for comments and (N)Acks before merging.
Ralf
Re: [PATCHSET] timer: clean up initializers and implement irqsafe timers
On Tue, 2012-08-14 at 17:18 -0700, Tejun Heo wrote:
Let's see if we can agree on the latter point first. Do you agree that it wouldn't be a good idea to implement a relatively complex timer subsystem inside workqueue?
RB-trees are fairly trivial to use, but can we please get back to why people want to do del/mod delayed work from IRQ context? I can get the queueing part, but why do they need to cancel and/or modify stuff? Trying to come up with a solution to a problem you don't understand is kinda difficult.
Re: [PATCH] crypto: twofish - add x86_64/avx assembler implementation
Quoting Borislav Petkov b...@alien8.de:
On Wed, Aug 15, 2012 at 11:42:16AM +0300, Jussi Kivilinna wrote:
I started thinking about the performance on AMD Bulldozer. vmovq/vmovd/vpextr*/vpinsr* between FPU and general purpose registers on AMD CPUs are a lot slower (latencies from 8 to 12 cycles) than on Intel Sandy Bridge (where the instructions have latencies of 1 to 2 cycles). See: http://www.agner.org/optimize/instruction_tables.pdf
It would be really good if the implementation could be tested on an AMD CPU to determine whether it causes a performance regression. However, I don't have access to a machine with such a CPU.
But I do. :) And if you tell me exactly how to run the tests and on what kernel, I'll try to do so.
Twofish-avx (CONFIG_TWOFISH_AVX_X86_64) is available in 3.6-rc1. For testing you need CRYPTO_TEST built as a module. You should turn off turbo-core, freq-scaling, etc.
Testing twofish-avx ('async twofish' speed test):
  modprobe twofish-avx-x86_64
  modprobe tcrypt mode=504 sec=1
Testing twofish-x86_64-3way ('sync twofish' speed test):
  modprobe twofish-x86_64-3way
  modprobe tcrypt mode=202 sec=1
Loading tcrypt will block until the tests are complete, after which modprobe will return with an error. This is expected. Results are in the kernel log.
-Jussi
HTH.
-- Regards/Gruss, Boris.
[PATCH] scripts/decodecode: Fixup trapping instruction marker
From: Borislav Petkov borislav.pet...@amd.com

When dumping "Code: " sections from an oops, the trapping instruction %rip points to can be a string copy

  2b:* f3 a5 rep movsl %ds:(%rsi),%es:(%rdi)

and the line contains a bunch of ":". The current "cut" selects only the second field, and the output looks funnily overlaid like this:

  2b:* f3 a5 rep movsl %ds <-- trapping instruction:(%rsi),%es:(%rdi

Fix this by selecting the remaining fields too.

Cc: Andrew Morton a...@linux-foundation.org
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: linux-kbu...@vger.kernel.org
Signed-off-by: Borislav Petkov borislav.pet...@amd.com
---
 scripts/decodecode | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/decodecode b/scripts/decodecode
index 18ba881c3415..4f8248d5a11f 100755
--- a/scripts/decodecode
+++ b/scripts/decodecode
@@ -89,7 +89,7 @@ echo $code >> $T.s
 disas $T
 cat $T.dis >> $T.aa
 
-faultline=`cat $T.dis | head -1 | cut -d":" -f2`
+faultline=`cat $T.dis | head -1 | cut -d":" -f2-`
 faultline=`echo "$faultline" | sed -e 's/\[/\\\[/g; s/\]/\\\]/g'`
 
 cat $T.oo | sed -e "s/\($faultline\)/\*\1     <-- trapping instruction/g"
-- 
1.7.11.rc1
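The difference between the two field selections is easy to see on the example line. The following is a rough sketch, not part of the patch: cut_fields is a hypothetical helper that only approximates the behaviour of cut for a single field or an open-ended range, and the tab spacing in the sample line is assumed.

```python
def cut_fields(line, start, to_end=False, delim=":"):
    """Rough emulation of `cut -d<delim> -f<start>`, or `-f<start>-` when
    to_end is True. Only these two forms are handled."""
    if delim not in line:
        return line  # cut passes lines without the delimiter through unchanged
    fields = line.split(delim)
    selected = fields[start - 1:] if to_end else fields[start - 1:start]
    return delim.join(selected)


# Disassembly line for the trapping instruction (whitespace is an assumption).
dis_line = "  2b:*\tf3 a5\trep movsl %ds:(%rsi),%es:(%rdi)"

old = cut_fields(dis_line, 2)               # like: cut -d: -f2
new = cut_fields(dis_line, 2, to_end=True)  # like: cut -d: -f2-

print(repr(old))
print(repr(new))
```

With only the second field selected, the operands after "%ds" are dropped, which is why the marker ended up in the middle of the instruction; selecting the second field onwards keeps the whole instruction text.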
Re: [discussion]sched: a rough proposal to enable power saving in scheduler
On Mon, 2012-08-13 at 20:21 +0800, Alex Shi wrote: Since there is no power saving consideration in scheduler CFS, I has a very rough idea for enabling a new power saving schema in CFS. Adding Thomas, he always delights poking holes in power schemes. It bases on the following assumption: 1, If there are many task crowd in system, just let few domain cpus running and let other cpus idle can not save power. Let all cpu take the load, finish tasks early, and then get into idle. will save more power and have better user experience. I'm not sure this is a valid assumption. I've had it explained to me by various people that race-to-idle isn't always the best thing. It has to do with the cost of switching power states and the duration of execution and other such things. 2, schedule domain, schedule group perfect match the hardware, and the power consumption unit. So, pull tasks out of a domain means potentially this power consumption unit idle. I'm not sure I understand what you're saying, sorry. So, according Peter mentioned in commit 8e7fbcbc22c(sched: Remove stale power aware scheduling), this proposal will adopt the sched_balance_policy concept and use 2 kind of policy: performance, power. Yay, ideally we'd also provide a 3rd option: auto, which simply switches between the two based on AC/BAT, UPS status and simple things like that. But this seems like a later concern, you have to have something to pick between before you can pick :-) And in scheduling, 2 place will care the policy, load_balance() and in task fork/exec: select_task_rq_fair(). ack Here is some pseudo code try to explain the proposal behaviour in load_balance() and select_task_rq_fair(); Oh man.. A few words outlining the general idea would've been nice. load_balance() { update_sd_lb_stats(); //get busiest group, idlest group data. 
	if (sd->nr_running > sd's capacity) {
		//power saving policy is not suitable for
		//this scenario, it runs like performance policy
		mv tasks from busiest cpu in busiest group to idlest cpu in idlest group;

Once upon a time we talked about adding a factor to the capacity for this. So say you'd allow 2*capacity before overflowing and waking another power group. But I think we should not go on nr_running here, PJT's per-entity load tracking stuff gives us much better measures -- also, repost that series already Paul! :-) Also, I'm not sure this is entirely correct: the thing you want to do for power aware stuff is to minimize the number of active power domains, which means you don't want idlest, you want least busy non-idle.

	} else {// the sd has enough capacity to hold all tasks.
		if (sg->nr_running > sg's capacity) {
			//imbalanced between groups
			if (schedule policy == performance) {
				//when 2 busiest group at same busy
				//degree, need to prefer the one has
				// softest group??
				move tasks from busiest group to idlest group;

So I'd leave the currently implemented scheme as performance, and I don't think the above describes the current state.

			} else if (schedule policy == power)
				move tasks from busiest group to idlest group until busiest is just full of capacity.
				//the busiest group can balance
				//internally after next time LB,

There's another thing we need to do, and that is collect tasks in a minimal amount of power domains. The old code (that got deleted) did something like that, you can revive some of that code if needed -- I just killed everything to be able to start with a clean slate.

		} else {
			//all groups have enough capacity for their tasks.
			if (schedule policy == performance)
				//all tasks may have enough cpu
				//resources to run,
				//mv tasks from busiest to idlest group?
				//no, at this time, it's better to keep
				//the task on current cpu.
				//so, it is maybe better to do balance
				//in each of groups
				for_each_imbalance_groups()
					move tasks from busiest cpu to idlest cpu in each of groups;
			else if (schedule policy == power) {
				if (no hard pin in idlest group)
					mv tasks from idlest group to busiest until busiest
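Peter's objection to moving tasks toward the idlest group under a power policy can be illustrated with a toy model. This is only a sketch under stated assumptions, not scheduler code: Group, pick_target and the policy strings are hypothetical, load is approximated by nr_running/capacity (which Peter argues against as a real metric), and the power branch follows his "least busy non-idle" wording literally.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    nr_running: int
    capacity: int

    @property
    def load(self):
        return self.nr_running / self.capacity


def pick_target(groups, policy):
    """Pick the group that should receive tasks pulled from the busiest group."""
    if policy == "performance":
        # Spread: the idlest group gets the work.
        return min(groups, key=lambda g: g.load)
    if policy == "power":
        # Pack: prefer a group that is already awake and still has room,
        # so that idle groups (power domains) can stay idle.
        awake = [g for g in groups if 0 < g.nr_running < g.capacity]
        if awake:
            return min(awake, key=lambda g: g.load)
        # Nothing awake has room left, so an idle group must be woken.
        return min(groups, key=lambda g: g.load)
    raise ValueError(f"unknown policy: {policy}")


groups = [Group("A", 4, 4), Group("B", 1, 4), Group("C", 0, 4)]
print(pick_target(groups, "performance").name)
print(pick_target(groups, "power").name)
```

In this toy setup the performance policy wakes the idle group C, while the power policy keeps C asleep and fills B first.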
Re: [PATCH v2 04/11] kmem accounting basic infrastructure
On Wed, 2012-08-15 at 13:33 +0400, Glauber Costa wrote:
This can be quite confusing. I am still not sure whether we should mix the two things together. If somebody wants to limit the kernel memory he has to touch the other limit anyway. Do you have a strong reason to mix the user and kernel counters?
This is funny, because the first opposition I found to this work was "Why would anyone want to limit it separately?" =p It seems that a quite common use case is to have a container with a unified view of memory that it can use the way it likes, be it with kernel memory or user memory. I believe those people would be happy to just silently account kernel memory to user memory, or at the most have a switch to enable it. What becomes clear from this back and forth is that there are people interested in both use cases.
Haven't we already had this discussion during the Prague get-together? We discussed the use cases and finally agreed to separate accounting for k and then k+u mem because that satisfies both the Google and Parallels cases. No-one was overjoyed by k and k+u but no-one had a better suggestion ... is there a better way of doing this that everyone can agree to? We do need to get this nailed down because it's the foundation of the patch series.
James
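The "k and then k+u" scheme James refers to can be pictured as two counters, where a kernel allocation is charged to both and a user allocation only to the combined one. The following is a minimal sketch of that reading, not the memcg implementation: the class, method and field names are all hypothetical.

```python
class MemCounters:
    """Toy model of separate kernel (k) and combined (k+u) accounting."""

    def __init__(self, kmem_limit, total_limit):
        self.kmem_limit = kmem_limit    # limit on kernel memory alone (k)
        self.total_limit = total_limit  # limit on kernel + user memory (k+u)
        self.kmem = 0
        self.total = 0

    def charge_user(self, nbytes):
        # User memory counts only against the combined limit.
        if self.total + nbytes > self.total_limit:
            return False
        self.total += nbytes
        return True

    def charge_kernel(self, nbytes):
        # Kernel memory must fit under both limits and is charged to both.
        if self.kmem + nbytes > self.kmem_limit:
            return False
        if self.total + nbytes > self.total_limit:
            return False
        self.kmem += nbytes
        self.total += nbytes
        return True


c = MemCounters(kmem_limit=10, total_limit=100)
results = [
    c.charge_kernel(8),  # fits under both limits
    c.charge_kernel(4),  # would exceed the kernel-only limit
    c.charge_user(90),   # fits under the combined limit
    c.charge_user(5),    # would exceed the combined limit
]
print(results, c.kmem, c.total)
```

In this model, setting the kernel limit at or above the combined limit gives the "unified view" use case, while a smaller kernel limit gives the separate-limit use case, which is one way to see why the scheme was said to satisfy both camps.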
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Wed, Aug 15, 2012 at 01:01:08PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:48:39AM +0100, Mel Gorman wrote: On Wed, Aug 15, 2012 at 12:25:28PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:05:28AM +0100, Mel Gorman wrote: On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote: On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote: What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight. I liked your suggestion on barriers, as well. I have not thought about this as deeply as I shouold but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page-mapping. To clarify, are you concerned about cost of rcu_read_lock for non balloon pages? Not as such, but given the choice between introducing RCU locking and rechecking page-mapping under a spinlock I would choose the latter as it is more straight-forward. OK but checking it how? page-mapping == balloon_mapping does not scale to multiple balloons, I was thinking of exactly that page-mapping == balloon_mapping check. As I do not know how many active balloon drivers there might be I cannot guess in advance how much of a scalability problem it will be. so I hoped we can switch to page-mapping-flags BALLOON_MAPPING or some such, but this means we dereference it outside the lock ... That also sounded like future stuff to me that would be justified with profiling if necessary. Personally I would have started with the spinlock and a simple check and moved to RCU later when either scalability was a problem or it was found there was a need to stabilise whether a page was a balloon page or not outside a spinlock. This is not a NAK to the idea and I'm not objecting to RCU being used now if that is what is really desired. 
I just suspect it's making the series more complex than it needs to be right now. -- Mel Gorman SUSE Labs
[PATCH-V2] arm/dts: AM33XX: Set the default status of module to disabled state
Ideally in common SoC dtsi file should set all modules to disabled state and it should get enabled in respective EVM/Board dts file as per usage. This patch sets default status of all modules to disabled state in am33xx.dtsi file. Currently there are no modules supported as part of Bone and EVM dts support, so care to add entry status = okay while adding support for any module. Signed-off-by: Vaibhav Hiremath hvaib...@ti.com Acked-by: Arnd Bergmann a...@arndb.de Cc: Benoit Cousson b-cous...@ti.com Cc: Grant Likely grant.lik...@secretlab.ca Cc: Tony Lindgren t...@atomide.com --- Changes from V1: - Fixed indentation issue caused due to extra spaces. arch/arm/boot/dts/am335x-bone.dts |6 ++ arch/arm/boot/dts/am335x-evm.dts |6 ++ arch/arm/boot/dts/am33xx.dtsi |9 + 3 files changed, 21 insertions(+), 0 deletions(-) diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone.dts index a9af4db..a7906cb 100644 --- a/arch/arm/boot/dts/am335x-bone.dts +++ b/arch/arm/boot/dts/am335x-bone.dts @@ -17,4 +17,10 @@ device_type = memory; reg = 0x8000 0x1000; /* 256 MB */ }; + + ocp { + uart1: serial@44E09000 { + status = okay; + }; + }; }; diff --git a/arch/arm/boot/dts/am335x-evm.dts b/arch/arm/boot/dts/am335x-evm.dts index d6a97d9..5dd8a6b 100644 --- a/arch/arm/boot/dts/am335x-evm.dts +++ b/arch/arm/boot/dts/am335x-evm.dts @@ -17,4 +17,10 @@ device_type = memory; reg = 0x8000 0x1000; /* 256 MB */ }; + + ocp { + uart1: serial@44E09000 { + status = okay; + }; + }; }; diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi index 59509c4..5f6c8e3 100644 --- a/arch/arm/boot/dts/am33xx.dtsi +++ b/arch/arm/boot/dts/am33xx.dtsi @@ -102,36 +102,42 @@ compatible = ti,omap3-uart; ti,hwmods = uart1; clock-frequency = 4800; + status = disabled; }; uart2: serial@48022000 { compatible = ti,omap3-uart; ti,hwmods = uart2; clock-frequency = 4800; + status = disabled; }; uart3: serial@48024000 { compatible = ti,omap3-uart; ti,hwmods = uart3; clock-frequency = 
4800; + status = disabled; }; uart4: serial@481A6000 { compatible = ti,omap3-uart; ti,hwmods = uart4; clock-frequency = 4800; + status = disabled; }; uart5: serial@481A8000 { compatible = ti,omap3-uart; ti,hwmods = uart5; clock-frequency = 4800; + status = disabled; }; uart6: serial@481AA000 { compatible = ti,omap3-uart; ti,hwmods = uart6; clock-frequency = 4800; + status = disabled; }; i2c1: i2c@44E0B000 { @@ -139,6 +145,7 @@ #address-cells = 1; #size-cells = 0; ti,hwmods = i2c1; + status = disabled; }; i2c2: i2c@4802A000 { @@ -146,6 +153,7 @@ #address-cells = 1; #size-cells = 0; ti,hwmods = i2c2; + status = disabled; }; i2c3: i2c@4819C000 { @@ -153,6 +161,7 @@ #address-cells = 1; #size-cells = 0; ti,hwmods = i2c3; + status = disabled; }; }; }; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v7 2/4] virtio_balloon: introduce migration primitives to balloon pages
On Wed, Aug 15, 2012 at 12:16:51PM +0100, Mel Gorman wrote: On Wed, Aug 15, 2012 at 01:01:08PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:48:39AM +0100, Mel Gorman wrote: On Wed, Aug 15, 2012 at 12:25:28PM +0300, Michael S. Tsirkin wrote: On Wed, Aug 15, 2012 at 10:05:28AM +0100, Mel Gorman wrote: On Tue, Aug 14, 2012 at 05:11:13PM -0300, Rafael Aquini wrote: On Tue, Aug 14, 2012 at 10:51:39PM +0300, Michael S. Tsirkin wrote: What I think you should do is use rcu for access. And here sync rcu before freeing. Maybe an overkill but at least a documented synchronization primitive, and it is very light weight. I liked your suggestion on barriers, as well. I have not thought about this as deeply as I shouold but is simply rechecking the mapping under the pages_lock to make sure the page is still a balloon page an option? i.e. use pages_lock to stabilise page-mapping. To clarify, are you concerned about cost of rcu_read_lock for non balloon pages? Not as such, but given the choice between introducing RCU locking and rechecking page-mapping under a spinlock I would choose the latter as it is more straight-forward. OK but checking it how? page-mapping == balloon_mapping does not scale to multiple balloons, I was thinking of exactly that page-mapping == balloon_mapping check. As I do not know how many active balloon drivers there might be I cannot guess in advance how much of a scalability problem it will be. Not at all sure multiple drivers are worth supporting, but multiple *devices* is I think worth supporting, if for no other reason than that they can work today. For that, we need a device pointer which Rafael wants to put into the mapping, this means multiple balloon mappings. so I hoped we can switch to page-mapping-flags BALLOON_MAPPING or some such, but this means we dereference it outside the lock ... That also sounded like future stuff to me that would be justified with profiling if necessary. 
Personally I would have started with the spinlock and a simple check and moved to RCU later when either scalability was a problem or it was found there was a need to stabilise whether a page was a balloon page or not outside a spinlock. This is not a NAK to the idea and I'm not objecting to RCU being used now if that is what is really desired. I just suspect it's making the series more complex than it needs to be right now. -- Mel Gorman SUSE Labs
[tip:sched/core] tile: Remove SD_PREFER_LOCAL leftover
Commit-ID: c7660994ed6b44d17dad0aac0d156da1e0a2f003
Gitweb: http://git.kernel.org/tip/c7660994ed6b44d17dad0aac0d156da1e0a2f003
Author: Alex Shi alex@intel.com
AuthorDate: Wed, 15 Aug 2012 08:14:36 +0800
Committer: Thomas Gleixner t...@linutronix.de
CommitDate: Wed, 15 Aug 2012 13:22:55 +0200

tile: Remove SD_PREFER_LOCAL leftover

commit (sched: recover SD_WAKE_AFFINE in select_task_rq_fair and code clean up) removed SD_PREFER_LOCAL, but left a SD_PREFER_LOCAL usage in arch/tile code. That breaks the arch/tile build.

Reported-by: Fengguang Wu fengguang...@intel.com
Signed-off-by: Alex Shi alex@intel.com
Acked-by: Peter Zijlstra a.p.zijls...@chello.nl
Link: http://lkml.kernel.org/r/502af3e6.3050...@intel.com
Signed-off-by: Thomas Gleixner t...@linutronix.de
---
 arch/tile/include/asm/topology.h |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/tile/include/asm/topology.h b/arch/tile/include/asm/topology.h
index 7a7ce39..d5e86c9 100644
--- a/arch/tile/include/asm/topology.h
+++ b/arch/tile/include/asm/topology.h
@@ -69,7 +69,6 @@ static inline const struct cpumask *cpumask_of_node(int node)
 	| 1*SD_BALANCE_FORK \
 	| 0*SD_BALANCE_WAKE \
 	| 0*SD_WAKE_AFFINE \
-	| 0*SD_PREFER_LOCAL \
 	| 0*SD_SHARE_CPUPOWER \
 	| 0*SD_SHARE_PKG_RESOURCES \
 	| 0*SD_SERIALIZE \
Re: [Qemu-devel] [PATCH v8] kvm: notify host when the guest is panicked
On Aug 14, 2012, at 10:35 PM, Anthony Liguori wrote: Marcelo Tosatti mtosa...@redhat.com writes: On Tue, Aug 14, 2012 at 01:53:01PM -0500, Anthony Liguori wrote: Marcelo Tosatti mtosa...@redhat.com writes: On Tue, Aug 14, 2012 at 05:55:54PM +0300, Yan Vugenfirer wrote: On Aug 14, 2012, at 1:42 PM, Jan Kiszka wrote: On 2012-08-14 10:56, Daniel P. Berrange wrote: On Mon, Aug 13, 2012 at 03:21:32PM -0300, Marcelo Tosatti wrote: On Wed, Aug 08, 2012 at 10:43:01AM +0800, Wen Congyang wrote: We can know the guest is panicked when the guest runs on xen. But we do not have such a feature on kvm. Another purpose of this feature is: a management app (for example, libvirt) can do an auto dump when the guest is panicked. If the management app does not do an auto dump, the guest's user can dump by hand if he sees that the guest is panicked. We have three solutions to implement this feature: 1. use vmcall 2. use I/O port 3. use virtio-serial. We have decided to avoid touching the hypervisor. The reasons why I chose the I/O port are: 1. it is easier to implement 2. it does not depend on any virtual device 3. it can work when starting the kernel How about searching for the "Kernel panic - not syncing" string in the guest's serial output? Say libvirtd could take an action upon that? No, this is not satisfactory. It depends on the guest OS being configured to use the serial port for console output, which we cannot mandate, since it may well be required for other purposes. Please don't forget Windows guests, there is no console and no "Kernel Panic" string ;) What I used for debugging purposes on a Windows guest is to register a bugcheck callback in the virtio-net driver and write 1 to the VIRTIO_PCI_ISR register. Yan. Whether a panic-device should cover other OSes is also something to consider. Even for Linux, is panic the only case which should be reported via the mechanism? What about oopses without panic? Is the mechanism general enough for supporting new events, etc.?
Hi, I think this discussion has gone off the deep end. Forget about !x86 platforms. They have their own way to do this sort of thing. The panic function in kernel/panic.c has the following options, which appear to be arch independent, on panic: - reboot - blink Not sure of the semantics of blink, but that might be a good place for a pvops hook. None are paravirtual interfaces, however. Think of this feature like a status LED on a motherboard. These are very common and usually controlled by IO ports. We're simply reserving a status LED for the guest to indicate that it has panicked. Let's not over-engineer this. My concern is that you end up with state that is dependent on x86. Subject: [PATCH v8 3/6] add a new runstate: RUN_STATE_GUEST_PANICKED Having the ability to stop/restart the guest (and even introducing a new VM runstate) is more than a status LED analogy. I must admit, I don't know why a new runstate is necessary/useful. The kernel shouldn't have to care about the difference between a halted guest and a panicked guest. That level of information belongs in userspace IMHO. Can this new infrastructure be used by other architectures? I guess I don't understand why the kernel side of this is anything more than a paravirt op hook that does a single outb() with the remaining logic handled 100% in QEMU. Do you consider allowing support for Windows as overengineering? I don't think there is a way to hook BSOD on Windows so attempting to engineer something that works with Windows seems odd, no? Actually there is a way (http://msdn.microsoft.com/en-us/library/windows/hardware/ff553105(v=vs.85).aspx). That's what I just mentioned is already done in the Windows virtio-net driver. Best regards, Yan. Regards, Anthony Liguori Regards, Anthony Liguori Well, we have more than a single serial port, even when leaving virtio-serial aside...
Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
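The I/O-port approach argued for in this thread (a single outb() from the guest's panic path, with the remaining logic handled on the host side) can be sketched as a small userspace mock. Everything named here is an assumption made for illustration: PANIC_PORT, PANIC_EVENT_PANICKED, mock_outb(), host_handle_pio_write() and the run-state enum are hypothetical stand-ins, not the identifiers or values used by the v8 patch or by QEMU.

```c
#include <stdint.h>

/* Hypothetical values; the real patch defines its own port and event codes. */
#define PANIC_PORT            0x505u
#define PANIC_EVENT_PANICKED  0x01u

/* Hypothetical host-side run states, loosely modelled on the discussion. */
enum run_state { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED };

struct mock_vm {
    enum run_state state;
};

/* Host side: what a device model might do when the guest writes to a port. */
static void host_handle_pio_write(struct mock_vm *vm, uint16_t port, uint8_t val)
{
    if (port == PANIC_PORT && (val & PANIC_EVENT_PANICKED))
        vm->state = RUN_STATE_GUEST_PANICKED;
    /* Writes to other ports, or with other values, are ignored in this mock. */
}

/* Guest side: stand-in for outb(); a real guest write would trap to the host. */
static void mock_outb(struct mock_vm *vm, uint8_t val, uint16_t port)
{
    host_handle_pio_write(vm, port, val);
}

/* Guest side: the entire panic hook is one port write. */
static void guest_panic_notify(struct mock_vm *vm)
{
    mock_outb(vm, PANIC_EVENT_PANICKED, PANIC_PORT);
}
```

In this mock, calling guest_panic_notify() on a running VM moves it to RUN_STATE_GUEST_PANICKED, while a write to any other port leaves the state untouched. The point of the design is that the guest side needs no virtual device and no console configuration, which is the argument made for it above.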
Re: [PATCH 0/6][resend] mempolicy memory corruption fixlet
On Mon, Aug 6, 2012 at 3:32 PM, KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com wrote: On 7/31/2012 8:33 AM, Josh Boyer wrote: On Mon, Jun 11, 2012 at 5:17 AM, kosaki.motoh...@gmail.com wrote: From: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com

Hi, These are trivial fixes for mempolicy memory corruption issues. The patches are independent of each other, and they don't change userland ABIs. Thanks.

Changes from v1: fix some typos in the changelogs.

---
KOSAKI Motohiro (6):
  Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages"
  mempolicy: Kill all mempolicy sharing
  mempolicy: fix a race in shared_policy_replace()
  mempolicy: fix refcount leak in mpol_set_shared_policy()
  mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  MAINTAINERS: Added MEMPOLICY entry

 MAINTAINERS    |    7 +++
 mm/mempolicy.c |  151
 mm/shmem.c     |    9 ++--
 3 files changed, 120 insertions(+), 47 deletions(-)

I don't see these patches queued anywhere. They aren't in linux-next, mmotm, or Linus' tree. Did these get dropped? Is the revert still needed?

Sorry, my fault. Yes, it is needed. Currently, some LTP tests fail since Mel's "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" patch.

The series still isn't queued anywhere. Are you planning on resending it again, or should it get picked up in a particular tree?

josh
Re: [PATCH] time: Improve sanity checking of timekeeping inputs
On Wed, Aug 8, 2012 at 3:36 PM, John Stultz john.stu...@linaro.org wrote:

Thomas, Ingo, Here's a fix against tip/timers/urgent that addresses timekeeping edge cases detected by both a bad BIOS and system fuzzing w/ trinity. Thanks to Sasha Levin and CAI Qian for finding and reporting these! Let me know if you have any tweaks you want to see. thanks -john

Unexpected behavior could occur if the time is set to a value large enough to overflow a 64-bit ktime_t (which is something later than the year 2262). Also, unexpected behavior could occur if large negative offsets are injected via adjtimex. So this patch improves the sanity checking of timekeeping inputs by improving the timespec_valid() check, and then makes better use of timespec_valid() to make sure we don't set the time to an invalid negative value or one that overflows ktime_t.

Note: This does not protect from setting the time close to overflowing ktime_t and then letting natural accumulation cause the overflow.

Cc: Ingo Molnar mi...@kernel.org
Cc: Peter Zijlstra a.p.zijls...@chello.nl
Cc: Prarit Bhargava pra...@redhat.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Zhouping Liu z...@redhat.com
Cc: CAI Qian caiq...@redhat.com
Cc: Sasha Levin levinsasha...@gmail.com
Cc: sta...@vger.kernel.org
Reported-by: CAI Qian caiq...@redhat.com
Reported-by: Sasha Levin levinsasha...@gmail.com
Signed-off-by: John Stultz john.stu...@linaro.org
---
 include/linux/ktime.h     |    7 ---
 include/linux/time.h      |   22 --
 kernel/time/timekeeping.c |   26 --
 3 files changed, 44 insertions(+), 11 deletions(-)

This patch fixes a boot regression on machines with crappy BIOS. Is this going to get committed soon? https://bugzilla.redhat.com/show_bug.cgi?id=844249

josh
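The kind of check described in the changelog above can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: struct sketch_timespec, SKETCH_KTIME_SEC_MAX, sketch_timespec_valid() and sketch_timespec_to_ns() are hypothetical names, and the bound assumes a signed 64-bit nanosecond counter. With that assumption the limit is INT64_MAX / 10^9, roughly 9.22 * 10^9 seconds or about 292 years after 1970, which matches the year 2262 mentioned in the changelog.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Assumed bound: whole seconds whose nanosecond total still fits in int64_t. */
#define SKETCH_KTIME_SEC_MAX (INT64_MAX / NSEC_PER_SEC)

/* Hypothetical stand-in for a timespec with fixed-width fields. */
struct sketch_timespec {
    int64_t tv_sec;
    int64_t tv_nsec;
};

static bool sketch_timespec_valid(const struct sketch_timespec *ts)
{
    /* Negative times are rejected outright. */
    if (ts->tv_sec < 0)
        return false;
    /* The nanosecond part must be a proper fraction of a second. */
    if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
        return false;
    /* Reject seconds that would overflow once converted to nanoseconds. */
    if (ts->tv_sec >= SKETCH_KTIME_SEC_MAX)
        return false;
    return true;
}

/* Convert only after validation, so the multiply and add cannot overflow. */
static bool sketch_timespec_to_ns(const struct sketch_timespec *ts, int64_t *ns)
{
    if (!sketch_timespec_valid(ts))
        return false;
    *ns = ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
    return true;
}
```

As the changelog notes, a check of this kind guards only the moment the time is set; a value just under the limit would still overflow later through natural accumulation.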
Re: [PATCH 0/5] Call netif_carrier_off() after register_netdev()
Ben Hutchings bhutchi...@solarflare.com writes:

But if you do it beforehand then it doesn't have the intended effect. (Supposed to be fixed by 22604c866889c4b2e12b73cbf1683bda1b72a313, which had to be reverted: c276e098d3ee33059b4a1c747354226cec58487c.) So you have to do it after, but without dropping the RTNL lock in between.

So you may want to add something like

	int register_netdev_carrier_off(struct net_device *dev)
	{
		int err;

		rtnl_lock();
		err = register_netdevice(dev);
		if (!err)
			set_bit(__LINK_STATE_NOCARRIER, &dev->state);
		rtnl_unlock();
		return err;
	}

for these drivers?

Bjørn
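The ordering constraint in this message (register first, then mark the carrier off, all without releasing the lock in between) can be modelled with a userspace mock. This is a sketch under loud assumptions: none of the names below are kernel API. struct mock_netdev, MOCK_STATE_NOCARRIER, the mock_rtnl_* helpers and mock_register_netdevice() are hypothetical stand-ins, and the "lock" is a plain counter that only records whether it is held.

```c
#include <stdbool.h>

/* Hypothetical link-state bit; stands in for a no-carrier flag. */
#define MOCK_STATE_NOCARRIER (1u << 0)

/* Hypothetical stand-in for a network device. */
struct mock_netdev {
    unsigned int state;   /* link-state bits */
    bool registered;
};

/* Models the registration lock: non-zero means "held". Not a real lock. */
static int mock_lock_depth;

static void mock_rtnl_lock(void)   { mock_lock_depth++; }
static void mock_rtnl_unlock(void) { mock_lock_depth--; }

/* Models registration: fails unless the caller holds the lock. */
static int mock_register_netdevice(struct mock_netdev *dev)
{
    if (mock_lock_depth == 0 || dev->registered)
        return -1;
    dev->registered = true;
    return 0;
}

/* Register and mark carrier off in one critical section. */
static int mock_register_netdev_carrier_off(struct mock_netdev *dev)
{
    int err;

    mock_rtnl_lock();
    err = mock_register_netdevice(dev);
    if (!err)
        dev->state |= MOCK_STATE_NOCARRIER;   /* lock is still held here */
    mock_rtnl_unlock();
    return err;
}
```

Because the flag is set before the lock is dropped, nothing that takes the same lock can observe the device as registered with the carrier still reported as on, which is the window the message is trying to close.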
Re: [Qemu-devel] [PATCH v8] kvm: notify host when the guest is panicked
On Aug 15, 2012, at 12:56 PM, Gleb Natapov wrote: On Tue, Aug 14, 2012 at 02:35:34PM -0500, Anthony Liguori wrote:

Do you consider allowing support for Windows as overengineering? I don't think there is a way to hook BSOD on Windows so attempting to engineer something that works with Windows seems odd, no?

Yan says in another email that it is possible to register a bugcheck callback.

Here you go - http://msdn.microsoft.com/en-us/library/windows/hardware/ff553105(v=vs.85).aspx

Already done in virtio-net for two reasons:
1. We could configure virtio-net to notify QEMU in a hacky way (write 1 to the VIRTIO_PCI_ISR register) that there was a bugcheck. It was very useful for debugging complex WHQL issues that involved host networking.
2. Store additional information (for example, time stamps of the last received packet, the last interrupt, etc.) in the crash dump.

Yan.

-- Gleb.