Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
Hi Yuan and Han-Wen,

Thank you for your comments.

(2012/07/06 22:58), Han-Wen Nienhuys wrote:
> On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan wrote:
>> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>>> One of the ways to solve this is to make them tunable.
>>> In this series, the new sysfs parameter max_pages_per_req is introduced.
>>> It limits the maximum read/write size in fuse request and it can be
>>> changed from 32 to 256 pages in current implementations. When the
>>> max_read/max_write mount option is specified, FUSE request size is set
>>> per mount. (The size is rounded-up to page size and limited up to
>>> max_pages_per_req.)
>>
>> Why maximum 256 pages? If we are here, we can go further: most object
>> storage systems have object sizes from multiple to dozens of megabytes.
>> So I think 1M is probably too small. Our distributed storage system has
>> 4M per object, so I think the maximum size should be at least 4M.
>
> The maximum pipe size on my system is 1M, so if you go beyond that,
> splicing from the FD won't work. Also, the userspace client must
> reserve a buffer this size so it can receive a write, which is a waste
> since most requests are much smaller.

I checked that the maximum pipe size can be changed using fcntl(2) or
/proc/sys/fs/pipe-max-size, so it is clearly not a fixed value. Also, there
seems to be a request to raise the maximum fuse request size to 4M
(1024 pages).

One of the reasons to introduce the sysfs max_pages_per_req parameter is to
let the administrator set the upper limit on the number of pages dynamically;
only root can change it. So, when the limit must not exceed pipe-max-size,
max_pages_per_req should be set with that in mind. The upper bound of the
parameter itself does not have to be capped at pipe-max-size.

I'm planning to limit max_pages_per_req to 1024 pages and add the following
text to /Documentation/filesystems/fuse.txt:

"the sysfs max_pages_per_req parameter can be changed from 32 to 1024.
The default is 32 pages. Generally, the pipe-max-size is 1M (256 pages)
and it is better to set it to not more than the pipe-max-size."

This is just a plan and any comments are appreciated.

Thanks,

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/5] uprobes: suppress uprobe_munmap() from mmput()
* Oleg Nesterov [2012-07-08 22:30:03]:

> uprobe_munmap() does get_user_pages() and it is also called from
> the final mmput()->exit_mmap() path. This slows down exit/mmput()
> for no reason, and I think it is simply dangerous/wrong to try to
> fault-in a page into the dying mm. If nothing else, this happens
> after the last sync_mm_rss(), afaics handle_mm_fault() can change
> the task->rss_stat and make the subsequent check_mm() unhappy.
>
> Change uprobe_munmap() to check mm->mm_users != 0.
>
> Signed-off-by: Oleg Nesterov

Acked-by: Srikar Dronamraju

> ---
>  kernel/events/uprobes.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index a93b6df..47c4e24 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1082,6 +1082,9 @@ void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned lon
>  	if (!atomic_read(&uprobe_events) || !valid_vma(vma, false))
>  		return;
>
> +	if (!atomic_read(&vma->vm_mm->mm_users)) /* called by mmput() ? */
> +		return;
> +
>  	if (!atomic_read(&vma->vm_mm->uprobes_state.count))
>  		return;
>
> --
> 1.5.5.1
>
Re: linux-next: manual merge of the kvm-ppc tree with the powerpc tree
On 12.07.2012, at 05:57, Stephen Rothwell wrote:

> Hi Alexander,
>
> Today's linux-next merge of the kvm-ppc tree got a conflict in
> arch/powerpc/kvm/booke_interrupts.S between commit c75df6f96c59
> ("powerpc: Fix usage of register macros getting ready for %r0 change")
> from the powerpc tree and commit fc372c0843b8 ("booke: Added crit/mc
> exception handler for e500v2") from the kvm-ppc tree.
>
> I fixed it up (see below - could do with checking) and can carry the fix
> as necessary.

Hrm. Ben already warned me that this will happen, so I did a test merge a
few days ago. Back then I also had to change 2 other bits that were not
conflicting, to get the code to actually compile.

Could you please do an s/VCPU_GPR(r/VCPU_GPR(R/g in
arch/powerpc/kvm/booke_interrupts.S? I'd check if you actually need it
myself, but the tree doesn't seem to be pushed out yet :).

Alex

> --
> Cheers,
> Stephen Rothwell                    s...@canb.auug.org.au
>
> diff --cc arch/powerpc/kvm/booke_interrupts.S
> index 8fd4b2a,09456c4..000
> --- a/arch/powerpc/kvm/booke_interrupts.S
> +++ b/arch/powerpc/kvm/booke_interrupts.S
> @@@ -52,16 -53,21 +52,21 @@@
>   (1< (1<
>
> - .macro KVM_HANDLER ivor_nr
> + .macro KVM_HANDLER ivor_nr scratch srr0
>   _GLOBAL(kvmppc_handler_\ivor_nr)
>   	/* Get pointer to vcpu and record exit number. */
> - 	mtspr	SPRN_SPRG_WSCRATCH0, r4
> + 	mtspr	\scratch , r4
>   	mfspr	r4, SPRN_SPRG_RVCPU
> - 	stw	r3, VCPU_GPR(r3)(r4)
> - 	stw	r5, VCPU_GPR(r5)(r4)
> - 	stw	r6, VCPU_GPR(r6)(r4)
> ++	stw	r3, VCPU_GPR(R3)(r4)
> + 	stw	r5, VCPU_GPR(R5)(r4)
> + 	stw	r6, VCPU_GPR(R6)(r4)
> + 	mfspr	r3, \scratch
>   	mfctr	r5
> - 	lis	r6, kvmppc_resume_host@h
> - 	stw	r3, VCPU_GPR(r4)(r4)
> ++	stw	r3, VCPU_GPR(R4)(r4)
>   	stw	r5, VCPU_CTR(r4)
> + 	mfspr	r3, \srr0
> + 	lis	r6, kvmppc_resume_host@h
> + 	stw	r3, VCPU_PC(r4)
>   	li	r5, \ivor_nr
>   	ori	r6, r6, kvmppc_resume_host@l
>   	mtctr	r6
> @@@ -99,12 -104,11 +103,11 @@@ _GLOBAL(kvmppc_handler_len
>    * r5: KVM exit number
>    */
>   _GLOBAL(kvmppc_resume_host)
> - 	stw	r3, VCPU_GPR(R3)(r4)
>   	mfcr	r3
>   	stw	r3, VCPU_CR(r4)
> - 	stw	r7, VCPU_GPR(r7)(r4)
> - 	stw	r8, VCPU_GPR(r8)(r4)
> - 	stw	r9, VCPU_GPR(r9)(r4)
> + 	stw	r7, VCPU_GPR(R7)(r4)
> + 	stw	r8, VCPU_GPR(R8)(r4)
> + 	stw	r9, VCPU_GPR(R9)(r4)
>
>   	li	r6, 1
>   	slw	r6, r6, r5
RE: 82571EB: Detected Hardware Unit Hang
>-----Original Message-----
>From: Joe Jin [mailto:joe@oracle.com]
>Sent: Wednesday, July 11, 2012 8:13 PM
>To: Dave, Tushar N
>Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-
>ker...@vger.kernel.org
>Subject: Re: 82571EB: Detected Hardware Unit Hang
>
>On 07/12/12 11:07, Dave, Tushar N wrote:
>>> -----Original Message-----
>>> From: Joe Jin [mailto:joe@oracle.com]
>>> Sent: Wednesday, July 11, 2012 7:58 PM
>>> To: Dave, Tushar N
>>> Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-
>>> ker...@vger.kernel.org
>>> Subject: Re: 82571EB: Detected Hardware Unit Hang
>>>
>>> On 07/12/12 10:52, Dave, Tushar N wrote:
>>>> What is the exact error messages in BIOS log?
>>>
>>> Error message from BIOS event log:
>>> 07/12/12 05:54:00
>>>    PCI Express Non-Fatal Error
>>>
>>> Thanks,
>>> Joe
>
>Hi Tushar,
>
>Please find eeprom from attachment.

Do you have an lspci -vvv dump of the entire system from before and after
the issue occurs? If you have, can you send it to me?
[PATCH] bootmem: Make ___alloc_bootmem_node_nopanic() to be real nopanic
After

| From 99ab7b19440a72ebdf225f99b20f8ef40decee86 Mon Sep 17 00:00:00 2001
| Date: Wed, 11 Jul 2012 14:02:53 -0700
| Subject: [PATCH] mm: sparse: fix usemap allocation above node descriptor section

Johannes said:

| while backporting the below patch, I realised that your fix busted
| f5bf18fa22f8 again. The problem was not a panicking version on
| allocation failure but when the usemap size was too large such that
| goal + size > limit triggers the BUG_ON in the bootmem allocator. So
| we need a version that passes limit ONLY if the usemap is smaller than
| the section.

After checking the code, the name of ___alloc_bootmem_node_nopanic() does
not reflect what the function actually does.

Make bootmem really not panic.

Hopefully bootmem will be killed soon.

Signed-off-by: Yinghai Lu
Cc: Andrew Morton
Cc: Johannes Weiner
Cc: sta...@vger.kernel.org

---
 mm/bootmem.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -710,6 +710,10 @@ again:
 	if (ptr)
 		return ptr;
 
+	/* do not panic in alloc_bootmem_bdata() */
+	if (limit && goal + size > limit)
+		limit = 0;
+
 	ptr = alloc_bootmem_bdata(pgdat->bdata, size, align, goal, limit);
 	if (ptr)
 		return ptr;
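The check added by this patch can be modelled outside the kernel as a small pure function. This is only a sketch: effective_limit() is a made-up name, and it models nothing but the "drop the limit when goal + size would overshoot it" decision, assuming as in the hunk above that a limit of 0 means "no upper bound".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone model of the added check: if the request
 * cannot fit below a non-zero limit, retry without any limit instead of
 * tripping the allocator's BUG_ON.
 */
static uint64_t effective_limit(uint64_t goal, uint64_t size, uint64_t limit)
{
	/* Guard this sketch against wraparound in goal + size. */
	assert(goal <= UINT64_MAX - size);

	if (limit && goal + size > limit)
		return 0;	/* 0 means "no upper bound" */

	return limit;
}

/* Usage: a request that overshoots the limit falls back to "no limit". */
int effective_limit_selfcheck(void)
{
	assert(effective_limit(0x1000, 0x2000, 0x2800) == 0);
	assert(effective_limit(0x1000, 0x1000, 0x4000) == 0x4000);
	return 0;
}
```

A request that ends exactly at the limit keeps it, since the comparison is strictly "greater than".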
Re: [PATCH 3/5] uprobes: fix overflow in vma_address/find_active_uprobe
* Oleg Nesterov [2012-07-09 12:54:45]:

> On 07/08, Joe Perches wrote:
> >
> > On Sun, 2012-07-08 at 22:30 +0200, Oleg Nesterov wrote:
> > > @@ -1450,7 +1450,7 @@ static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
> > >
> > >  		inode = vma->vm_file->f_mapping->host;
> > >  		offset = bp_vaddr - vma->vm_start;
> > > -		offset += (vma->vm_pgoff << PAGE_SHIFT);
> > > +		offset += ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
> >
> > It'd be nice to remove the unnecessary parentheses
> > and make it consistent with the vaddr use above it too.
>
> OK, please find v2 below.
>
> --
> Subject: [PATCH] uprobes: fix overflow in vma_address/find_active_uprobe
>
> vma->vm_pgoff is "unsigned long", it should be promoted to loff_t
> before the multiplication to avoid the overflow.
>
> Signed-off-by: Oleg Nesterov

Acked-by: Srikar Dronamraju

> ---
>  kernel/events/uprobes.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 47c4e24..6194edb 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -117,7 +117,7 @@ static loff_t vma_address(struct vm_area_struct *vma, loff_t offset)
>  	loff_t vaddr;
>
>  	vaddr = vma->vm_start + offset;
> -	vaddr -= vma->vm_pgoff << PAGE_SHIFT;
> +	vaddr -= (loff_t)vma->vm_pgoff << PAGE_SHIFT;
>
>  	return vaddr;
>  }
> @@ -1450,7 +1450,7 @@ static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
>
>  		inode = vma->vm_file->f_mapping->host;
>  		offset = bp_vaddr - vma->vm_start;
> -		offset += (vma->vm_pgoff << PAGE_SHIFT);
> +		offset += (loff_t)vma->vm_pgoff << PAGE_SHIFT;
>  		uprobe = find_uprobe(inode, offset);
>  	}
>
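The overflow fixed here is easy to reproduce in user space. The sketch below is not kernel code: it uses uint32_t to stand in for a 32-bit "unsigned long" and int64_t for loff_t, and it assumes PAGE_SHIFT is 12 (4 KiB pages). The function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Shift done in 32 bits: the high bits are lost before the widening. */
static int64_t offset_narrow(uint32_t pgoff)
{
	return (int64_t)(uint32_t)(pgoff << PAGE_SHIFT);
}

/* Widen first, then shift: the full byte offset survives. */
static int64_t offset_wide(uint32_t pgoff)
{
	return (int64_t)pgoff << PAGE_SHIFT;
}

/* Usage: page offset 0x100000 lies exactly 4 GiB into the file. */
int offset_selfcheck(void)
{
	assert(offset_narrow(0x100000u) == 0);
	assert(offset_wide(0x100000u) == 0x100000000LL);
	return 0;
}
```

For small page offsets the two functions agree, which is why a bug of this kind only shows up with mappings far into a large file.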
linux-next: build warning after merge of the slab tree
Hi all,

After merging the slab tree, today's linux-next build (i386_defconfig)
produced this warning:

mm/slab_common.c: In function 'kmem_cache_create':
mm/slab_common.c:101:1: warning: label 'oops' defined but not used [-Wunused-label]

Introduced by commit 20cea9683ecc ("mm, sl[aou]b: Move kmem_cache_create
mutex handling to common code") from the slab tree. The label is only
used when CONFIG_DEBUG_VM is defined.

--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
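For readers unfamiliar with this class of warning, here is a small stand-alone sketch of how a label ends up "defined but not used" when its only goto sits under an #ifdef, together with one possible way to avoid it. It is not the slab code and not a proposed fix for that commit: cache_create() and DEBUG_CHECKS (standing in for CONFIG_DEBUG_VM) are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* DEBUG_CHECKS is a made-up stand-in for CONFIG_DEBUG_VM. */
static int cache_create(const char *name, size_t size)
{
	int err = 0;

#ifdef DEBUG_CHECKS
	/* Sanity checks that exist only in debug builds. */
	if (!name || size == 0) {
		err = -1;
		goto oops;
	}
#endif
	/* Placeholder for the real work; keeps both parameters used. */
	(void)name;
	(void)size;

#ifdef DEBUG_CHECKS
oops:	/* guarded like its only goto, so it is never defined unused */
#endif
	return err;
}

/* Usage: builds without -Wunused-label noise whether or not DEBUG_CHECKS is set. */
int cache_create_selfcheck(void)
{
	assert(cache_create("demo", 64) == 0);
	return 0;
}
```

Removing the second #ifdef/#endif pair and building without DEBUG_CHECKS reproduces the same kind of warning reported above.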
3.4.4-rt13: btrfs + xfstests 006 = BOOM.. and a bonus rt_mutex deadlock report for absolutely free!
Greetings,

I'm chasing btrfs critters in an enterprise 3.0-rt kernel, and just
checked to see if they're alive in virgin latest/greatest rt kernel.

Both are indeed alive and well, ie I didn't break it, nor did the
zillion patches in enterprise base kernel, so others may have an
opportunity to meet these critters up close and personal as well.

Unfortunately, this kernel refuses to crash dump, but both appear to be
my exact critters, so I'll report them, then go back to squabbling with
the things where I can at least rummage in piles of wreckage to gather
rocks and sharpen sticks.

Box: x3550 M3 1 x E5620, HT enabled ATM.

Reproducer1: xfstests 006 in a loop, box doesn't last long at all.

[ 189.300478] ------------[ cut here ]------------
[ 189.300482] kernel BUG at kernel/rtmutex_common.h:75!
[ 189.300486] invalid opcode: [#1] PREEMPT SMP
[ 189.300489] CPU 2
[ 189.300490] Modules linked in: ibm_rtl nfsd ipmi_devintf lockd nfs_acl auth_rpcgss sunrpc ipmi_si ipmi_msghandler ipv6 af_packet edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse loop dm_mod tpm_tis tpm ioatdma shpchp tpm_bios pci_hotplug sg cdc_ether usbnet i2c_i801 serio_raw mii pcspkr i7core_edac i2c_core dca iTCO_wdt edac_core button iTCO_vendor_support bnx2 usbhid hid uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif rtc_cmos usb_common fan processor ata_generic ata_piix libata megaraid_sas scsi_mod thermal thermal_sys hwmon
[ 189.300531]
[ 189.300534] Pid: 15363, comm: btrfs-worker-1 Not tainted 3.4.4-rt13 #24 IBM System x3550 M3 -[7944K3G]-/69Y5698
[ 189.300539] RIP: 0010:[] [] __try_to_take_rt_mutex+0x169/0x170
[ 189.300551] RSP: 0018:880174527b90 EFLAGS: 00010296
[ 189.300554] RAX: RBX: 880177a0edd0 RCX: 0001
[ 189.300557] RDX: RSI: 8801760a77c8 RDI: 8801760a77b0
[ 189.300559] RBP: 880174527bd0 R08: 0001 R09: 0001
[ 189.300562] R10: R11: R12: 8801791812c0
[ 189.300565] R13: 880177a0edd8 R14: 880177a0edd0 R15: 8801791812c0
[ 189.300569] FS: () GS:88017f24() knlGS:
[ 189.300572] CS: 0010 DS: ES: CR0: 8005003b
[ 189.300575] CR2: 7f6449423f90 CR3: 0180e000 CR4: 07e0
[ 189.300578] DR0: DR1: DR2:
[ 189.300582] DR3: DR6: 0ff0 DR7: 0400
[ 189.300585] Process btrfs-worker-1 (pid: 15363, threadinfo 880174526000, task 8801791812c0)
[ 189.300587] Stack:
[ 189.300589] 880179234500 88017927b6b0 88017a6a4180 880177a0edd0
[ 189.300595] 880175b10da0 880177ad9e98 880177a0edd0 8801791812c0
[ 189.300599] 880174527ca0 814c466e 00011200 88017f24ca40
[ 189.300604] Call Trace:
[ 189.300611] [] rt_spin_lock_slowlock+0x4e/0x291
[ 189.300618] [] ? kmem_cache_alloc+0x114/0x1f0
[ 189.300626] [] ? bvec_alloc_bs+0x60/0x110
[ 189.300631] [] rt_spin_lock+0x21/0x30
[ 189.300636] [] schedule_bio+0x63/0x130
[ 189.300640] [] ? bio_clone+0x47/0x90
[ 189.300645] [] btrfs_map_bio+0xc2/0x230
[ 189.300650] [] __btree_submit_bio_done+0x16/0x20
[ 189.300654] [] run_one_async_done+0xa8/0xc0
[ 189.300658] [] run_ordered_completions+0x88/0xe0
[ 189.300663] [] worker_loop+0xc5/0x430
[ 189.300669] [] ? __schedule+0x2b0/0x630
[ 189.300673] [] ? btrfs_queue_worker+0x1e0/0x1e0
[ 189.300677] [] ? btrfs_queue_worker+0x1e0/0x1e0
[ 189.300684] [] kthread+0x96/0xa0
[ 189.300690] [] ? finish_task_switch+0x54/0xd0
[ 189.300695] [] kernel_thread_helper+0x4/0x10
[ 189.300700] [] ? __init_kthread_worker+0x50/0x50
[ 189.300704] [] ? gs_change+0x13/0x13
[ 189.300706] Code: 02 ff ff ff e9 49 ff ff ff 49 39 f5 74 18 4d 8d b4 24 b0 05 00 00 4c 89 f7 e8 74 ae 43 00 49 89 c7 e9 67 ff ff ff 4c 89 e0 eb aa <0f> 0b 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 18 e8 ef
[ 189.300735] RIP [] __try_to_take_rt_mutex+0x169/0x170
[ 189.300740] RSP
[ 189.636837] ---[ end trace 0002 ]---

From 3.0-rt that will dump.

crash> struct rt_mutex 0x8801770601c8
struct rt_mutex {
  wait_lock = {
    raw_lock = {
      slock = 7966
    }
  },
  wait_list = {
    node_list = {
      next = 0x880175eedbe0,
      prev = 0x880175eedbe0
    },
    rawlock = 0x880175eedbd8,
    spinlock = 0x0
  },
  owner = 0x1,
  save_state = 0,
  file = 0x0,
  name = 0x81781b9b "&(&device->io_lock)->lock",
  line = 0,
  magic = 0x0
}
crash> struct list_head 0x880175eedbe0
struct list_head {
  next = 0x6b6b6b6b6b6b6b6b,
  prev = 0x6b6b6b6b6b6b6b6b
}

Reproducer2: dbench -t 30 8

[ 692.857164]
[ 692.857165]
[ 692.863963] [ BUG: circular locking deadlock detected! ]
[ 692.869264] Not tainted
[ 692.871708]
[PATCH 1/2 v2] USB: dwc3-exynos: Add support for device tree
This patch adds support to parse probe data for the dwc3 driver for exynos
using device tree.

Signed-off-by: Praveen Paneri
Signed-off-by: Vivek Gautam

diff --git a/drivers/usb/dwc3/dwc3-exynos.c b/drivers/usb/dwc3/dwc3-exynos.c
index d190301..9ae91b7 100644
--- a/drivers/usb/dwc3/dwc3-exynos.c
+++ b/drivers/usb/dwc3/dwc3-exynos.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #include "core.h"
 
@@ -30,6 +31,8 @@ struct dwc3_exynos {
 	struct clk		*clk;
 };
 
+static u64 dwc3_exynos_dma_mask = DMA_BIT_MASK(32);
+
 static int __devinit dwc3_exynos_probe(struct platform_device *pdev)
 {
 	struct dwc3_exynos_data	*pdata = pdev->dev.platform_data;
@@ -46,6 +49,16 @@ static int __devinit dwc3_exynos_probe(struct platform_device *pdev)
 		goto err0;
 	}
 
+	/*
+	 * Right now device-tree probed devices don't get dma_mask set.
+	 * Since shared usb code relies on it, set it here for now.
+	 * Once we move to full device tree support this will vanish off.
+	 */
+	if (!pdev->dev.dma_mask)
+		pdev->dev.dma_mask = &dwc3_exynos_dma_mask;
+	if (!pdev->dev.coherent_dma_mask)
+		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+
 	platform_set_drvdata(pdev, exynos);
 
 	devid = dwc3_get_device_id();
@@ -135,11 +148,20 @@ static int __devexit dwc3_exynos_remove(struct platform_device *pdev)
 	return 0;
 }
 
+#ifdef CONFIG_OF
+static const struct of_device_id exynos_xhci_match[] = {
+	{ .compatible = "samsung,exynos-xhci" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, exynos_xhci_match);
+#endif
+
 static struct platform_driver dwc3_exynos_driver = {
 	.probe		= dwc3_exynos_probe,
 	.remove		= __devexit_p(dwc3_exynos_remove),
 	.driver		= {
 		.name	= "exynos-dwc3",
+		.of_match_table = of_match_ptr(exynos_xhci_match),
 	},
 };
 
--
1.7.0.4
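The match table added above is a sentinel-terminated array: the empty {} entry marks the end, so no separate length is needed. The user-space sketch below shows that idea only. It is not the kernel's OF matching code, which is more involved; struct match_entry and find_match() are made-up names for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct of_device_id. */
struct match_entry {
	const char *compatible;
};

static const struct match_entry exynos_xhci_match[] = {
	{ .compatible = "samsung,exynos-xhci" },
	{ NULL },	/* sentinel, like the empty {} entry in the patch */
};

/* Walk the table until the sentinel; return the matching entry or NULL. */
static const struct match_entry *find_match(const struct match_entry *table,
					    const char *compat)
{
	assert(table && compat);

	for (; table->compatible; table++)
		if (strcmp(table->compatible, compat) == 0)
			return table;

	return NULL;
}

/* Usage: only the listed compatible string matches. */
int find_match_selfcheck(void)
{
	assert(find_match(exynos_xhci_match, "samsung,exynos-xhci") != NULL);
	assert(find_match(exynos_xhci_match, "samsung,exynos-ehci") == NULL);
	return 0;
}
```

Supporting another compatible string then only means adding one entry before the sentinel.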
[PATCH 2/2 v2] USB: dwc3-exynos: Add vbus setup function to the exynos dwc3 glue layer
From: Abhilash Kesavan

This patch retrieves and configures the vbus control gpio via
the device tree. The suspend/resume callbacks will be later
modified for vbus control.

Signed-off-by: Abhilash Kesavan
Signed-off-by: Vivek Gautam

diff --git a/drivers/usb/dwc3/dwc3-exynos.c b/drivers/usb/dwc3/dwc3-exynos.c
index 9ae91b7..5dd87c1 100644
--- a/drivers/usb/dwc3/dwc3-exynos.c
+++ b/drivers/usb/dwc3/dwc3-exynos.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 
 #include "core.h"
 
@@ -31,6 +32,28 @@ struct dwc3_exynos {
 	struct clk		*clk;
 };
 
+static int dwc3_setup_vbus_gpio(struct platform_device *pdev)
+{
+	int err;
+	int gpio;
+
+	if (!pdev->dev.of_node)
+		return 0;
+
+	gpio = of_get_named_gpio(pdev->dev.of_node,
+				 "samsung,vbus-gpio", 0);
+	if (!gpio_is_valid(gpio))
+		return 0;
+
+	err = gpio_request_one(gpio, GPIOF_OUT_INIT_HIGH, "dwc3_vbus_gpio");
+	if (err) {
+		dev_err(&pdev->dev, "can't request dwc3 vbus gpio %d", gpio);
+		return err;
+	}
+
+	return err;
+}
+
 static u64 dwc3_exynos_dma_mask = DMA_BIT_MASK(32);
 
 static int __devinit dwc3_exynos_probe(struct platform_device *pdev)
@@ -59,6 +82,8 @@ static int __devinit dwc3_exynos_probe(struct platform_device *pdev)
 	if (!pdev->dev.coherent_dma_mask)
 		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
 
+	dwc3_setup_vbus_gpio(pdev);
+
 	platform_set_drvdata(pdev, exynos);
 
 	devid = dwc3_get_device_id();
--
1.7.0.4
[PATCH 0/2 v2] USB: host: Add Device tree support for dwc3-exynos
Changes from v1:
1) Added comment to explain inclusion of dma_mask through pdata.
2) Replaced gpio_request() with gpio_request_one()
3) Removed gpio_set_value()

This patchset is based and tested on 3.5 rc5.

Abhilash Kesavan (1):
  USB: dwc3-exynos: Add vbus setup function to the exynos dwc3 glue layer

Vivek Gautam (1):
  USB: dwc3-exynos: Add support for device tree

 drivers/usb/dwc3/dwc3-exynos.c |   47
 1 files changed, 47 insertions(+), 0 deletions(-)
Re: [PATCH] sched: remove useless code in yield_to
On 07/03/2012 02:34 PM, Michael Wang wrote:
> From: Michael Wang
>
> it's impossible to enter else branch if we have set skip_clock_update
> in task_yield_fair(), as yield_to_task_fair() will directly return
> true after invoke task_yield_fair().
>
> Signed-off-by: Michael Wang
> ---
>  kernel/sched/core.c |    7 ---
>  1 files changed, 0 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9bb7d28..77c14aa 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4737,13 +4737,6 @@ again:
>  		 */
>  		if (preempt && rq != p_rq)
>  			resched_task(p_rq->curr);
> -	} else {
> -		/*
> -		 * We might have set it in task_yield_fair(), but are
> -		 * not going to schedule(), so don't want to skip
> -		 * the next update.
> -		 */
> -		rq->skip_clock_update = 0;
>  	}

Can I get some comments on this patch?

Regards,
Michael Wang

>
>  out:
>
Re: [PATCH v3] panel: Use pr_err(...) rather than printk(KERN_ERR ...)
On Thu, 2012-07-12 at 15:22 +1000, Ryan Mallon wrote:
> On 12/07/12 12:35, Toshiaki Yamane wrote:
> > This change is inspired by checkpatch.
>
> Your changelog needs to describe all of the changes you are making. The
> subject line only describes one. This patch is doing the following:
>
>  - Converting printk(KERN_ERR to pr_err
>  - Adding __func__ prefixes to printk lines
>  - Refactoring split printk strings onto a single line
>
> There are a few other printks in this file which could be converted to
> pr_* to make the code more consistent. Perhaps a follow up patch?
>
> Typically for a subsequent version of a patch/series you should list
> the changes since the last round. Put these below the --- so that they
> don't become part of the change log, e.g.:
>
>   Signed-off-by: Your name/email here
>   ---
>
>   Changes since v2:
>    - Some stuff
>
>   Changes since v1:
>    - Some other stuff
>
> Some more comments below.

And ideally cc the people that gave you notes/comments
on your previous patches too.
[PATCH 3/3 v2] USB: ehci-s5p: Add vbus setup function to the s5p ehci glue layer
From: Abhilash Kesavan

This patch retrieves and configures the vbus control gpio via
the device tree. The suspend/resume callbacks will be later
modified for vbus control.

Signed-off-by: Abhilash Kesavan
Signed-off-by: Vivek Gautam

diff --git a/drivers/usb/host/ehci-s5p.c b/drivers/usb/host/ehci-s5p.c
index 52d0049..9f9870c 100644
--- a/drivers/usb/host/ehci-s5p.c
+++ b/drivers/usb/host/ehci-s5p.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -64,6 +65,28 @@ static const struct hc_driver s5p_ehci_hc_driver = {
 	.clear_tt_buffer_complete	= ehci_clear_tt_buffer_complete,
 };
 
+static int s5p_ehci_setup_gpio(struct platform_device *pdev)
+{
+	int err;
+	int gpio;
+
+	if (!pdev->dev.of_node)
+		return 0;
+
+	gpio = of_get_named_gpio(pdev->dev.of_node,
+				 "samsung,vbus-gpio", 0);
+	if (!gpio_is_valid(gpio))
+		return 0;
+
+	err = gpio_request_one(gpio, GPIOF_OUT_INIT_HIGH, "ehci_vbus_gpio");
+	if (err) {
+		dev_err(&pdev->dev, "can't request ehci vbus gpio %d", gpio);
+		return err;
+	}
+
+	return err;
+}
+
 static u64 ehci_s5p_dma_mask = DMA_BIT_MASK(32);
 
 static int __devinit s5p_ehci_probe(struct platform_device *pdev)
@@ -92,6 +115,8 @@ static int __devinit s5p_ehci_probe(struct platform_device *pdev)
 	if (!pdev->dev.coherent_dma_mask)
 		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
 
+	s5p_ehci_setup_gpio(pdev);
+
 	s5p_ehci = kzalloc(sizeof(struct s5p_ehci_hcd), GFP_KERNEL);
 	if (!s5p_ehci)
 		return -ENOMEM;
--
1.7.0.4
[PATCH 1/3 v2] USB: ohci-exynos: Add support for device tree
This patch adds support to parse probe data for the ohci driver for exynos
using device tree.

Signed-off-by: Thomas Abraham
Signed-off-by: Abhilash Kesavan
Signed-off-by: Vivek Gautam

diff --git a/drivers/usb/host/ohci-exynos.c b/drivers/usb/host/ohci-exynos.c
index 2909621..c4ad60f 100644
--- a/drivers/usb/host/ohci-exynos.c
+++ b/drivers/usb/host/ohci-exynos.c
@@ -12,6 +12,7 @@
  */
 
 #include
+#include
 #include
 #include
 #include
@@ -71,6 +72,8 @@ static const struct hc_driver exynos_ohci_hc_driver = {
 	.start_port_reset	= ohci_start_port_reset,
 };
 
+static u64 ohci_exynos_dma_mask = DMA_BIT_MASK(32);
+
 static int __devinit exynos_ohci_probe(struct platform_device *pdev)
 {
 	struct exynos4_ohci_platdata *pdata;
@@ -87,6 +90,16 @@ static int __devinit exynos_ohci_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}
 
+	/*
+	 * Right now device-tree probed devices don't get dma_mask set.
+	 * Since shared usb code relies on it, set it here for now.
+	 * Once we move to full device tree support this will vanish off.
+	 */
+	if (!pdev->dev.dma_mask)
+		pdev->dev.dma_mask = &ohci_exynos_dma_mask;
+	if (!pdev->dev.coherent_dma_mask)
+		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+
 	exynos_ohci = kzalloc(sizeof(struct exynos_ohci_hcd), GFP_KERNEL);
 	if (!exynos_ohci)
 		return -ENOMEM;
@@ -258,6 +271,14 @@ static const struct dev_pm_ops exynos_ohci_pm_ops = {
 	.resume		= exynos_ohci_resume,
 };
 
+#ifdef CONFIG_OF
+static const struct of_device_id exynos_ohci_match[] = {
+	{ .compatible = "samsung,exynos-ohci" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, exynos_ohci_match);
+#endif
+
 static struct platform_driver exynos_ohci_driver = {
 	.probe		= exynos_ohci_probe,
 	.remove		= __devexit_p(exynos_ohci_remove),
@@ -266,6 +287,7 @@ static struct platform_driver exynos_ohci_driver = {
 		.name	= "exynos-ohci",
 		.owner	= THIS_MODULE,
 		.pm	= &exynos_ohci_pm_ops,
+		.of_match_table = of_match_ptr(exynos_ohci_match),
 	}
 };
 
--
1.7.0.4
[PATCH 2/3 v2] USB: ehci-s5p: Add support for device tree
This patch adds support to parse probe data for the ehci driver for exynos
using device tree.

Signed-off-by: Thomas Abraham
Signed-off-by: Abhilash Kesavan
Signed-off-by: Vivek Gautam

diff --git a/drivers/usb/host/ehci-s5p.c b/drivers/usb/host/ehci-s5p.c
index c474cec..52d0049 100644
--- a/drivers/usb/host/ehci-s5p.c
+++ b/drivers/usb/host/ehci-s5p.c
@@ -13,6 +13,7 @@
  */
 
 #include
+#include
 #include
 #include
 #include
@@ -63,6 +64,8 @@ static const struct hc_driver s5p_ehci_hc_driver = {
 	.clear_tt_buffer_complete	= ehci_clear_tt_buffer_complete,
 };
 
+static u64 ehci_s5p_dma_mask = DMA_BIT_MASK(32);
+
 static int __devinit s5p_ehci_probe(struct platform_device *pdev)
 {
 	struct s5p_ehci_platdata *pdata;
@@ -79,6 +82,16 @@ static int __devinit s5p_ehci_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}
 
+	/*
+	 * Right now device-tree probed devices don't get dma_mask set.
+	 * Since shared usb code relies on it, set it here for now.
+	 * Once we move to full device tree support this will vanish off.
+	 */
+	if (!pdev->dev.dma_mask)
+		pdev->dev.dma_mask = &ehci_s5p_dma_mask;
+	if (!pdev->dev.coherent_dma_mask)
+		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+
 	s5p_ehci = kzalloc(sizeof(struct s5p_ehci_hcd), GFP_KERNEL);
 	if (!s5p_ehci)
 		return -ENOMEM;
@@ -298,6 +311,14 @@ static int s5p_ehci_resume(struct device *dev)
 #define s5p_ehci_resume		NULL
 #endif
 
+#ifdef CONFIG_OF
+static const struct of_device_id exynos_ehci_match[] = {
+	{ .compatible = "samsung,exynos-ehci" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, exynos_ehci_match);
+#endif
+
 static const struct dev_pm_ops s5p_ehci_pm_ops = {
 	.suspend	= s5p_ehci_suspend,
 	.resume		= s5p_ehci_resume,
@@ -311,6 +332,7 @@ static struct platform_driver s5p_ehci_driver = {
 		.name	= "s5p-ehci",
 		.owner	= THIS_MODULE,
 		.pm	= &s5p_ehci_pm_ops,
+		.of_match_table = of_match_ptr(exynos_ehci_match),
 	}
 };
 
--
1.7.0.4
[PATCH 0/3 v2] USB: host: Add Device tree support for ohci-exynos & ehci-s5p
From: Ajay Kumar

Changes from v1:
1) Added comment to explain inclusion of dma_mask through pdata.
2) Replaced gpio_request() with gpio_request_one()
3) Removed gpio_set_value()

This patchset is based and tested on 3.5 rc5.

Abhilash Kesavan (1):
  USB: ehci-s5p: Add vbus setup function to the s5p ehci glue layer

Vivek Gautam (2):
  USB: ohci-exynos: Add support for device tree
  USB: ehci-s5p: Add support for device tree

 drivers/usb/host/ehci-s5p.c    |   47
 drivers/usb/host/ohci-exynos.c |   22 ++
 2 files changed, 69 insertions(+), 0 deletions(-)
[PATCH V2] checkpatch: Add check for use of sizeof without parenthesis
Kernel style uses parentheses around sizeof.

Signed-off-by: Joe Perches
---
Add check that works for sizeof *foo as well as sizeof foo

 scripts/checkpatch.pl |    6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 7190f95..72c1803 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3265,6 +3265,12 @@ sub process {
 				     "sizeof(& should be avoided\n" . $herecurr);
 		}
 
+# check for sizeof without parenthesis
+		if ($line =~ /\bsizeof\s+((?:\*\s*|)$Lval|$Type(?:\s+$Lval|))/) {
+			WARN("SIZEOF_PARENTHESIS",
+			     "sizeof $1 should be sizeof($1)\n" . $herecurr);
+		}
+
 # check for line continuations in quoted strings with odd counts of "
 		if ($rawline =~ /\\$/ && $rawline =~ tr/"/"/ % 2) {
 			WARN("LINE_CONTINUATIONS",
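For reference, both spellings that the proposed check distinguishes are valid C and yield the same value; the difference is purely one of style. The sketch below is an illustration only: the struct layout and function names are made up, and whether a given line is actually flagged depends on the regular expression in the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up structure for the illustration. */
struct logical_input {
	int mask;
	int value;
};

/* Valid C, but the form the proposed SIZEOF_PARENTHESIS check is aimed at. */
static size_t size_without_parens(const struct logical_input *key)
{
	return sizeof *key;
}

/* The parenthesised form that kernel style prefers. */
static size_t size_with_parens(const struct logical_input *key)
{
	return sizeof(*key);
}

/* Usage: sizeof does not evaluate its operand, so a NULL pointer is fine here. */
int sizeof_selfcheck(void)
{
	assert(size_without_parens(NULL) == size_with_parens(NULL));
	assert(size_with_parens(NULL) == sizeof(struct logical_input));
	return 0;
}
```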
Re: [PATCH v3] panel: Use pr_err(...) rather than printk(KERN_ERR ...)
On 12/07/12 12:35, Toshiaki Yamane wrote:
> This change is inspired by checkpatch.

Your changelog needs to describe all of the changes you are making. The
subject line only describes one. This patch is doing the following:

 - Converting printk(KERN_ERR to pr_err
 - Adding __func__ prefixes to printk lines
 - Refactoring split printk strings onto a single line

There are a few other printks in this file which could be converted to
pr_* to make the code more consistent. Perhaps a follow up patch?

Typically for a subsequent version of a patch/series you should list
the changes since the last round. Put these below the --- so that they
don't become part of the change log, e.g.:

  Signed-off-by: Your name/email here
  ---

  Changes since v2:
   - Some stuff

  Changes since v1:
   - Some other stuff

Some more comments below.

~Ryan

>
> Signed-off-by: Toshiaki Yamane
> ---
>  drivers/staging/panel/panel.c |   42 +---
>  1 files changed, 18 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/staging/panel/panel.c b/drivers/staging/panel/panel.c
> index 7365089..a6d71fd 100644
> --- a/drivers/staging/panel/panel.c
> +++ b/drivers/staging/panel/panel.c
> @@ -34,6 +34,8 @@
>   *
>   */
>
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +

If you are going to print __func__ on each line, you can do:

  #define pr_fmt(fmt) KBUILD_MODNAME "%s: " fmt, __func__

Do you really need to print the function name out everywhere though?
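How a pr_fmt() definition ends up in front of every message can be tried out in user space. The sketch below is an emulation, not the kernel's printk machinery: pr_err() here formats into a made-up buffer (log_buf) instead of the kernel log, KBUILD_MODNAME is defined by hand, and a ": " separator is added between the module name and the function name, which is an adjustment relative to the macro as written in the mail above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Normally supplied by the kernel build system; defined by hand here. */
#define KBUILD_MODNAME "panel"

/* Prefix every message with "<module>: <function>: ". */
#define pr_fmt(fmt) KBUILD_MODNAME ": %s: " fmt, __func__

/*
 * User-space stand-in for pr_err(): writes into log_buf. This simplified
 * version needs at least one argument after the format string.
 */
static char log_buf[128];
#define pr_err(fmt, ...) \
	snprintf(log_buf, sizeof(log_buf), pr_fmt(fmt), __VA_ARGS__)

static void panel_attach(int parport)
{
	pr_err("could not claim access to parport%d. Aborting.\n", parport);
}

/* Usage: the prefix is added without repeating it at each call site. */
int pr_fmt_selfcheck(void)
{
	panel_attach(0);
	assert(strncmp(log_buf, "panel: panel_attach: ", 21) == 0);
	return 0;
}
```

The call site stays free of any "%s"/__func__ boilerplate, which is the point of the suggestion.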
> #include > > #include > @@ -1987,10 +1989,9 @@ static struct logical_input *panel_bind_key(char > *name, char *press, > struct logical_input *key; > > key = kzalloc(sizeof(struct logical_input), GFP_KERNEL); > - if (!key) { > - printk(KERN_ERR "panel: not enough memory\n"); > + if (!key) > return NULL; > - } > + > if (!input_name2mask(name, >mask, >value, _mask_i, >_mask_o)) { > kfree(key); > @@ -2030,10 +2031,9 @@ static struct logical_input *panel_bind_callback(char > *name, > struct logical_input *callback; > > callback = kmalloc(sizeof(struct logical_input), GFP_KERNEL); > - if (!callback) { > - printk(KERN_ERR "panel: not enough memory\n"); > + if (!callback) > return NULL; > - } > + > memset(callback, 0, sizeof(struct logical_input)); > if (!input_name2mask(name, >mask, >value, >_mask_i, _mask_o)) > @@ -2110,10 +2110,8 @@ static void panel_attach(struct parport *port) > return; > > if (pprt) { > - printk(KERN_ERR > -"panel_attach(): port->number=%d parport=%d, " > -"already registered !\n", > -port->number, parport); > + pr_err("%s: port->number=%d parport=%d, already registered !\n", > +__func__, port->number, parport); Nitpick - Could remove the space before the '!'. The original has it that, so no big deal if you want to leave it. > return; > } > > @@ -2122,16 +2120,14 @@ static void panel_attach(struct parport *port) > /*PARPORT_DEV_EXCL */ > 0, (void *)); > if (pprt == NULL) { > - pr_err("panel_attach(): port->number=%d parport=%d, " > -"parport_register_device() failed\n", > -port->number, parport); > + pr_err("%s: port->number=%d parport=%d, > parport_register_device() failed\n", > +__func__, port->number, parport); > return; > } > > if (parport_claim(pprt)) { > - printk(KERN_ERR > -"Panel: could not claim access to parport%d. " > -"Aborting.\n", parport); > + pr_err("%s: could not claim access to parport%d. 
Aborting.\n", > +__func__, parport); > goto err_unreg_device; > } > > @@ -2165,10 +2161,8 @@ static void panel_detach(struct parport *port) > return; > > if (!pprt) { > - printk(KERN_ERR > -"panel_detach(): port->number=%d parport=%d, " > -"nothing to unregister.\n", > -port->number, parport); > + pr_err("%s: port->number=%d parport=%d, nothing to > unregister.\n", > +__func__, port->number, parport); > return; > } > > @@ -2278,8 +2272,8 @@ int panel_init(void) > init_in_progress = 1; > > if (parport_register_driver(_driver)) { > - printk(KERN_ERR > -"Panel: could not register with parport. Aborting.\n"); > + pr_err("%s: could not register with parport. Aborting.\n", > +__func__); > return -EIO; > } > > @@ -2291,8 +2285,8 @@ int
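A rough userspace sketch of the pr_fmt() idea raised in the review above, assuming a GNU-compatible compiler (it relies on the ##__VA_ARGS__ extension). KBUILD_MODNAME is hard-coded here, pr_err_buf() and report_claim_failure() are hypothetical stand-ins, and the ": " separator between module name and function name is an addition of this sketch rather than something taken from the thread.

```c
#include <stdio.h>

/* Stand-in for the name kbuild would normally provide; assumed here. */
#define KBUILD_MODNAME "panel"

/* Prefix every message with the module name and the calling function. */
#define pr_fmt(fmt) KBUILD_MODNAME ": %s: " fmt, __func__

/*
 * Userspace stand-in for pr_err(): formats into a caller-supplied buffer
 * instead of the kernel log so the result can be inspected.
 */
#define pr_err_buf(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller: the function name is added without being passed by hand. */
int report_claim_failure(char *buf, size_t len, int parport)
{
	return pr_err_buf(buf, len,
			  "could not claim access to parport%d. Aborting.\n",
			  parport);
}
```

With the prefix folded into pr_fmt(), the individual call sites no longer need "%s: " and __func__ in every format string, which is the point of the reviewer's question.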
[PATCH RFC] pci: ACS quirk for AMD southbridge
We've confirmed that peer-to-peer between these devices is not possible. We can therefore claim that they support a subset of ACS. Signed-off-by: Alex Williamson Cc: Joerg Roedel --- Two things about this patch make me a little nervous. The first is that I'd really like to have a pci_is_pcie() test in pci_mf_no_p2p_acs_enabled(), but these devices don't have a PCIe capability. That means that if there was a topology where these devices sit on a legacy PCI bus, we incorrectly return that we're ACS safe here. That leads to my second problem, pciids seems to suggest that some of these functions have been around for a while. Is it just this package that's peer-to-peer safe, or is it safe to assume that any previous assembly of these functions is also p2p safe. Maybe we need to factor in device revs if that uniquely identifies this package? Looks like another useful device to potentially quirk would be: 00:15.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0) 00:15.1 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1) 00:15.2 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI bridge (PCIE port 2) 00:15.3 PCI bridge: Advanced Micro Devices [AMD] nee ATI SB900 PCI to PCI bridge (PCIE port 3) 00:15.0 0604: 1002:43a0 00:15.1 0604: 1002:43a1 00:15.2 0604: 1002:43a2 00:15.3 0604: 1002:43a3 drivers/pci/quirks.c | 29 + 1 file changed, 29 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 4ebc865..2c84961 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3271,11 +3271,40 @@ struct pci_dev *pci_get_dma_source(struct pci_dev *dev) return pci_dev_get(dev); } +/* + * Multifunction devices that do not support peer-to-peer between + * functions can claim to support a subset of ACS. 
Such devices + * effectively enable request redirect (RR) and completion redirect (CR) + * since all transactions are redirected to the upstream root complex. + */ +static int pci_mf_no_p2p_acs_enabled(struct pci_dev *dev, u16 acs_flags) +{ + if (!dev->multifunction) + return -ENODEV; + + /* Filter out flags not applicable to multifunction */ + acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT); + + return acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR) ? 0 : 1; +} + static const struct pci_dev_acs_enabled { u16 vendor; u16 device; int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags); } pci_dev_acs_enabled[] = { + /* +* AMD/ATI multifunction southbridge devices. AMD has confirmed +* that peer-to-peer between these devices is not possible, so +* they do support a subset of ACS even though the capability is +* not exposed in config space. +*/ + { PCI_VENDOR_ID_ATI, 0x4385, pci_mf_no_p2p_acs_enabled }, + { PCI_VENDOR_ID_ATI, 0x439c, pci_mf_no_p2p_acs_enabled }, + { PCI_VENDOR_ID_ATI, 0x4383, pci_mf_no_p2p_acs_enabled }, + { PCI_VENDOR_ID_ATI, 0x439d, pci_mf_no_p2p_acs_enabled }, + { PCI_VENDOR_ID_ATI, 0x4384, pci_mf_no_p2p_acs_enabled }, + { PCI_VENDOR_ID_ATI, 0x4399, pci_mf_no_p2p_acs_enabled }, { 0 } }; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
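To make the flag filtering in pci_mf_no_p2p_acs_enabled() easier to follow, here is a standalone sketch of the same logic. The bit values are assumed to mirror include/linux/pci_regs.h, the function takes a plain int instead of a struct pci_dev, and -1 stands in for -ENODEV; none of this is meant as the final form of the quirk.

```c
#include <stdint.h>

/* ACS control bits; values assumed to mirror include/linux/pci_regs.h. */
#define PCI_ACS_SV 0x01	/* Source Validation */
#define PCI_ACS_RR 0x04	/* P2P Request Redirect */
#define PCI_ACS_CR 0x08	/* P2P Completion Redirect */
#define PCI_ACS_EC 0x20	/* P2P Egress Control */
#define PCI_ACS_DT 0x40	/* Direct Translated P2P */

/*
 * Returns 1 if every requested flag that applies to multifunction
 * devices is RR or CR, 0 if EC or DT was requested as well, and -1
 * (the patch uses -ENODEV) if the device is not multifunction.
 */
int mf_no_p2p_acs_enabled(int multifunction, uint16_t acs_flags)
{
	if (!multifunction)
		return -1;

	/* Filter out flags not applicable to multifunction devices. */
	acs_flags &= (PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC | PCI_ACS_DT);

	return (acs_flags & ~(PCI_ACS_RR | PCI_ACS_CR)) ? 0 : 1;
}
```

Note that a request for a flag outside the multifunction set (PCI_ACS_SV, say) is masked away before the comparison, so it does not by itself cause a 0 result.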
Re: [PATCH RFC 0/2] kvm: Improving directed yield in PLE handler
On 07/11/2012 05:09 PM, Avi Kivity wrote: On 07/11/2012 02:18 PM, Christian Borntraeger wrote: On 11/07/12 13:04, Avi Kivity wrote: On 07/11/2012 01:17 PM, Christian Borntraeger wrote: On 11/07/12 11:06, Avi Kivity wrote: [...] Almost all s390 kernels use diag9c (directed yield to a given guest cpu) for spinlocks, though. Perhaps x86 should copy this. See arch/s390/lib/spinlock.c The basic idea is using several heuristics: - loop for a given amount of loops - check if the lock holder is currently scheduled by the hypervisor (smp_vcpu_scheduled, which uses the sigp sense running instruction) Dont know if such thing is available for x86. It must be a lot cheaper than a guest exit to be useful We could make it available via shared memory, updated using preempt notifiers. Of course piling on more pv makes this less attractive. - if lock holder is not running and we looped for a while do a directed yield to that cpu. So there is no win here, but there are other cases were diag44 is used, e.g. cpu_relax. I have to double check with others, if these cases are critical, but for now, it seems that your dummy implementation for s390 is just fine. After all it is a no-op until we implement something. Does the data structure make sense for you? If so we can move it to common code (and manage it in kvm_vcpu_on_spin()). We can guard it with CONFIG_KVM_HAVE_CPU_RELAX_INTERCEPT or something, so other archs don't have to pay anything. Ignoring the name, What name would you suggest? maybe vcpu_no_progress instead of pause_loop_exited Ah, I thouht you objected to the CONFIG var. Maybe call it cpu_relax_intercepted since that's the linuxy name for the instruction. Ok, just to be on same page. 'll have : 1. cpu_relax_intercepted instead of pause_loop_exited. 2. CONFIG_KVM_HAVE_CPU_RELAX_INTERCEPT which is unconditionally selected for x86 and s390 3. make request mechanism to clear cpu_relax_intercepted. 
(I'll do the same thing for s390 as well, but I have not seen s390 code using the request mechanism, so I am not sure if it is OK. Otherwise we have to clear it unconditionally for s390 before guest entry, and for x86 we have to move make_request back to vmx/svm.)

Will post V3 with these changes.
Re: [RFC PATCH v3 3/13] memory-hotplug : unify argument of firmware_map_add_early/hotplug
Hi Dave, 2012/07/12 0:30, Dave Hansen wrote: > On 07/09/2012 03:25 AM, Yasuaki Ishimatsu wrote: >> @@ -642,7 +642,7 @@ int __ref add_memory(int nid, u64 start, >> } >> >> /* create new memmap entry */ >> -firmware_map_add_hotplug(start, start + size, "System RAM"); >> +firmware_map_add_hotplug(start, start + size - 1, "System RAM"); > > I know the firmware_map_*() calls use inclusive end addresses > internally, but do we really need to expose them? Both of the callers > you mentioned do: > > firmware_map_add_hotplug(start, start + size - 1, "System RAM"); > > or > > firmware_map_add_early(entry->addr, > entry->addr + entry->size - 1, > e820_type_to_string(entry->type)); > > So it seems a _bit_ silly to keep all of the callers doing this size-1 > thing. I also noted that the new caller that you added does the same > thing. Could we just change the external calling convention to be > exclusive? Thank you for your comment. Does the following patch include your comment? If O.K., I will separate the patch from the series and send it for bug fix. --- arch/x86/kernel/e820.c|2 +- drivers/firmware/memmap.c |8 2 files changed, 5 insertions(+), 5 deletions(-) Index: linux-next/arch/x86/kernel/e820.c === --- linux-next.orig/arch/x86/kernel/e820.c 2012-07-02 09:50:23.0 +0900 +++ linux-next/arch/x86/kernel/e820.c 2012-07-12 13:30:45.942318179 +0900 @@ -944,7 +944,7 @@ for (i = 0; i < e820_saved.nr_map; i++) { struct e820entry *entry = _saved.map[i]; firmware_map_add_early(entry->addr, - entry->addr + entry->size - 1, + entry->addr + entry->size, e820_type_to_string(entry->type)); } } Index: linux-next/drivers/firmware/memmap.c === --- linux-next.orig/drivers/firmware/memmap.c 2012-07-02 09:50:26.0 +0900 +++ linux-next/drivers/firmware/memmap.c2012-07-12 13:40:53.823318481 +0900 @@ -98,7 +98,7 @@ /** * firmware_map_add_entry() - Does the real work to add a firmware memmap entry. * @start: Start of the memory range. - * @end: End of the memory range (inclusive). 
+ * @end: End of the memory range. * @type: Type of the memory range. * @entry: Pre-allocated (either kmalloc() or bootmem allocator), uninitialised * entry. @@ -113,7 +113,7 @@ BUG_ON(start > end); entry->start = start; - entry->end = end; + entry->end = end - 1; entry->type = type; INIT_LIST_HEAD(>list); kobject_init(>kobj, _ktype); @@ -148,7 +148,7 @@ * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do * memory hotplug. * @start: Start of the memory range. - * @end: End of the memory range (inclusive). + * @end: End of the memory range. * @type: Type of the memory range. * * Adds a firmware mapping entry. This function is for memory hotplug, it is @@ -175,7 +175,7 @@ /** * firmware_map_add_early() - Adds a firmware mapping entry. * @start: Start of the memory range. - * @end: End of the memory range (inclusive). + * @end: End of the memory range. * @type: Type of the memory range. * * Adds a firmware mapping entry. This function uses the bootmem allocator -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
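A small sketch of the calling convention discussed above, in plain userspace C: callers pass an exclusive end (start + size) and the entry keeps storing an inclusive end internally. The struct and function names are hypothetical and only mirror the shape of the firmware_map code.

```c
#include <stdint.h>

/* Simplified stand-in for a firmware map entry; end is stored inclusive. */
struct fw_map_entry {
	uint64_t start;
	uint64_t end;
};

/* Callers pass an exclusive end; the conversion happens in one place. */
void fw_map_set_range(struct fw_map_entry *entry, uint64_t start,
		      uint64_t end_exclusive)
{
	entry->start = start;
	entry->end = end_exclusive - 1;
}

/* Size of the range, recovered from the inclusive representation. */
uint64_t fw_map_size(const struct fw_map_entry *entry)
{
	return entry->end - entry->start + 1;
}
```

One thing that may be worth checking with this convention: a start > end sanity test on the caller's values still accepts start == end, which would now describe an empty range and leave the stored inclusive end below start.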
[RESEND PATCH] extcon: spelling of detach in function doc
From: Peter Meerwald

Signed-off-by: Peter Meerwald
Signed-off-by: MyungJoo Ham
---
 include/linux/extcon/extcon_gpio.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/extcon/extcon_gpio.h b/include/linux/extcon/extcon_gpio.h
index a2129b7..2d8307f 100644
--- a/include/linux/extcon/extcon_gpio.h
+++ b/include/linux/extcon/extcon_gpio.h
@@ -31,7 +31,7 @@
  * @irq_flags	IRQ Flags (e.g., IRQF_TRIGGER_LOW).
  * @state_on	print_state is overriden with state_on if attached. If Null,
  *		default method of extcon class is used.
- * @state_off	print_state is overriden with state_on if dettached. If Null,
+ * @state_off	print_state is overriden with state_on if detached. If Null,
  *		default method of extcon class is used.
  *
  * Note that in order for state_on or state_off to be valid, both state_on
-- 
1.7.0.4
[PATCH] extcon: Remove CONFIG_EXTCON_MODULE config to fix build break
This patch modifies the Kconfig of the EXTCON subsystem so that the subsystem can only be enabled (built in) or disabled. Various subsystems refer to the EXTCON subsystem for controlling external connectors, so the core class of EXTCON should be included in the kernel image. If the EXTCON subsystem is built as a module, other subsystems fail to build because they cannot link against the core class of EXTCON.

Signed-off-by: Chanwoo Choi
Signed-off-by: Myungjoo Ham
Signed-off-by: Kyungmin Park
---
 drivers/extcon/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
index 29c5cf8..b0eac45 100644
--- a/drivers/extcon/Kconfig
+++ b/drivers/extcon/Kconfig
@@ -1,5 +1,5 @@
 menuconfig EXTCON
-	tristate "External Connector Class (extcon) support"
+	bool "External Connector Class (extcon) support"
 	help
 	  Say Y here to enable external connector class (extcon) support.
 	  This allows monitoring external connectors by userspace
-- 
1.7.0.4
Re: [PATCH] ACPI / PCI: Make _SxD/_SxW check follow ACPI 4.0a spec
On Wed, 2012-07-11 at 21:04 +0200, Rafael J. Wysocki wrote:
> On Wednesday, July 11, 2012, Greg KH wrote:
> > On Wed, Jul 11, 2012 at 11:07:56AM +0200, Rafael J. Wysocki wrote:
> > > On Wednesday, July 11, 2012, Greg KH wrote:
> > > > On Sun, Apr 29, 2012 at 10:44:16PM +0200, Rafael J. Wysocki wrote:
> > > > > From: Oleksij Rempel
> > > > >
> > > > > This patch makes _SxD/_SxW check follow the ACPI 4.0a specification
> > > > > more closely and fixes suspend bug found on ASUS Zenbook UX31E.
> > > > >
> > > > > Some OEM use _SxD fileds do blacklist brocken Dx states.
> > > > > If _SxD/_SxW return values are check before suspend as appropriate,
> > > > > some nasty suspend/resume issues may be avoided.
> > > > >
> > > > > References: https://bugzilla.kernel.org/show_bug.cgi?id=42728
> > > > > Signed-off-by: Oleksij Rempel
> > > > > Signed-off-by: Rafael J. Wysocki
> > > > > ---
> > > > >
> > > > > Bjorn, Len,
> > > > >
> > > > > This is -stable material and therefore v3.4 as well, IMO. Please let me
> > > > > know if one of you can take it or whether you want me to handle it all the
> > > > > way to Linus.
> > > >
> > > > What ever hapened to this patch?
> > >
> > > It was split and partially merged. As far as the other part is concerned,
> > > the jury is still out.
> >
> > Ok, I'll wait for someone to ask for it to be added to the stable tree
> > before I do anything.
>
> In fact, it should have been marked as -stable material, so please add:
>
> commit dbe9a2edd17d843d80faf2b99f20a691c1853418
> Author: Rafael J. Wysocki
> Date: Tue May 29 21:21:07 2012 +0200
>
> ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification

Queued up for 3.2.y.

Ben.

-- 
Ben Hutchings
The generation of random numbers is too important to be left to chance.
- Robert Coveyou
Re: perf with precise attribute kills all KVM based VMs
On Wed, Jul 11, 2012 at 10:11:57PM -0600, David Ahern wrote: > On 7/11/12 3:53 AM, Gleb Natapov wrote: > >On Wed, Jul 11, 2012 at 11:49:47AM +0200, Peter Zijlstra wrote: > >>On Wed, 2012-07-11 at 10:10 +0300, Gleb Natapov wrote: > >> > >>>Looks like Avi is right about the overshoot. Can you test something like > >>>this? > >>> > >>>diff --git a/arch/x86/kernel/cpu/perf_event_intel.c > >>>b/arch/x86/kernel/cpu/perf_event_intel.c > >>>index 166546e..5fb371a 100644 > >>>--- a/arch/x86/kernel/cpu/perf_event_intel.c > >>>+++ b/arch/x86/kernel/cpu/perf_event_intel.c > >>>@@ -1374,8 +1374,11 @@ static struct perf_guest_switch_msr > >>>*intel_guest_get_msrs(int *nr) > >>> arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL; > >>> arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask; > >>> arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask; > >>>+ arr[1].msr = MSR_IA32_PEBS_ENABLE; > >>>+ arr[1].host = cpuc->pebs_enabled; > >>>+ arr[1].guest = 0; > >>>+ *nr = 2; > >>> > >>>- *nr = 1; > >>> return arr; > >>> } > >> > > So far the 64-bit Fedora 10 VM with both a Fedora 10 stock kernel > and a 2.6.38 kernel have not faired well - and that's the only VM I > have tried at the moment. Using -e cycles:pp I have been able to > lock up the VM 3 times out of 3 series of tests with perf-kvm that > includes network traffic (e.g., netperf), disk I/O (dd based to > create a file with dsync flag) and pure userspace cpu bound (openssl > speed). May or may not be related. > OK that's may be BTSes. What about -e cycles:p? BTW are you using your patch to set exclude_guest parameter? If not use -e cycles:Hp. > Also, I noted that 'perf kvm --guest record -e cycles:pp' does not > generate a whole lot of samples -- like < 100 in a 20-second sample > -- despite the fact that the guest is rather busy. > Host events do not suppose to generate events while guest is running. > I won't have much time over the next few days to run much in the way > of tests; I'll come back to it Sunday night. 
> > David > > >> > >>You also need to clear TR, BTS, BTINT from MSR_IA32_DEBUGCTLMSR and > >>ideally you'd also clear MSR_IA32_DS_AREA so that any write will be a > >>proper NULL deref or such. > >Yes. With the patch above :pp modifier does not crash guest for me, but > >in theory it should since BTS are still written to DS. May be BTS writes do > >not overshoot guest entry. Will have to ask Intel for clarification. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
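For readers following the proposed change, a standalone sketch of how the host/guest MSR switch list is filled with the PEBS entry added. The struct layouts are simplified stand-ins for the kernel's, the function name is made up, and the MSR numbers are assumed from the Intel SDM; this only illustrates the idea and says nothing about whether the BTS/DS concerns raised in the thread are covered.

```c
#include <stdint.h>

/* MSR numbers assumed from the Intel SDM. */
#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f
#define MSR_IA32_PEBS_ENABLE      0x3f1

/* Simplified stand-in for struct perf_guest_switch_msr. */
struct guest_switch_msr {
	unsigned int msr;
	uint64_t host;
	uint64_t guest;
};

/* Hypothetical snapshot of the PMU state the kernel code reads. */
struct pmu_state {
	uint64_t intel_ctrl;
	uint64_t intel_ctrl_guest_mask;
	uint64_t intel_ctrl_host_mask;
	uint64_t pebs_enabled;
};

/* Fills two entries and returns the count, like *nr = 2 in the patch. */
int fill_guest_switch_msrs(const struct pmu_state *s,
			   struct guest_switch_msr arr[2])
{
	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
	arr[0].host = s->intel_ctrl & ~s->intel_ctrl_guest_mask;
	arr[0].guest = s->intel_ctrl & ~s->intel_ctrl_host_mask;

	/* PEBS stays on for the host but is forced off while in the guest. */
	arr[1].msr = MSR_IA32_PEBS_ENABLE;
	arr[1].host = s->pebs_enabled;
	arr[1].guest = 0;

	return 2;
}
```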
[GIT PULL] sh fixes for 3.5-rc7
The following changes since commit ca24a145573124732152daff105ba68cc9a2b545:

  Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm (2012-07-01 11:02:25 -0700)

are available in the git repository at:

  git://github.com/pmundt/linux-sh.git tags/sh-for-linus

for you to fetch changes up to 44033109e99cf584d6285226ed521098f5ef7250:

  SH: Convert out[bwl] macros to inline functions (2012-07-12 13:12:13 +0900)

SuperH fixes for 3.5-rc7

Corey Minyard (1):
      SH: Convert out[bwl] macros to inline functions

Paul Mundt (1):
      sh: Fix up se7721 GPIOLIB=y build warnings.

 arch/sh/include/asm/io_noioport.h      | 17 ++---
 arch/sh/kernel/cpu/sh3/serial-sh7720.c |  2 +-
 2 files changed, 15 insertions(+), 4 deletions(-)
Re: [PATCH] SH: Convert out[bwl] macros to inline functions
On Mon, Jul 09, 2012 at 03:35:20PM -0500, miny...@acm.org wrote:
> From: Corey Minyard
>
> The macros just called BUG(), but that results in unused variable
> warnings all over the place, like in the IPMI driver. The build
> regression emails were annoying me, so here's the fix. I have
> not even compile tested this, but it's rather obvious.
>
> Signed-off-by: Corey Minyard

Builds fine, I've switched the port type to unsigned long, but it looks
good otherwise. Will queue it up for -rc7, thanks.
Re: Deadlocks due to per-process plugging
On Thu, 2012-07-12 at 00:12 +0200, Thomas Gleixner wrote:
> On Wed, 11 Jul 2012, Jan Kara wrote:
> > On Wed 11-07-12 12:05:51, Jeff Moyer wrote:
> > > This eventually ends in a call to blk_run_queue_async(q) after
> > > submitting the I/O from the plug list. Right? So is the question
> > > really why doesn't the kblockd workqueue get scheduled?
> > Ah, I didn't know this. Thanks for the hint. So in the kdump I have I can
> > see requests queued in tsk->plug despite the process is sleeping in
> > TASK_UNINTERRUPTIBLE state. So the only way how unplug could have been
> > omitted is if tsk_is_pi_blocked() was true. Rummaging through the dump...
> > indeed task has pi_blocked_on = 0x8802717d79c8. The dump is from an -rt
> > kernel (I just didn't originally thought that makes any difference) so
> > actually any mutex is rtmutex and thus tsk_is_pi_blocked() is true whenever
> > we are sleeping on a mutex. So this seems like a bug in rtmutex code.
> > Thomas, you seemed to have added that condition... Any idea how to avoid
> > the deadlock?
>
> Mike has sent out a fix related to the plug stuff, which I just posted
> for the rt stable series. Can you verify against that ?

btw, I called io_schedule() instead of a plain unplug thinking we're
going to schedule anyway, but if we unplug and schedule, and we're not
leftmost (non-rt task 'course), while we're away, the likely contended
mutex we're about to take may be released or at least become less
contended. What we won't be doing is accruing sleep time to help
trigger yet more preemption. Anyone more deserving can move smartly
rightward, and thus out of our way for a bit. If we're leftmost or rt,
all was for naught, but it seemed worth a shot.

-Mike
Re: perf with precise attribute kills all KVM based VMs
On 7/11/12 3:53 AM, Gleb Natapov wrote: On Wed, Jul 11, 2012 at 11:49:47AM +0200, Peter Zijlstra wrote: On Wed, 2012-07-11 at 10:10 +0300, Gleb Natapov wrote: Looks like Avi is right about the overshoot. Can you test something like this? diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index 166546e..5fb371a 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -1374,8 +1374,11 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr) arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL; arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask; arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask; + arr[1].msr = MSR_IA32_PEBS_ENABLE; + arr[1].host = cpuc->pebs_enabled; + arr[1].guest = 0; + *nr = 2; - *nr = 1; return arr; } So far the 64-bit Fedora 10 VM with both a Fedora 10 stock kernel and a 2.6.38 kernel have not faired well - and that's the only VM I have tried at the moment. Using -e cycles:pp I have been able to lock up the VM 3 times out of 3 series of tests with perf-kvm that includes network traffic (e.g., netperf), disk I/O (dd based to create a file with dsync flag) and pure userspace cpu bound (openssl speed). May or may not be related. Also, I noted that 'perf kvm --guest record -e cycles:pp' does not generate a whole lot of samples -- like < 100 in a 20-second sample -- despite the fact that the guest is rather busy. I won't have much time over the next few days to run much in the way of tests; I'll come back to it Sunday night. David You also need to clear TR, BTS, BTINT from MSR_IA32_DEBUGCTLMSR and ideally you'd also clear MSR_IA32_DS_AREA so that any write will be a proper NULL deref or such. Yes. With the patch above :pp modifier does not crash guest for me, but in theory it should since BTS are still written to DS. May be BTS writes do not overshoot guest entry. Will have to ask Intel for clarification. 
Re: [PATCH 3/4] dma: sh: provide a migration path for slave drivers to stop using .private
On Thu, Jul 12, 2012 at 06:55:32AM +0900, Magnus Damm wrote: > Hi Guennadi, > > [CC Paul] > > On Thu, Jul 5, 2012 at 1:17 AM, Guennadi Liakhovetski > wrote: > > This patch extends the sh dmaengine driver to support the preferred channel > > selection and configuration method, instead of using the "private" field > > from struct dma_chan. We add a standard filter function to be used by > > slave drivers instead of implementing their own ones, and add support for > > the DMA_SLAVE_CONFIG control operation, which must accompany the new > > channel selection method. We still support the legacy .private channel > > allocation method to cater for a smooth driver migration. > > > > Signed-off-by: Guennadi Liakhovetski > > --- > > Thanks for your efforts on this. Something that caught my eye in this > patch is this portion: > > +bool shdma_chan_filter(struct dma_chan *chan, void *arg); > > If we would use this function in our DMA Engine slave drivers (MMCIF, > SDHI, SCIF, FSI, SIU and so on) then wouldn't we add a strict > dependency on this symbol provided by this particular DMA Engine > driver implementation for the DMAC hardware (that your patch > modifies)? > > And what do we do if we want to use the same DMA Engine slave driver > with a different DMA Engine driver implementation? > > From my point of view, there must be some better way to not have such > tight dependencies between the DMA Engine slave consumer and the DMA > Engine driver. Not sure what that looks like though. This symbol > dependency is pretty far from great IMO. > I vaguely recall this coming up before, and it wasn't acceptable then either. We will by no means be adding driver-specific hooks in to other drivers that really couldn't care less what the underlying DMA engine driving them is. We already have CPUs with different DMA engines that can be used by the same drivers. As I said the last time, this needs to be fixed in the dmaengine framework, period. 
linux-next: manual merge of the kvm-ppc tree with the powerpc tree
Hi Alexander,

Today's linux-next merge of the kvm-ppc tree got a conflict in
arch/powerpc/kvm/booke_interrupts.S between commit c75df6f96c59
("powerpc: Fix usage of register macros getting ready for %r0 change")
from the powerpc tree and commit fc372c0843b8 ("booke: Added crit/mc
exception handler for e500v2") from the kvm-ppc tree.

I fixed it up (see below - could do with checking) and can carry the fix
as necessary.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/powerpc/kvm/booke_interrupts.S
index 8fd4b2a,09456c4..000
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@@ -52,16 -53,21 +52,21 @@@
  (1<
Re: [PATCH] perf: fix perf-lock report coredump
On 7/10/12 11:14 PM, Jovi Zhang wrote:
>> Does this fix it for you:
>> http://lkml.org/lkml/2012/7/6/405
>
> Yeah, same problem.
> But the question is if there have some sample event with raw data in
> perf.data, are we still just exit(1)? or let perf-lock only report
> those sample events with raw data?

perf-lock uses 4 tracepoints. tracepoints add -R (raw data) to events.
So, if the perf.data does not have tracepoints, there is no need for
perf-lock info or report subcommands to proceed.

David
Re: [PATCH] irq_domain: Standardise legacy/linear domain selection
On Thu, 5 Jul 2012 12:19:19 +0100, Mark Brown wrote: > A large proportion of interrupt controllers that support legacy mappings > do so because non-DT systems need to use fixed IRQ numbers when registering > devices via buses but can otherwise use a linear mapping. The interrupt > controller itself typically is not affected by the mapping used and best > practice is to use a linear mapping where possible so drivers frequently > select at runtime depending on if a legacy range has been allocated to > them. > > Standardise this behaviour by providing irq_domain_register_simple() which > will allocate a linear mapping unless a positive first_irq is provided in > which case it will fall back to a legacy mapping. This helps make best > practice for irq_domain adoption clearer. > > Signed-off-by: Mark Brown Applied, thanks. g. > --- > Documentation/IRQ-domain.txt |5 + > include/linux/irqdomain.h|5 + > kernel/irq/irqdomain.c | 30 ++ > 3 files changed, 40 insertions(+) > > diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt > index 27dcaab..1401cec 100644 > --- a/Documentation/IRQ-domain.txt > +++ b/Documentation/IRQ-domain.txt > @@ -93,6 +93,7 @@ Linux IRQ number into the hardware. > Most drivers cannot use this mapping. > > Legacy > +irq_domain_add_simple() > irq_domain_add_legacy() > irq_domain_add_legacy_isa() > > @@ -115,3 +116,7 @@ The legacy map should only be used if fixed IRQ mappings > must be > supported. For example, ISA controllers would use the legacy map for > mapping Linux IRQs 0-15 so that existing ISA drivers get the correct IRQ > numbers. > + > +Most users of legacy mappings should use irq_domain_add_simple() which > +will use a legacy domain only if an IRQ range is supplied by the > +system and will otherwise use a linear domain mapping. 
> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h > index 5abb533e..17b60be 100644 > --- a/include/linux/irqdomain.h > +++ b/include/linux/irqdomain.h > @@ -112,6 +112,11 @@ struct irq_domain { > }; > > #ifdef CONFIG_IRQ_DOMAIN > +struct irq_domain *irq_domain_add_simple(struct device_node *of_node, > + unsigned int size, > + unsigned int first_irq, > + const struct irq_domain_ops *ops, > + void *host_data); > struct irq_domain *irq_domain_add_legacy(struct device_node *of_node, >unsigned int size, >unsigned int first_irq, > diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c > index d3968e9..0c51958 100644 > --- a/kernel/irq/irqdomain.c > +++ b/kernel/irq/irqdomain.c > @@ -140,6 +140,36 @@ static unsigned int irq_domain_legacy_revmap(struct > irq_domain *domain, > } > > /** > + * irq_domain_add_simple() - Allocate and register a simple irq_domain. > + * @of_node: pointer to interrupt controller's device tree node. > + * @size: total number of irqs in mapping > + * @first_irq: first number of irq block assigned to the domain > + * @ops: map/unmap domain callbacks > + * @host_data: Controller private data pointer > + * > + * Allocates a legacy irq_domain if irq_base is positive or a linear > + * domain otherwise. > + * > + * This is intended to implement the expected behaviour for most > + * interrupt controllers which is that a linear mapping should > + * normally be used unless the system requires a legacy mapping in > + * order to support supplying interrupt numbers during non-DT > + * registration of devices. 
> + */ > +struct irq_domain *irq_domain_add_simple(struct device_node *of_node, > + unsigned int size, > + unsigned int first_irq, > + const struct irq_domain_ops *ops, > + void *host_data) > +{ > + if (first_irq > 0) > + return irq_domain_add_legacy(of_node, size, first_irq, 0, > + ops, host_data); > + else > + return irq_domain_add_linear(of_node, size, ops, host_data); > +} > + > +/** > * irq_domain_add_legacy() - Allocate and register a legacy revmap > irq_domain. > * @of_node: pointer to interrupt controller's device tree node. > * @size: total number of irqs in legacy mapping > -- > 1.7.10 > -- Grant Likely, B.Sc, P.Eng. Secret Lab Technologies, Ltd. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
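The selection rule in irq_domain_add_simple() can be modelled in userspace. This is only a minimal sketch, not kernel code: `struct toy_domain`, `enum revmap_kind` and `toy_add_simple()` are hypothetical names introduced for illustration, and the only thing taken from the patch is the `first_irq > 0` test (the parameter that the kernel-doc text calls irq_base).

```c
#include <assert.h>

/* Hypothetical stand-ins for the two kinds of reverse map. */
enum revmap_kind { REVMAP_LINEAR, REVMAP_LEGACY };

struct toy_domain {
	enum revmap_kind kind;
	unsigned int size;      /* number of hwirqs covered */
	unsigned int first_irq; /* fixed Linux IRQ base; 0 when linear */
};

/*
 * Mirror the test in the patch: a positive first_irq means the platform
 * reserved a fixed block of Linux IRQ numbers, so a legacy mapping is
 * used; otherwise a linear mapping is chosen.
 */
static struct toy_domain toy_add_simple(unsigned int size,
					unsigned int first_irq)
{
	struct toy_domain d;

	assert(size > 0); /* an empty domain makes no sense in this model */
	d.size = size;
	if (first_irq > 0) {
		d.kind = REVMAP_LEGACY;
		d.first_irq = first_irq;
	} else {
		d.kind = REVMAP_LINEAR;
		d.first_irq = 0;
	}
	return d;
}
```

A DT-probed controller would pass 0 and get the linear case; a board file that has reserved a range (64 here is an arbitrary example) would get the legacy case.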
Re: [PATCH] devicetree: add helper inline for retrieving a node's full name
On Thu, 05 Jul 2012 11:07:31 -0500, Rob Herring wrote: > Grant, > > On 06/15/2012 12:50 PM, Grant Likely wrote: > > The pattern (np ? np->full_name : "") is rather common in the > > kernel, but can also make for quite long lines. This patch adds a new > > inline function, of_node_full_name() so that the test for a valid node > > pointer doesn't need to be open coded at all call sites. > > > > Signed-off-by: Grant Likely > > Cc: Paul Mundt > > Cc: Benjamin Herrenschmidt > > Cc: Thomas Gleixner > > --- > > I'm assuming you want me to apply this now, so I have. I've had this one in linux-next via irqdomain/next for a while now, and the next round of irqdomain patches are based on it. g. > > Rob > > > arch/microblaze/pci/pci-common.c |6 ++ > > arch/powerpc/kernel/pci-common.c |6 ++ > > arch/powerpc/kernel/vio.c |5 ++--- > > arch/powerpc/platforms/cell/iommu.c|3 +-- > > arch/powerpc/platforms/pseries/iommu.c |2 +- > > arch/sparc/kernel/of_device_64.c |2 +- > > drivers/of/base.c |2 +- > > drivers/of/irq.c |2 +- > > include/linux/of.h | 10 ++ > > kernel/irq/irqdomain.c |8 > > 10 files changed, 25 insertions(+), 21 deletions(-) > > > > diff --git a/arch/microblaze/pci/pci-common.c > > b/arch/microblaze/pci/pci-common.c > > index ed22bfc..ca8f6e7 100644 > > --- a/arch/microblaze/pci/pci-common.c > > +++ b/arch/microblaze/pci/pci-common.c > > @@ -249,8 +249,7 @@ int pci_read_irq_line(struct pci_dev *pci_dev) > > } else { > > pr_debug(" Got one, spec %d cells (0x%08x 0x%08x...) on %s\n", > > oirq.size, oirq.specifier[0], oirq.specifier[1], > > -oirq.controller ? oirq.controller->full_name : > > -""); > > +of_node_full_name(oirq.controller)); > > > > virq = irq_create_of_mapping(oirq.controller, oirq.specifier, > > oirq.size); > > @@ -1493,8 +1492,7 @@ static void __devinit pcibios_scan_phb(struct > > pci_controller *hose) > > struct pci_bus *bus; > > struct device_node *node = hose->dn; > > > > - pr_debug("PCI: Scanning PHB %s\n", > > -node ? 
node->full_name : ""); > > + pr_debug("PCI: Scanning PHB %s\n", of_node_full_name(node)); > > > > pcibios_setup_phb_resources(hose, ); > > > > diff --git a/arch/powerpc/kernel/pci-common.c > > b/arch/powerpc/kernel/pci-common.c > > index 8e78e93..886c254 100644 > > --- a/arch/powerpc/kernel/pci-common.c > > +++ b/arch/powerpc/kernel/pci-common.c > > @@ -248,8 +248,7 @@ static int pci_read_irq_line(struct pci_dev *pci_dev) > > } else { > > pr_debug(" Got one, spec %d cells (0x%08x 0x%08x...) on %s\n", > > oirq.size, oirq.specifier[0], oirq.specifier[1], > > -oirq.controller ? oirq.controller->full_name : > > -""); > > +of_node_full_name(oirq.controller)); > > > > virq = irq_create_of_mapping(oirq.controller, oirq.specifier, > > oirq.size); > > @@ -1628,8 +1627,7 @@ void __devinit pcibios_scan_phb(struct pci_controller > > *hose) > > struct device_node *node = hose->dn; > > int mode; > > > > - pr_debug("PCI: Scanning PHB %s\n", > > -node ? node->full_name : ""); > > + pr_debug("PCI: Scanning PHB %s\n", of_node_full_name(node)); > > > > /* Get some IO space for the new PHB */ > > pcibios_setup_phb_io_space(hose); > > diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c > > index cb87301..63f72ed 100644 > > --- a/arch/powerpc/kernel/vio.c > > +++ b/arch/powerpc/kernel/vio.c > > @@ -1296,8 +1296,7 @@ static void __devinit vio_dev_release(struct device > > *dev) > > struct iommu_table *tbl = get_iommu_table_base(dev); > > > > if (tbl) > > - iommu_free_table(tbl, dev->of_node ? > > - dev->of_node->full_name : dev_name(dev)); > > + iommu_free_table(tbl, of_node_full_name(dev->of_node)); > > of_node_put(dev->of_node); > > kfree(to_vio_dev(dev)); > > } > > @@ -1509,7 +1508,7 @@ static ssize_t devspec_show(struct device *dev, > > { > > struct device_node *of_node = dev->of_node; > > > > - return sprintf(buf, "%s\n", of_node ? 
of_node->full_name : "none"); > > + return sprintf(buf, "%s\n", of_node_full_name(of_node)); > > } > > > > static ssize_t modalias_show(struct device *dev, struct device_attribute > > *attr, > > diff --git a/arch/powerpc/platforms/cell/iommu.c > > b/arch/powerpc/platforms/cell/iommu.c > > index b9f509a..b673200 100644 > > --- a/arch/powerpc/platforms/cell/iommu.c > > +++ b/arch/powerpc/platforms/cell/iommu.c > > @@ -552,8 +552,7 @@ static struct iommu_table *cell_get_iommu_table(struct > > device *dev) > >
Re: [PATCH V3 5/6] Avoid duplicate probe for of platform devices
On Mon, 9 Jul 2012 07:58:31 -0700, Greg KH wrote:
> On Mon, Jul 09, 2012 at 03:46:59AM +, Li Yang-R58472 wrote:
> > > > I don't understand, why is this just showing up now? What changed to
> > > > cause this? Couldn't that be the real problem here?
> > > >
> > >
> > > The issue is showing up because we now probe devices twice.
> > > Previously, we just probe devices once. But now we changed the way of pci
> > > init which makes pci controllers should be probed earlier than other
> > > devices.
> > > So we have to probe pci nodes separately. Probe more than once is the root
> > > cause of this issue.
> > >
> > > The pci patchset I mentioned please refer to:
> > > http://patchwork.ozlabs.org/patch/163742/
> >
> > Let me try to clarify a little bit. The of platform bus normally
> > traverse the device tree to add all the devices. The change which
> > caused problem is that we need to probe PCIe RC devices at a earlier
> > stage of initialization.
>
> That sounds, wrong.

Yes, really really wrong; starting with terminology...

> > So we added these PCIe RC devices earlier than the normal device tree
> > traversal process. These PCIe RC devices will be scanned again during
> > the normal traversal and cause duplicated devices being added. Our
> > proposal is to deal with duplicated devices automatically and make it
> > possible to scan the device tree multiple times for devices to be
> > added.

... This isn't *probing* twice; it is *registration*. That's causing
confusion on this thread.

> Then you need to put something in your own tree scanning logic to not
> try to register devices multiple times. How about a simple flag in your
> device structure instead of having to muck around in the driver core
> internals?

Right. If you're going to create the pci bus devices early, then you
need to explicitly inhibit creation of them later... but still; why do
the PCI bus devices need to be registered separately from the rest of
the devices on the simple-bus? Why not just move *all* device
registration earlier?

> Although one should seriously question the need to want to rescan the bus
> and register devices at different times of the boot process...

Yes; the model they're trying to use sounds wrong.

g.
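The "simple flag" suggestion can be sketched as follows. This is a userspace model under stated assumptions, not the driver core or the OF platform code: `struct toy_node`, its `populated` field, `toy_register_once()` and `toy_scan()` are all hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device-tree node with a "device already created" flag. */
struct toy_node {
	const char *name;
	bool populated; /* set once a device has been created for this node */
};

static int toy_registered; /* counts devices actually created */

/* Returns true if a device was created, false if the node was skipped. */
static bool toy_register_once(struct toy_node *node)
{
	assert(node != NULL);
	if (node->populated)
		return false;
	node->populated = true;
	toy_registered++;
	return true;
}

/* Walk an array of nodes; safe to call more than once. */
static void toy_scan(struct toy_node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_register_once(&nodes[i]);
}
```

With this shape, an early pass can register only the PCIe root complex nodes, and the later full traversal skips them because the flag is already set. It does not answer the question of whether registering in two passes is the right model in the first place.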
Re: [PATCH 1/2] irqdomain: Fix up linear revmap for non-zero hwirq displacement.
On Thu, 21 Jun 2012 10:19:40 +0900, Paul Mundt wrote: > On Sat, Jun 16, 2012 at 08:14:19AM +0900, Paul Mundt wrote: > > On Fri, Jun 15, 2012 at 12:25:40PM -0600, Grant Likely wrote: > > > On Wed, 13 Jun 2012 16:34:00 +0900, Paul Mundt > > > wrote: > > > > Presently the linear revmap code assumes that all hwirqs start at 0, > > > > using the hwirq directly as an index value for the lookup. In the case > > > > of > > > > legacy revmaps this isn't necessarily the case, as the first_hwirq value > > > > passed in can be non-zero, causing those types of users to silently have > > > > their IRQs placed in the radix tree instead. > > > > > > > > With this change, hwirq displacement is factored in at association time > > > > directly. This also makes it possible for non-legacy users to use linear > > > > revmaps regardless of hwirq base position. This could potentially lead > > > > to > > > > a bug if there's an attempt to associate multiple times in to the linear > > > > map in a nonsensical and non-linear order, but at that point being > > > > silently punted to the radix tree is likely to be the least of your > > > > concerns (in such a case it's fairly trivial to simply extend > > > > irq_domain_add_linear() to take a hwirq base and move the linear base > > > > assignment there). > > > > > > I actually hoped to be rid of the whole hwirq start offset thing. > > > Doing without it simplifies the code, is slightly faster. I suspect > > > very few controllers actually need it, and for those that do I'm > > > hoping the wasted space is in the order of 0-32 words. > > > > > > Instead of this, can we change the affected controllers to use the > > > maximum hwirq number when setting the size of the linear map? > > > > > > Do you have hardware where the first hwirq is a >32 number? > > > > > Yes. On the CPU I was just working on I have two linear ranges and a > > tree, one of the linear ranges begins at hwirq 56. On other CPUs we have > > linear ranges that begin at 64, 72, etc. 
> > most of which are fairly low in
> > the space. On newer parts on the ARM side there are also controllers with
> > ranges that begin > 400.
> >
> > I don't particularly care for the linear_start hack myself either, but I
> > couldn't think of any cleaner approach for it. The simplest might be if
> > we can just bury these details in a domain-specific canonicalization op
> > (distinctly different from xlate), and plug it in for the few cases that
> > need a non-zero hwirq base. I don't mind hacking that up if you're more
> > agreeable to that approach.
>
> Ping?
>
> We can't do away with the first_irq thing in the legacy->linear merge
> without at least having a strategy for getting existing users off of it.
> Requiring the linear revmap to always begin at 0 seems like a significant
> regression in functionality for marginal performance gain, so if you're
> not willing to have the linear_start factored in we do need some other
> alternative. I've proposed several, if you don't like any of those you
> are welcome to propose an alternative. I don't mind doing the work one
> way or the other, but I do mind losing the functionality.

From another perspective, even if irqs do start at 400, that is wasted
space of 1600 bytes. Less than half a page. It still isn't a huge
amount. The choice to use a linear vs. radix map is a choice between
speed and sparse flexibility. Considering that one of Ben's concerns is
preserving the fastest lookup path possible, I greatly prefer the
simplicity of a single offset between hwirq and irq. If you want to
avoid it in your driver, I won't object, but it seems to me a case of
premature optimization.

However, I do agree that allocating 400 unused irq_descs would be a
problem, but the patches don't work that way. irq_domain_add_legacy()
only calls irq_domain_associate_many() on the requested range of hwirqs
by using the value of hwirq_base passed in.

> Punishing legacy users with leading gaps in their revmap is likewise
> undesirable, especially as that most of the in-tree users (regmap-irq
> especially) of this functionality are using this legacy behaviour without
> incident.

Hardly punishment. It is a different optimization decision. The vast
majority of _add_legacy calls use 0 for hwirq_base, and the ones that
don't use a very small number.

...ummm, are we talking about the same things? You mention regmap-irq
specifically, but regmap-irq also uses 0 for hwirq_start, so there is no
leading gap.

g.

--
Grant Likely, B.Sc, P.Eng.
Secret Lab Technologies, Ltd.
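The 1600-byte figure follows from 400 unused leading slots of 4 bytes each. A minimal userspace sketch of that trade-off is below; it assumes a 4-byte `unsigned int`, and `toy_revmap`, `TOY_FIRST_HWIRQ`, `TOY_NR_HWIRQS` and the helper functions are hypothetical names, not the irqdomain implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FIRST_HWIRQ 400u /* example base taken from the thread */
#define TOY_NR_HWIRQS    32u /* hypothetical number of real interrupts */

/* Linear map indexed directly by hwirq: slots 0..399 are never used. */
static unsigned int toy_revmap[TOY_FIRST_HWIRQ + TOY_NR_HWIRQS];

/* Bytes spent on the unused leading slots. */
static size_t toy_wasted_bytes(void)
{
	return TOY_FIRST_HWIRQ * sizeof(toy_revmap[0]);
}

/* Record that hwirq maps to Linux irq number virq. */
static void toy_associate(unsigned int hwirq, unsigned int virq)
{
	assert(hwirq >= TOY_FIRST_HWIRQ &&
	       hwirq < TOY_FIRST_HWIRQ + TOY_NR_HWIRQS);
	toy_revmap[hwirq] = virq;
}

/* Lookup is one array index with no offset arithmetic; 0 means unmapped. */
static unsigned int toy_find_mapping(unsigned int hwirq)
{
	if (hwirq >= TOY_FIRST_HWIRQ + TOY_NR_HWIRQS)
		return 0;
	return toy_revmap[hwirq];
}
```

The alternative argued for earlier in the thread would size the array at TOY_NR_HWIRQS and subtract the base on every lookup: it saves the leading slots at the cost of one subtraction and one extra bounds check per lookup.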
Re: [RFC PATCH 2/2] irq: add irq_desc_initialize to remove some duplicated lines
Hi Thomas,

Thanks for the review.

On Thu, Jul 12, 2012 at 06:19:18AM +0800, Thomas Gleixner wrote:
> On Wed, 20 Jun 2012, Dong Aisheng wrote:
> > From: Dong Aisheng
> >
> > There're two copies of irq_desc initialization code, reform them into
> > an irq_desc_initialize function to call.
> >
> > Signed-off-by: Dong Aisheng
> > ---
> >  kernel/irq/irqdesc.c |   51 +++--
> >  1 files changed, 28 insertions(+), 23 deletions(-)
>
> We add more code by removing redundant copies?
>
I had the same question. After looking at the code a bit more, I think
the main reason is that the duplicated code is small, so factoring it
out cannot save much.

Compared to the irq_desc init code in the original alloc_desc(), the new
irq_desc_initialize() saves 4 lines. However, the new function also adds
4 lines for its name, parameters and so on, and alloc_desc() still has
to call irq_desc_initialize() and check the return value, which needs 6
more lines. The main saving is the other copy of the irq_desc
initialization in early_irq_init(), but that copy does not check any
return values, so only about 4 lines are saved there. With the extra
blank lines added, the patch ends up adding more lines than it removes.

However, I thought it could make the code look a bit better, so I sent
out this RFC patch. Do you think it's reasonable?

BTW, there's an issue in my patch; it should be changed like this:

	if (alloc_masks(desc, GFP_KERNEL, node)) {
-		kfree(desc->kstat_irqs);
+		free_percpu(desc->kstat_irqs);
		return -ENOMEM;
	}

Regards
Dong Aisheng
Re: [RFC PATCH 1/2] irq_domain: correct a minor wrong comment for linear revmap
On Wed, 20 Jun 2012 17:00:30 +0800, Dong Aisheng wrote:
> From: Dong Aisheng
>
> The revmap type should be linear for irq_domain_add_linear function.
>
> Signed-off-by: Dong Aisheng

Applied, thanks.

g.

> ---
>  kernel/irq/irqdomain.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 3e4ea85..cb83554 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -219,7 +219,7 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>  EXPORT_SYMBOL_GPL(irq_domain_add_legacy);
>
>  /**
> - * irq_domain_add_linear() - Allocate and register a legacy revmap irq_domain.
> + * irq_domain_add_linear() - Allocate and register a linear revmap irq_domain.
>   * @of_node: pointer to interrupt controller's device tree node.
>   * @size: Number of interrupts in the domain.
>   * @ops: map/unmap domain callbacks
> --
> 1.7.0.4

--
Grant Likely, B.Sc, P.Eng.
Secret Lab Technologies, Ltd.
Re: [PATCH 1/3] tmpfs: revert SEEK_DATA and SEEK_HOLE
On 07/12/2012 07:01 AM, Dave Chinner wrote: > On Wed, Jul 11, 2012 at 11:55:34AM -0700, Hugh Dickins wrote: >> On Wed, 11 Jul 2012, Cong Wang wrote: >>> On Mon, 09 Jul 2012 at 22:41 GMT, Hugh Dickins wrote: Revert 4fb5ef089b28 ("tmpfs: support SEEK_DATA and SEEK_HOLE"). I believe it's correct, and it's been nice to have from rc1 to rc6; but as the original commit said: I don't know who actually uses SEEK_DATA or SEEK_HOLE, and whether it would be of any use to them on tmpfs. This code adds 92 lines and 752 bytes on x86_64 - is that bloat or worthwhile? >>> >>> >>> I don't think 752 bytes matter much, especially for x86_64. >>> Nobody asked for it, so I conclude that it's bloat: let's revert tmpfs to the dumb generic support for v3.5. We can always reinstate it later if useful, and anyone needing it in a hurry can just get it out of git. >>> >>> If you don't have burden to maintain it, I'd prefer to leave as it is, >>> I don't think 752-bytes is the reason we revert it. >> >> Thank you, your vote has been counted ;) >> and I'll be glad if yours stimulates some agreement or disagreement. >> >> But your vote would count for a lot more if you know of some app which >> would really benefit from this functionality in tmpfs: I've heard of none. > > So what? I've heard of no apps that use this functionality on XFS, > either, but I have heard of a lot of people asking for it to be > implemented over the past couple of years so they can use it. > There's been patches written to make coreutils (cp) make use of it > instead of parsing FIEMAP output to find holes, though I don't know > if that's gone beyond more than "here's some patches"... Yes, for apps, cp(1) will make use of it to replace the old FIEMAP for efficient sparse file copy. 
I implemented an extent-scan module for coreutils a few years ago:
http://fossies.org/dox/coreutils-8.17/extent-scan_8c_source.html
It scans extents through FIEMAP; however, SEEK_DATA/SEEK_HOLE is more
convenient and easier to use as a call interface, so FIEMAP will be
replaced by SEEK_XXX once it is supported by EXT4.

Moreover, I previously discussed this with Jim, the coreutils
maintainer. He would like to move the extent-scan module to Gnulib so
that other GNU utilities that rely on Gnulib can become potential users
of it; at the very least, GNU tar will definitely need it for sparse
file backup.

>
> Besides, given that you can punch holes in tmpfs files, it seems
> strange to then say "we don't need a method of skipping holes to
> find data quickly"

So this feature deserves to stay working on tmpfs, considering hole
punch. :)

Thanks,
-Jeff

>
> Besides, seek-hole/data is still shiny new and lots of developers
> aren't even aware of its presence in recent kernels. Removing new
> functionality saying "no-one is using it" is like smashing the egg
> before the chicken hatches (or is it cutting off the chicken's head
> before it lays the egg?).
>
> Cheers,
>
> Dave.
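To show why the lseek() interface is considered easier than parsing FIEMAP output, here is a small sketch of walking a file's data extents with SEEK_DATA and SEEK_HOLE. It assumes Linux with a kernel and C library that support these whence values; `count_data_bytes()` and `make_data_file()` are hypothetical helper names, and this is not the coreutils extent-scan code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux values, used only if the headers did not expose the constants. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Sum the sizes of all data extents in fd by alternating SEEK_DATA and
 * SEEK_HOLE. Returns -1 on an unexpected lseek() failure.
 */
static off_t count_data_bytes(int fd)
{
	off_t total = 0;
	off_t pos = 0;

	assert(fd >= 0);
	for (;;) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			/* ENXIO: no more data at or after pos. */
			return errno == ENXIO ? total : -1;
		}
		/* There is always an implicit hole at end of file. */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;
		total += hole - data;
		pos = hole;
	}
}

/* Create an unlinked temporary file holding len bytes of plain data. */
static int make_data_file(size_t len)
{
	char path[] = "/tmp/seek-demo-XXXXXX";
	char buf[512];
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path); /* the file disappears once fd is closed */
	memset(buf, 'x', sizeof(buf));
	while (len > 0) {
		size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t n = write(fd, buf, chunk);
		if (n <= 0) {
			close(fd);
			return -1;
		}
		len -= (size_t)n;
	}
	return fd;
}
```

On a filesystem with only the generic fallback (the "dumb generic support" mentioned in the revert), the whole file is reported as one data extent, so the loop still terminates correctly; it just cannot skip holes.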
Re: [PATCH v4 1/5] thermal: Add generic cpufreq cooling implementation
On Tue, Jul 10, 2012 at 12:31 PM, Hongbo Zhang wrote: > > > On 12 May 2012 17:40, Amit Daniel Kachhap wrote: >> >> This patch adds support for generic cpu thermal cooling low level >> implementations using frequency scaling up/down based on the registration >> parameters. Different cpu related cooling devices can be registered by the >> user and the binding of these cooling devices to the corresponding >> trip points can be easily done as the registration APIs return the >> cooling device pointer. The user of these APIs are responsible for >> passing clipping frequency . The drivers can also register to recieve >> notification about any cooling action called. >> >> Signed-off-by: Amit Daniel Kachhap >> --- >> Documentation/thermal/cpu-cooling-api.txt | 60 >> drivers/thermal/Kconfig | 11 + >> drivers/thermal/Makefile |3 +- >> drivers/thermal/cpu_cooling.c | 483 >> + >> include/linux/cpu_cooling.h | 99 ++ >> 5 files changed, 655 insertions(+), 1 deletions(-) >> create mode 100644 Documentation/thermal/cpu-cooling-api.txt >> create mode 100644 drivers/thermal/cpu_cooling.c >> create mode 100644 include/linux/cpu_cooling.h >> >> diff --git a/Documentation/thermal/cpu-cooling-api.txt >> b/Documentation/thermal/cpu-cooling-api.txt >> new file mode 100644 >> index 000..557adb8 >> --- /dev/null >> +++ b/Documentation/thermal/cpu-cooling-api.txt >> @@ -0,0 +1,60 @@ >> +CPU cooling APIs How To >> +=== >> + >> +Written by Amit Daniel Kachhap >> + >> +Updated: 12 May 2012 >> + >> +Copyright (c) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com) >> + >> +0. Introduction >> + >> +The generic cpu cooling(freq clipping, cpuhotplug etc) provides >> +registration/unregistration APIs to the caller. The binding of the >> cooling >> +devices to the trip point is left for the user. The registration APIs >> returns >> +the cooling device pointer. >> + >> +1. 
cpu cooling APIs >> + >> +1.1 cpufreq registration/unregistration APIs >> +1.1.1 struct thermal_cooling_device *cpufreq_cooling_register( >> + struct freq_clip_table *tab_ptr, unsigned int tab_size) >> + >> +This interface function registers the cpufreq cooling device with the >> name >> +"thermal-cpufreq-%x". This api can support multiple instances of >> cpufreq >> +cooling devices. >> + >> +tab_ptr: The table containing the maximum value of frequency to be >> clipped >> +for each cooling state. >> + .freq_clip_max: Value of frequency to be clipped for each allowed >> +cpus. >> + .temp_level: Temperature level at which the frequency clamping >> will >> + happen. >> + .mask_val: cpumask of the allowed cpu's >> +tab_size: the total number of cpufreq cooling states. >> + >> +1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device >> *cdev) >> + >> +This interface function unregisters the "thermal-cpufreq-%x" cooling >> device. >> + >> +cdev: Cooling device pointer which has to be unregistered. >> + >> + >> +1.2 CPU cooling action notifier register/unregister interface >> +1.2.1 int cputherm_register_notifier(struct notifier_block *nb, >> + unsigned int list) >> + >> +This interface registers a driver with cpu cooling layer. The driver >> will >> +be notified when any cpu cooling action is called. >> + >> +nb: notifier function to register >> +list: CPUFREQ_COOLING_START or CPUFREQ_COOLING_STOP >> + >> +1.2.2 int cputherm_unregister_notifier(struct notifier_block *nb, >> + unsigned int list) >> + >> +This interface registers a driver with cpu cooling layer. The driver >> will >> +be notified when any cpu cooling action is called. 
>> + >> +nb: notifier function to register >> +list: CPUFREQ_COOLING_START or CPUFREQ_COOLING_STOP >> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig >> index 514a691..d9c529f 100644 >> --- a/drivers/thermal/Kconfig >> +++ b/drivers/thermal/Kconfig >> @@ -19,6 +19,17 @@ config THERMAL_HWMON >> depends on HWMON=y || HWMON=THERMAL >> default y >> >> +config CPU_THERMAL >> + bool "generic cpu cooling support" >> + depends on THERMAL && CPU_FREQ >> + help >> + This implements the generic cpu cooling mechanism through >> frequency >> + reduction, cpu hotplug and any other ways of reducing >> temperature. An >> + ACPI version of this already >> exists(drivers/acpi/processor_thermal.c). >> + This will be useful for platforms using the generic thermal >> interface >> + and not the ACPI interface. >> + If you want this support, you should say Y or M here. >> + >> config SPEAR_THERMAL >> bool "SPEAr thermal sensor driver" >> depends on THERMAL >> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile >> index a9fff0b..30c456c 100644 >> ---
Re: [PATCH v2 -mm] memcg: prevent from OOM with too many dirty pages
On Wed, 11 Jul 2012, Andrew Morton wrote: > On Wed, 11 Jul 2012 18:57:43 -0700 (PDT) Hugh Dickins > wrote: > > > --- 3.5-rc6-mm1/mm/vmscan.c 2012-07-11 14:42:13.668335884 -0700 > > +++ linux/mm/vmscan.c 2012-07-11 16:01:20.712814127 -0700 > > @@ -726,7 +726,8 @@ static unsigned long shrink_page_list(st > > * writeback from reclaim and there is nothing else to > > * reclaim. > > */ > > - if (!global_reclaim(sc) && PageReclaim(page)) > > + if (!global_reclaim(sc) && PageReclaim(page) && > > + may_enter_fs) > > wait_on_page_writeback(page); > > else { > > nr_writeback++; > > um, that may_enter_fs test got removed because nobody knew why it was > there. Nobody knew why it was there because it was undocumented. Do > you see where I'm going with this? I was hoping you might do that bit ;) Here's my display of ignorance: --- 3.5-rc6-mm1/mm/vmscan.c 2012-07-11 14:42:13.668335884 -0700 +++ linux/mm/vmscan.c 2012-07-11 20:09:33.182829986 -0700 @@ -725,8 +725,15 @@ static unsigned long shrink_page_list(st * could easily OOM just because too many pages are in * writeback from reclaim and there is nothing else to * reclaim. +* +* Check may_enter_fs, certainly because a loop driver +* thread might enter reclaim, and deadlock if it waits +* on a page for which it is needed to do the write +* (loop masks off __GFP_IO|__GFP_FS for this reason); +* but more thought would probably show more reasons. */ - if (!global_reclaim(sc) && PageReclaim(page)) + if (!global_reclaim(sc) && PageReclaim(page) && + may_enter_fs) wait_on_page_writeback(page); else { nr_writeback++; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 82571EB: Detected Hardware Unit Hang
On 07/12/12 11:07, Dave, Tushar N wrote: >> -Original Message- >> From: Joe Jin [mailto:joe@oracle.com] >> Sent: Wednesday, July 11, 2012 7:58 PM >> To: Dave, Tushar N >> Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux- >> ker...@vger.kernel.org >> Subject: Re: 82571EB: Detected Hardware Unit Hang >> >> On 07/12/12 10:52, Dave, Tushar N wrote: >>> What is the exact error messages in BIOS log? >> >> Error message from BIOS event log: >> 07/12/12 05:54:00 >>PCI Express Non-Fatal Error >> >> Thanks, >> Joe > > Thanks. Well, I will check with team tomorrow if this (max payload size) > can be treated as solution to this issue. > We can know more about what exact non-fatal error occurred if we capture bus > trace. > We should check the eeprom on this device to make sure they are up-to-date. > Send me the full eeprom dump in a file and I will confirm with team that it > is up-to-date. > Thanks for your work. > Hi Tushar, Please find eeprom from attachment. Thanks a lot of your help, Joe <>Offset Values -- -- 0x 00 15 17 b9 77 9c 24 05 ff ff a2 50 ff ff ff ff 0x0010 01 d9 04 97 2f 24 bc 11 8e 10 bc 10 86 80 65 b1 0x0020 08 00 bc 10 00 58 00 00 01 50 00 00 00 00 00 01 0x0030 f6 6c b0 37 a6 07 03 84 83 07 00 00 03 c3 02 06 0x0040 08 00 f0 1e 64 21 40 00 01 48 00 00 00 00 00 00 0x0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0060 00 01 00 40 2a 12 07 40 00 01 00 40 ff ff ff ff 0x0070 ff ff ff ff ff ff ff ff ff ff 97 01 ff ff 4b e8 0x0080 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0090 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x00f0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0100 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0110 ff ff ff ff ff ff ff ff ff ff ff 
ff ff ff ff ff 0x0120 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0130 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0140 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0150 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0160 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0170 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0180 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x0190 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01a0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01b0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01c0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01d0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0x01e0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 6f 0x01f0 87 04 00 00 00 00 00 00 00 00 00 00 ff ff ff 16 0x0200 03 00 22 00 00 00 07 30 00 e5 49 00 df 61 15 34 0x0210 00 36 81 2f 04 50 00 3b 15 34 00 36 04 60 00 49 0x0220 15 34 00 36 04 70 00 9a 15 34 00 36 04 80 00 27 0x0230 15 34 00 36 05 40 00 c1 47 02 00 14 00 10 04 24 0x0240 00 e1 00 14 00 10 38 02 00 15 3f 04 5b 2f 3b 04 0x0250 1b 00 32 04 87 00 3f 04 70 2f 30 04 a4 a8 3f 04 0x0260 90 2f 30 04 c0 0e 3f 04 11 20 31 04 20 04 3f 04 0x0270 00 00 20 04 40 01 3f 04 7a 18 1a 04 00 08 3f 04 0x0280 30 1f 30 04 06 16 35 04 2a 01 3e 04 67 00 3f 04 0x0290 54 1f 34 04 65 00 35 04 2a 00 36 04 2a 00 3f 04 0x02a0 72 1f 32 04 b0 3f 36 04 ff c0 37 04 ec 1d 38 04 0x02b0 ef f9 39 04 10 02 3c 06 00 0c 3f 04 95 18 35 04 0x02c0 03 00 3f 04 96 17 36 04 08 00 3f 04 98 1f 38 04 0x02d0 08 d0 3f 04 00 00 20 04 40 13 3f 04 5b 2f 3b 04 0x02e0 18 90 32 04 00 00 3f 04 70 2f 30 04 e4 29 3f 04 0x02f0 90 2f 30 04 c0 06 3f 04 11 20 31 04 00 04 30 04 0x0300 b0 10 3f 04 b1 2f 31 04 24 8d 32 04 f0 f8 3f 04 0x0310 dc 20 3c 04 00 00 3d 04 0a 00 3e 04 d3 00 3f 04 0x0320 b4 28 34 04 ce 04 3f 04 00 00 20 04 40 13 69 53 0x0330 e0 05 01 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0340 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0350 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x0360 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0370 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0380 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0390 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x03a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x03b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x03c0
RE: 82571EB: Detected Hardware Unit Hang
>-----Original Message-----
>From: Joe Jin [mailto:joe@oracle.com]
>Sent: Wednesday, July 11, 2012 7:58 PM
>To: Dave, Tushar N
>Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
>Subject: Re: 82571EB: Detected Hardware Unit Hang
>
>On 07/12/12 10:52, Dave, Tushar N wrote:
>> What is the exact error messages in BIOS log?
>
>Error message from BIOS event log:
>07/12/12 05:54:00
>PCI Express Non-Fatal Error
>
>Thanks,
>Joe

Thanks. I will check with the team tomorrow whether this (max payload size) can be treated as a solution to this issue.
We can learn more about exactly which non-fatal error occurred if we capture a bus trace.
We should also check the EEPROM on this device to make sure it is up to date. Send me the full EEPROM dump in a file and I will confirm with the team that it is up to date.
Thanks for your work.

-Tushar
Re: [PATCH v2] mm: Warn about costly page allocation
On Wed, Jul 11, 2012 at 07:33:41PM -0700, David Rientjes wrote:
> On Thu, 12 Jul 2012, Minchan Kim wrote:
>
> > Agreed and that's why I suggested following patch.
> > It's not elegant but at least, it could attract interest of configuration
> > people and they could find a regression during test phase.
> > This description could be improved later by writing new documentation which
> > includes more detailed story and method for capturing high order allocation
> > by ftrace once we see regression report.
> >
> > At the moment, I would like to post this patch, simply.
> > (Of course, I hope fluent native people will correct a sentence. :) )
> >
> > Any objections, Andrew, David?
> >
>
> There are other config options like CONFIG_SLOB that are used for a very
> small memory footprint on systems like this. We used to have
> CONFIG_EMBEDDED to suggest options like this but that has since been
> renamed as CONFIG_EXPERT and has become obscured.
>
> If size is really the only difference, I would think that people who want
> the smallest kernel possible would be doing allnoconfig and then
> selectively enabling what they need, so defconfig isn't really relevant
> here. And it's very difficult for an admin to know whether or not they
> "care about high-order allocations."
>
> I'd reconsider disabling compaction by default unless there are other
> considerations that haven't been mentioned.

I agree, but that is unrelated to the current problem. The point here is
to let the admin know about the danger of a regression in high-order
allocation before releasing the product. Even if we enable it by
default, he can still change it to "N" unless he knows about the removal
of lumpy reclaim.

--
Kind regards,
Minchan Kim
Re: [RFC PATCH 05/14] PCI: add access functions for PCIe capabilities to hide PCIe spec differences
On 2012-7-12 1:52, Bjorn Helgaas wrote:
>> Hi Bjorn,
>> Seems it would be better to return error code for unimplemented
>> registers, otherwise following code will becomes more complex. A special
>> error code for unimplemented registers, such as -EIO?
>
> I think you're asking about returning error for *reads* of
> unimplemented registers? I guess I still think it's OK to completely
> hide the v1 nastiness inside these accessors, and return success with
> a zero value when reading. Having several different error returns
> seems like overkill for this case. Nobody wants to distinguish
> between different reasons for failure.
>
> I'm actually not sure that it's worth returning an error even when
> *writing* an unimplemented register. What if we return success and
> just drop the write?
>
> Maybe these should even be void functions. It feels like the only
> real use of the return value is to detect programmer error, and I
> don't think that's very effective. If we remove the return values,
> people will have to focus on the *data*, which seems more important
> anyway.

Hi Bjorn,
It's a little risky to make these PCIe capability access functions void. On platforms with hardware error detection/correction capabilities, such as EEH on Power, it would be better to return an error code if a hardware error happens while accessing configuration registers. As far as I know, upcoming Intel Xeon processors may provide a PCIe hardware error detection capability similar to EEH on Power.

>> static void rtl_disable_clock_request(struct pci_dev *pdev)
>> {
>>         u16 ctl;
>>
>>         if (!pci_pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &ctl)) {
>>                 ctl &= ~PCI_EXP_LNKCTL_CLKREQ_EN;
>>                 pci_pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, ctl);
>>         }
>> }
>
> I would write that as:
>
>         if (!pci_is_pcie(pdev))
>                 return;
>
>         pci_pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &ctl);
>         if (ctl & PCI_EXP_LNKCTL_CLKREQ_EN)
>                 pci_pcie_capability_write_word(pdev, PCI_EXP_LNKCTL,
>                                 ctl & ~PCI_EXP_LNKCTL_CLKREQ_EN);
>
> which does the right thing regardless of what we do for return values,
> and saves a config write in the case where LNKCTL is implemented and
> CLKREQ_EN is already cleared.

When clearing a flag, we could do that. But if we are trying to set a flag, it would be better to make sure the target register actually exists.
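The difference between clearing and setting a flag can be sketched in userspace. This is only an illustration under stated assumptions: struct mock_pcie_dev, the mock_* helpers and MOCK_LNKCTL_CLKREQ_EN are hypothetical stand-ins, not the kernel's struct pci_dev or the pci_pcie_capability_*() accessors proposed in this series. The mock read returns success with a zero value for an unimplemented register, as suggested above, while the mock write reports an error so the caller can notice a missing register.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCIe device; NOT the kernel's struct pci_dev. */
struct mock_pcie_dev {
	bool has_lnkctl;	/* is the Link Control register implemented? */
	uint16_t lnkctl;	/* backing storage when it is implemented */
	unsigned int writes;	/* config writes that actually reached the device */
};

/* Hypothetical flag value, used only inside this sketch. */
#define MOCK_LNKCTL_CLKREQ_EN 0x0100

/* Read: an unimplemented register reads as zero and still reports success. */
static int mock_cap_read_word(struct mock_pcie_dev *dev, uint16_t *val)
{
	*val = dev->has_lnkctl ? dev->lnkctl : 0;
	return 0;
}

/* Write: dropped with -EINVAL when the register is not implemented. */
static int mock_cap_write_word(struct mock_pcie_dev *dev, uint16_t val)
{
	if (!dev->has_lnkctl)
		return -EINVAL;
	dev->lnkctl = val;
	dev->writes++;
	return 0;
}

/* Clearing a flag: a zero read means there is nothing to clear, so no check is needed. */
static void mock_disable_clkreq(struct mock_pcie_dev *dev)
{
	uint16_t ctl;

	mock_cap_read_word(dev, &ctl);
	if (ctl & MOCK_LNKCTL_CLKREQ_EN)
		mock_cap_write_word(dev, (uint16_t)(ctl & ~MOCK_LNKCTL_CLKREQ_EN));
}

/* Setting a flag: the caller needs to learn whether the register exists. */
static int mock_enable_clkreq(struct mock_pcie_dev *dev)
{
	uint16_t ctl;

	mock_cap_read_word(dev, &ctl);
	return mock_cap_write_word(dev, (uint16_t)(ctl | MOCK_LNKCTL_CLKREQ_EN));
}
```

In this sketch, mock_disable_clkreq() is safe on a device without the register because the zero read short-circuits the write, whereas mock_enable_clkreq() has to propagate the write's return value.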
Re: 82571EB: Detected Hardware Unit Hang
On 07/12/12 10:52, Dave, Tushar N wrote:
> What is the exact error messages in BIOS log?

Error message from BIOS event log:
07/12/12 05:54:00
PCI Express Non-Fatal Error

Thanks,
Joe
[PATCH v2] usb/host/ehci-hub: Fix the issue EG20T USB host controller has long resuming time, when pen drive is attached.
The Intel EG20T USB host controller does not send SOF while resuming from suspend if the FLR bit was not cleared. When a pen drive is attached, the controller then takes a long time to resume because it tries to reconnect the drive. This patch clears the FLR bit at suspend time to fix the issue.

Signed-off-by: Tomoya MORINAGA
---
v2: Update comments from Alan Stern
    Add patch description
    Always clear the STS_FLR flag.
---
 drivers/usb/host/ehci-hub.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
index fc9e7cc..818a2f1 100644
--- a/drivers/usb/host/ehci-hub.c
+++ b/drivers/usb/host/ehci-hub.c
@@ -318,6 +318,7 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)
        ehci_readl(ehci, &ehci->regs->intr_enable);

        ehci->next_statechange = jiffies + msecs_to_jiffies(10);
+       ehci_writel(ehci, STS_FLR, &ehci->regs->status);
        spin_unlock_irq (&ehci->lock);

        /* ehci_work() may have re-enabled the watchdog timer, which we do not
--
1.7.4.4
RE: 82571EB: Detected Hardware Unit Hang
>-Original Message- >From: Joe Jin [mailto:joe@oracle.com] >Sent: Wednesday, July 11, 2012 7:23 PM >To: Dave, Tushar N >Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux- >ker...@vger.kernel.org >Subject: Re: 82571EB: Detected Hardware Unit Hang > >On 07/12/12 02:51, Dave, Tushar N wrote: >> >> Joe, >> >> I see couple of errors in lspci output. >> Device capability status register shows UnCorrectable PCIe error. This >means there is certainly something went wrong. The only way to recover >from Uncorrectable errors is reset. >> >> DevSta: CorrErr- *UncorrErr+ FatalErr+ UnsuppReq+ AuxPwr+ >TransPend- >> >> Also AER sections in lspci output shows PCIe completion timeout. >> >> Capabilities: [100 v1] Advanced Error Reporting >> UESta: DLP- SDES- TLP- FCP- *CmpltTO+ CmpltAbrt- UnxCmplt- >RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol- >> >> I suggest you should load AER driver and check for any error messages in >log. Also please check any error message reported by system in BIOS log. >Are there any machine check errors? >> >> When did you notice this issue? have 82571 ever been working before on >this server? >> >> One more thing, Cache line size 256 is little unusual( I never seen this >value before, mostly it's 64). Does BIOS settings have been changed? Are >you using default BIOS setting? >> > >I checked BIOS's log found the fault from the device, I changed "PCI-E >Payload Size" >from 256(default) to 128, now the device works. > >I compared lspci output found Address for data of MSI Capabilities's be >changed: > >Old: >Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+ >Address: fee21000 Data: 40cb > >New: >Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+ >Address: fee24000 Data: 405c > >Mostly like it's a BIOS bug? please comments. > >Thanks, >Joe What is the exact error messages in BIOS log? 
RE: [PATCH 2/7] i2c-bfin-twi: Use struct dev_pm_ops for power management
Acked-by: Sonic Zhang

>-----Original Message-----
>From: Rafael J. Wysocki [mailto:r...@sisk.pl]
>Sent: Thursday, July 12, 2012 3:24 AM
>To: LKML
>Cc: Linux PM list; Linus Walleij; linux-...@vger.kernel.org; Zhang, Sonic; Jean
>Delvare; Ben Dooks; Wolfram Sang; Peter Korsgaard; Guan Xuetao; Vitaly Wool;
>Colin Cross; Stephen Warren
>Subject: [PATCH 2/7] i2c-bfin-twi: Use struct dev_pm_ops for power management
>
>From: Rafael J. Wysocki
>
>Make the Blackfin On-Chip Two Wire Interface driver define its PM
>callbacks through a struct dev_pm_ops object rather than by using
>legacy PM hooks in struct platform_driver.
>
>Signed-off-by: Rafael J. Wysocki
>---
> drivers/i2c/busses/i2c-bfin-twi.c |   18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
>Index: linux/drivers/i2c/busses/i2c-bfin-twi.c
>===================================================================
>--- linux.orig/drivers/i2c/busses/i2c-bfin-twi.c
>+++ linux/drivers/i2c/busses/i2c-bfin-twi.c
>@@ -611,9 +611,9 @@ static struct i2c_algorithm bfin_twi_alg
>        .functionality = bfin_twi_functionality,
> };
>
>-static int i2c_bfin_twi_suspend(struct platform_device *pdev, pm_message_t state)
>+static int i2c_bfin_twi_suspend(struct device *dev)
> {
>-       struct bfin_twi_iface *iface = platform_get_drvdata(pdev);
>+       struct bfin_twi_iface *iface = dev_get_drvdata(dev);
>
>        iface->saved_clkdiv = read_CLKDIV(iface);
>        iface->saved_control = read_CONTROL(iface);
>@@ -626,14 +626,14 @@ static int i2c_bfin_twi_suspend(struct p
>        return 0;
> }
>
>-static int i2c_bfin_twi_resume(struct platform_device *pdev)
>+static int i2c_bfin_twi_resume(struct device *dev)
> {
>-       struct bfin_twi_iface *iface = platform_get_drvdata(pdev);
>+       struct bfin_twi_iface *iface = dev_get_drvdata(dev);
>
>        int rc = request_irq(iface->irq, bfin_twi_interrupt_entry,
>-               0, pdev->name, iface);
>+               0, to_platform_device(dev)->name, iface);
>        if (rc) {
>-               dev_err(&pdev->dev, "Can't get IRQ %d !\n", iface->irq);
>+               dev_err(dev, "Can't get IRQ %d !\n", iface->irq);
>                return -ENODEV;
>        }
>
>@@ -646,6 +646,9 @@ static int i2c_bfin_twi_resume(struct pl
>        return 0;
> }
>
>+static SIMPLE_DEV_PM_OPS(i2c_bfin_twi_pm,
>+                        i2c_bfin_twi_suspend, i2c_bfin_twi_resume);
>+
> static int i2c_bfin_twi_probe(struct platform_device *pdev)
> {
>        struct bfin_twi_iface *iface;
>@@ -770,11 +773,10 @@ static int i2c_bfin_twi_remove(struct pl
> static struct platform_driver i2c_bfin_twi_driver = {
>        .probe          = i2c_bfin_twi_probe,
>        .remove         = i2c_bfin_twi_remove,
>-       .suspend        = i2c_bfin_twi_suspend,
>-       .resume         = i2c_bfin_twi_resume,
>        .driver         = {
>                .name   = "i2c-bfin-twi",
>                .owner  = THIS_MODULE,
>+               .pm     = &i2c_bfin_twi_pm,
>        },
> };
>
Re: [PATCH 1/3] tmpfs: revert SEEK_DATA and SEEK_HOLE
On Thu, 12 Jul 2012, Dave Chinner wrote: > On Wed, Jul 11, 2012 at 11:55:34AM -0700, Hugh Dickins wrote: > > On Wed, 11 Jul 2012, Cong Wang wrote: > > > > > > If you don't have burden to maintain it, I'd prefer to leave as it is, > > > I don't think 752-bytes is the reason we revert it. > > > > Thank you, your vote has been counted ;) > > and I'll be glad if yours stimulates some agreement or disagreement. > > > > But your vote would count for a lot more if you know of some app which > > would really benefit from this functionality in tmpfs: I've heard of none. > > So what? I've heard of no apps that use this functionality on XFS, > either, but I have heard of a lot of people asking for it to be > implemented over the past couple of years so they can use it. I'd certainly not ask you to remove your support for it from XFS: nobody would call XFS a minimal filesystem. But tmpfs has a tradition and a duty to keep fairly small: it needs to be useful, but it shouldn't be carrying unused baggage. > There's been patches written to make coreutils (cp) make use of it > instead of parsing FIEMAP output to find holes, though I don't know > if that's gone beyond more than "here's some patches" > > Besides, given that you can punch holes in tmpfs files, it seems > strange to then say "we don't need a method of skipping holes to > find data quickly" tmpfs has been punching holes (via MADV_REMOVE) since 2.6.16 (and that wasn't added on my whim, IBM wanted and did it). But I haven't heard of anybody asking for a method of skipping them in six years. > > Besides, seek-hole/data is still shiny new and lots of developers > aren't even aware of it's presence in recent kernels. Removing new > functionality saying "no-one is using it" is like smashing the egg > before the chicken hatches (or is it cutting of the chickes's head > before it lays the egg?). (You remind me of my chicken-and-egg sandwiches - you can't get them, you see, it's chicken and egg.) 
I'm not trying to remove SEEK_HOLE/SEEK_DATA support from the kernel: I'm just saying that nobody has yet made the case for their usefulness in tmpfs, so they're better removed from it before v3.5 is released. Once we see how useful they have become in the grown-up filesystems, or someone shows how useful they can be on tmpfs, then we reinstate.

Of course, I'm on both sides of this argument: I wrote that code, I like it, I'll be glad to put it back when it's useful to someone.

Hugh
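For readers who have not met the interface under discussion, here is a minimal userspace sketch of how an application might skip holes with lseek(2). The helper name count_data_bytes is hypothetical, and the fallback values for SEEK_DATA and SEEK_HOLE are an assumption that holds for Linux only. On a filesystem without hole tracking, the whole file is expected to be reported as a single data extent.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed Linux values, used only if the libc headers did not expose them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: walk the file from offset 0 and add up the sizes
 * of all data extents, skipping holes. Returns -1 on error.
 */
static off_t count_data_bytes(int fd)
{
	off_t total = 0;
	off_t data;
	off_t hole = 0;

	for (;;) {
		/* Find the start of the next data extent at or after 'hole'. */
		data = lseek(fd, hole, SEEK_DATA);
		if (data < 0)
			return errno == ENXIO ? total : -1; /* ENXIO: no more data */

		/* Find where that extent ends (EOF counts as a hole). */
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;

		total += hole - data;
	}
}
```

A copy tool could use the same loop to read only the [data, hole) ranges and recreate the gaps with a seek on the destination file, instead of reading runs of zeroes.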
[PATCH 2/3 v3] mm: bug fix free page check in zone_watermark_ok
In __zone_watermark_ok, free_pages and min are signed long while z->lowmem_reserve[classzone_idx] is unsigned long. The comparison between them can therefore go wrong: the signed operands are converted to unsigned, so a negative free_pages is treated as a huge positive value. For an order-0 check the function can then return true instead of false, so kswapd may sleep forever. The result is a livelock, because the direct reclaimer loops forever until kswapd sets zone->all_unreclaimable.

Aaditya reported this problem when he tested my hotplug patch.

Reported-by: Aaditya Kumar
Tested-by: Aaditya Kumar
Signed-off-by: Aaditya Kumar
Signed-off-by: Minchan Kim
---
This patch does not depend on the rest of this series.
It seems to be a candidate for -stable, but I'm not sure because of the rule quoted below, so I leave the decision to akpm.

" - It must fix a real bug that bothers people (not a, "This could be a problem..." type thing)."

 mm/page_alloc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f17e6e4..627653c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1594,6 +1594,7 @@ static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark,
 {
        /* free_pages my go negative - that's OK */
        long min = mark;
+       long lowmem_reserve = z->lowmem_reserve[classzone_idx];
        int o;

        free_pages -= (1 << order) - 1;
@@ -1602,7 +1603,7 @@ static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark,
        if (alloc_flags & ALLOC_HARDER)
                min -= min / 4;

-       if (free_pages <= min + z->lowmem_reserve[classzone_idx])
+       if (free_pages <= min + lowmem_reserve)
                return false;
        for (o = 0; o < order; o++) {
                /* At the next order, this order's pages become unavailable */
--
1.7.9.5
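The conversion problem this patch describes can be reproduced outside the kernel. The following is a minimal sketch, not the kernel code: the function names below_watermark_buggy and below_watermark_fixed are hypothetical, and the buggy variant writes out by hand the conversion that C applies implicitly when a long is compared with an unsigned long of the same width.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the old check. Because 'reserve' is unsigned long,
 * the expression 'free_pages <= min + reserve' is evaluated in unsigned
 * long; the casts below spell out that implicit conversion.
 */
static bool below_watermark_buggy(long free_pages, long min,
				  unsigned long reserve)
{
	return (unsigned long)free_pages <= (unsigned long)min + reserve;
}

/*
 * Hypothetical model of the fixed check: copy the reserve into a signed
 * local first, so the whole comparison stays in signed arithmetic.
 */
static bool below_watermark_fixed(long free_pages, long min,
				  unsigned long reserve)
{
	long lowmem_reserve = (long)reserve;

	return free_pages <= min + lowmem_reserve;
}
```

With free_pages = -1, min = 10 and reserve = 5, the buggy variant sees a huge unsigned value on the left and reports that the zone is not below its watermark, while the fixed variant reports that it is. Non-negative inputs give the same answer in both variants.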
[PATCH 3/3 v3] memory-hotplug: fix kswapd looping forever problem
When hotplug offlining happens on zone A, it starts to mark freed pages as MIGRATE_ISOLATE type in the buddy allocator to prevent further allocation. (MIGRATE_ISOLATE is a rather ironic type: the pages are apparently in buddy, but we can't allocate them.)

When a memory shortage happens during hotplug offlining, the current task starts to reclaim and then wakes up kswapd. Kswapd checks the watermark and goes back to sleep, because the current zone_watermark_ok_safe doesn't take the MIGRATE_ISOLATE free page count into account. The current task continues to reclaim in the direct reclaim path without kswapd's help. The problem is that zone->all_unreclaimable is set only by kswapd, so the current task loops forever, as below.

__alloc_pages_slowpath
restart:
        wake_all_kswapd
rebalance:
        __alloc_pages_direct_reclaim
                do_try_to_free_pages
                        if global_reclaim && !all_unreclaimable
                                return 1; /* It means we did did_some_progress */
        skip __alloc_pages_may_oom
        should_alloc_retry
                goto rebalance;

If we apply KOSAKI's patch[1], which doesn't depend on kswapd for setting zone->all_unreclaimable, we can solve this problem by killing some task in the direct reclaim path. But it still doesn't wake up kswapd, which could still be a problem if another subsystem needs a GFP_ATOMIC request. So kswapd should consider MIGRATE_ISOLATE when it calculates free pages BEFORE going to sleep.

This patch counts the number of MIGRATE_ISOLATE pageblocks, and zone_watermark_ok_safe will consider it if the system has such blocks (fortunately, that is very rare, so the overhead is no problem, and kswapd is never a hot path).

Copy/modify from Mel's quote
"
Ideal solution would be "allocating" the pageblock. It would keep the free space accounting as it is but historically, memory hotplug didn't allocate pages because it would be difficult to detect if a pageblock was isolated or if part of some balloon. Allocating just full pageblocks would work around this, However, it would play very badly with CMA.
" [1] http://lkml.org/lkml/2012/6/14/74 * from v2 - rebased on mmotm-2012-07-10-16-59 - Add Tested-by * from v1 - add changelog - make functions simple - remove atomic variable - discard exact isolated free page accounting. - rebased on next-20120626 Suggested-by: KOSAKI Motohiro Tested-by: Aaditya Kumar Cc: KAMEZAWA Hiroyuki Cc: Mel Gorman --- include/linux/mmzone.h |8 mm/page_alloc.c| 31 +++ mm/page_isolation.c| 29 +++-- 3 files changed, 66 insertions(+), 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 3219014..3bd253e 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -477,6 +477,14 @@ struct zone { * rarely used fields: */ const char *name; +#ifdef CONFIG_MEMORY_ISOLATION + /* +* the number of MIGRATE_ISOLATE *pageblock*. +* We need this for free page counting. Look at zone_watermark_ok_safe. +* It's protected by zone->lock +*/ + int nr_pageblock_isolate; +#endif } cacheline_internodealigned_in_smp; typedef enum { diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 627653c..980c75b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -218,6 +218,11 @@ EXPORT_SYMBOL(nr_online_nodes); int page_group_by_mobility_disabled __read_mostly; +/* + * NOTE: + * Don't use set_pageblock_migratetype(page, MIGRATE_ISOLATE) directly. + * Instead, use {un}set_pageblock_isolate. 
+ */ void set_pageblock_migratetype(struct page *page, int migratetype) { @@ -1618,6 +1623,23 @@ static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark, return true; } +#ifdef CONFIG_MEMORY_ISOLATION +static inline unsigned long nr_zone_isolate_freepages(struct zone *zone) +{ + unsigned long nr_pages = 0; + + if (unlikely(zone->nr_pageblock_isolate)) { + nr_pages = zone->nr_pageblock_isolate * pageblock_nr_pages; + } + return nr_pages; +} +#else +static inline unsigned long nr_zone_isolate_freepages(struct zone *zone) +{ + return 0; +} +#endif + bool zone_watermark_ok(struct zone *z, int order, unsigned long mark, int classzone_idx, int alloc_flags) { @@ -1633,6 +1655,14 @@ bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark, if (z->percpu_drift_mark && free_pages < z->percpu_drift_mark) free_pages = zone_page_state_snapshot(z, NR_FREE_PAGES); + /* +* If the zone has MIGRATE_ISOLATE type free page, +* we should consider it. nr_zone_isolate_freepages is never +* accurate so kswapd might not sleep although she can. +* But it's more desirable for memory hotplug rather than +* forever sleep which cause livelock in direct reclaim path. +*/ + free_pages -=
[PATCH 1/3 v3] mm: Factor out memory isolate functions
mm/page_alloc.c currently contains some memory isolation functions, but they are used only when CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE} is enabled. So let's make them configurable through a new CONFIG_MEMORY_ISOLATION option. This reduces binary size, and code can simply check CONFIG_MEMORY_ISOLATION instead of testing whether CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE} is defined.

* from v2
 - rebase on mmotm-2012-07-10-16-59

* from v1
 - rebase on next-20120626

Cc: Andi Kleen
Cc: Marek Szyprowski
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Minchan Kim
---
 drivers/base/Kconfig           |    1 +
 include/linux/page-isolation.h |   13 +--
 mm/Kconfig                     |    5 +++
 mm/Makefile                    |    4 +-
 mm/page_alloc.c                |   80 ++--
 mm/page_isolation.c            |   71 +++
 6 files changed, 92 insertions(+), 82 deletions(-)

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 9b21469..08b4c52 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -196,6 +196,7 @@ config CMA
        bool "Contiguous Memory Allocator (EXPERIMENTAL)"
        depends on HAVE_DMA_CONTIGUOUS && HAVE_MEMBLOCK && EXPERIMENTAL
        select MIGRATION
+       select MEMORY_ISOLATION
        help
          This enables the Contiguous Memory Allocator which allows drivers
          to allocate big physically-contiguous blocks of memory for use with
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 3bdcab3..105077a 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -1,6 +1,11 @@
 #ifndef __LINUX_PAGEISOLATION_H
 #define __LINUX_PAGEISOLATION_H

+
+bool has_unmovable_pages(struct zone *zone, struct page *page, int count);
+void set_pageblock_migratetype(struct page *page, int migratetype);
+int move_freepages_block(struct zone *zone, struct page *page,
+                               int migratetype);
 /*
  * Changes migrate type in [start_pfn, end_pfn) to be MIGRATE_ISOLATE.
  * If specified range includes migrate types other than MOVABLE or CMA,
@@ -10,7 +15,7 @@
  * free all pages in the range. test_page_isolated() can be used for
  * test it.
*/ -extern int +int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, unsigned migratetype); @@ -18,7 +23,7 @@ start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE. * target range is [start_pfn, end_pfn) */ -extern int +int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, unsigned migratetype); @@ -30,8 +35,8 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn); /* * Internal functions. Changes pageblock's migrate type. */ -extern int set_migratetype_isolate(struct page *page); -extern void unset_migratetype_isolate(struct page *page, unsigned migratetype); +int set_migratetype_isolate(struct page *page); +void unset_migratetype_isolate(struct page *page, unsigned migratetype); #endif diff --git a/mm/Kconfig b/mm/Kconfig index 82fed4e..d5c8019 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -140,9 +140,13 @@ config ARCH_DISCARD_MEMBLOCK config NO_BOOTMEM boolean +config MEMORY_ISOLATION + boolean + # eventually, we can have this option just 'select SPARSEMEM' config MEMORY_HOTPLUG bool "Allow for memory hot-add" + select MEMORY_ISOLATION depends on SPARSEMEM || X86_64_ACPI_NUMA depends on HOTPLUG && ARCH_ENABLE_MEMORY_HOTPLUG depends on (IA64 || X86 || PPC_BOOK3S_64 || SUPERH || S390) @@ -272,6 +276,7 @@ config MEMORY_FAILURE depends on MMU depends on ARCH_SUPPORTS_MEMORY_FAILURE bool "Enable recovery from hardware memory errors" + select MEMORY_ISOLATION help Enables code to recover from some memory failures on systems with MCA recovery. 
This allows a system to continue running diff --git a/mm/Makefile b/mm/Makefile index 262360a..7deaa29 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -15,8 +15,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ maccess.o page_alloc.o page-writeback.o \ readahead.o swap.o truncate.o vmscan.o shmem.o \ prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \ - page_isolation.o mm_init.o mmu_context.o percpu.o \ - compaction.o $(mmu-y) + mm_init.o mmu_context.o percpu.o compaction.o $(mmu-y) obj-y += init-mm.o ifdef CONFIG_NO_BOOTMEM @@ -55,3 +54,4 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o obj-$(CONFIG_CLEANCACHE) += cleancache.o +obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o diff --git
Re: [PATCH RESEND] Fix a dead loop in async_synchronize_full()
On Wed, 2012-07-11 at 15:42 -0700, Andrew Morton wrote:
> On Mon, 09 Jul 2012 15:04:25 +0800
> Li Zhong wrote:
>
> > This patch tries to fix a dead loop in async_synchronize_full(), which
> > could be seen when preemption is disabled on a single cpu machine.
> >
> > void async_synchronize_full(void)
> > {
> >         do {
> >                 async_synchronize_cookie(next_cookie);
> >         } while (!list_empty(&async_running) ||
> >                  !list_empty(&async_pending));
> > }
> >
> > async_synchronize_cookie() calls async_synchronize_cookie_domain() with
> > &async_running as the default domain to synchronize.
> >
> > However, there might be some works in the async_pending list from other
> > domains. On a single cpu system, without preemption, there is no chance
> > for the other works to finish, so async_synchronize_full() enters a dead
> > loop.
> >
> > It seems async_synchronize_full() wants to synchronize all entries in
> > all running lists(domains), so maybe we could just check the entry_count
> > to know whether all works are finished.
> >
> > Currently, async_synchronize_cookie_domain() expects a non-NULL running
> > list ( if NULL, there would be NULL pointer dereference ), so maybe a
> > NULL pointer could be used as an indication for the functions to
> > synchronize all works in all domains.
>
> The patch is fairly wordwrapped - please fix up your email client.

Ah, sorry for that, I will check it.

>
> More seriously, it does not apply to linux-next due to some fairly
> significant changes which have been sitting in Dan's tree since May.
> What's going on?
>

I just went through Dan's patches, and it seems they also make async_synchronize_full() sync all domains. I will test/check those patches, and drop this one if the result is good.
Re: [PATCHv3 1/4] staging: OMAP4+: thermal: introduce bandgap temperature sensor
Hello,

On Thu, Jul 12, 2012 at 3:32 AM, Greg Kroah-Hartman wrote:
> On Wed, Jul 11, 2012 at 11:41:06PM +0300, Eduardo Valentin wrote:
>> In the System Control Module, OMAP supplies a voltage reference
>> and a temperature sensor feature that are gathered in the band
>> gap voltage and temperature sensor (VBGAPTS) module. The band
>> gap provides current and voltage reference for its internal
>> circuits and other analog IP blocks. The analog-to-digital
>> converter (ADC) produces an output value that is proportional
>> to the silicon temperature.
>>
>> This patch provides a platform driver which exposes this feature.
>> It is modeled as an MFD child of the System Control Module core
>> MFD driver.
>>
>> This driver provides only APIs to access the device properties,
>> like temperature, thresholds and update rate.
>>
>> Signed-off-by: Eduardo Valentin
>> Signed-off-by: J Keerthy
>
> This patch gives me the following build error:
>
> drivers/staging/omap-thermal/omap-bandgap.c: In function ‘omap_bandgap_build’:
> drivers/staging/omap-thermal/omap-bandgap.c:805:2: error: implicit declaration of function ‘of_match_device’ [-Werror=implicit-function-declaration]
> drivers/staging/omap-thermal/omap-bandgap.c:805:8: warning: assignment makes pointer from integer without a cast [enabled by default]

OK. I didn't see those while testing on my side, but I didn't use -Werror=implicit-function-declaration.

>
> So of course I can't accept it :(

That's for sure.

>
> How hard is it to test that the patches build before sending them to me?

It should not be. I will check with those compiler flags.

>
> ugh,
>
> greg k-h

--
Eduardo Valentin
Re: [PATCH 5/7] i2c-puv3: Use struct dev_pm_ops for power management
> From: Rafael J. Wysocki
>
> Make the PKUnity-v3 SoC I2C controller driver define its suspend
> callback through a struct dev_pm_ops object rather than by using
> a legacy PM hook in struct platform_driver. The empty resume
> callback is not necessary, so remove it.
>
> Signed-off-by: Rafael J. Wysocki

Thanks.
Acked-by: Guan Xuetao

> ---
>  drivers/i2c/busses/i2c-puv3.c |   15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
>
> Index: linux/drivers/i2c/busses/i2c-puv3.c
> ===================================================================
> --- linux.orig/drivers/i2c/busses/i2c-puv3.c
> +++ linux/drivers/i2c/busses/i2c-puv3.c
> @@ -254,7 +254,7 @@ static int __devexit puv3_i2c_remove(str
>  }
>
>  #ifdef CONFIG_PM
> -static int puv3_i2c_suspend(struct platform_device *dev, pm_message_t state)
> +static int puv3_i2c_suspend(struct device *dev)
>  {
>         int poll_count;
>         /* Disable the IIC */
> @@ -267,23 +267,20 @@ static int puv3_i2c_suspend(struct platf
>         return 0;
>  }
>
> -static int puv3_i2c_resume(struct platform_device *dev)
> -{
> -       return 0 ;
> -}
> +static SIMPLE_DEV_PM_OPS(puv3_i2c_pm, puv3_i2c_suspend, NULL);
> +#define PUV3_I2C_PM    (&puv3_i2c_pm)
> +
>  #else
> -#define puv3_i2c_suspend NULL
> -#define puv3_i2c_resume NULL
> +#define PUV3_I2C_PM    NULL
>  #endif
>
>  static struct platform_driver puv3_i2c_driver = {
>         .probe          = puv3_i2c_probe,
>         .remove         = __devexit_p(puv3_i2c_remove),
> -       .suspend        = puv3_i2c_suspend,
> -       .resume         = puv3_i2c_resume,
>         .driver         = {
>                 .name   = "PKUnity-v3-I2C",
>                 .owner  = THIS_MODULE,
> +               .pm     = PUV3_I2C_PM,
>         }
>  };
>
>
RE: [PATCH V3 1/2] driver: add PPI support in tpm driver
Hi Kent, Thanks for your comment on the patch. But there's some confusion on my side. You mentioned not to change the tpm driver name. But the driver is linked from tpm.c and tpm_ppi.c, so I should change the original tpm.c file name, right? Is it acceptable to change tpm.c to tpm_common.c or tpm_utils.c or else? Thanks, Xiaoyan -Original Message- From: Kent Yoder [mailto:k...@linux.vnet.ibm.com] Sent: Wednesday, July 11, 2012 11:25 PM To: Zhang, Xiaoyan Cc: linux-kernel@vger.kernel.org; Cihula, Joseph; Wei, Gang; tpmdd-de...@lists.sourceforge.net; deb...@linux.vnet.ibm.com; sra...@linux.vnet.ibm.com; m.selho...@sirrix.com; shpedoi...@gmail.com; linux-security-mod...@vger.kernel.org; james.l.mor...@oracle.com; h...@zytor.com; linux-...@vger.kernel.org Subject: Re: [PATCH V3 1/2] driver: add PPI support in tpm driver Hi Xiaoyan, On Thu, Jun 21, 2012 at 06:54:51AM +, Zhang, Xiaoyan wrote: > From: Xiaoyan Zhang > > The Physical Presence Interface enables the OS and the BIOS to > cooperate and provides a simple and straightforward platform user > experience for administering the TPM without sacrificing security. > > V2: separate the patch out in a separate source file, add #ifdef > CONFIG_ACPI so it compiles out on ppc, and use standard error instead > of ACPI error as return code of show/store fns. > > V3: move #ifdef CONFIG_ACPI from .c file to .h file > > Signed-off-by: Xiaoyan Zhang > --- > drivers/char/tpm/Makefile |4 +- > drivers/char/tpm/tpm.c |5 + > drivers/char/tpm/tpm.h |8 + > drivers/char/tpm/tpm_ppi.c | 461 > > 4 files changed, 477 insertions(+), 1 deletions(-) create mode > 100644 drivers/char/tpm/tpm_ppi.c > > diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile > index ea3a1e0..132ad95 100644 > --- a/drivers/char/tpm/Makefile > +++ b/drivers/char/tpm/Makefile > @@ -1,8 +1,10 @@ > # > # Makefile for the kernel tpm device drivers. 
> # > -obj-$(CONFIG_TCG_TPM) += tpm.o > +obj-$(CONFIG_TCG_TPM) += tpm_main.o > +tpm_main-y += tpm.o > ifdef CONFIG_ACPI > + tpm_main-y += tpm_ppi.o > obj-$(CONFIG_TCG_TPM) += tpm_bios.o This will change the name of the tpm driver, which I really don't want to do. Can you apply on top of [1] and resubmit? I've added the following patch which modularizes the event logging code and has a couple Makefile changes too. This is needed for a future driver on PPC, but it should make adding your file obvious. Note that the patch below applies on top of Infineon's I2C driver patches. [1] git://github.com/shpedoikal/linux.git tpmdd-next Thanks, Kent tpm: modularize event log collection Break ACPI-specific pieces of the event log handling into their own file and create tpm_eventlog.[ch] to store common event log handling code. This will be required to integrate future event log sources on platforms without ACPI tables. Signed-off-by: Kent Yoder --- drivers/char/tpm/Makefile |1 + drivers/char/tpm/tpm.c |1 + drivers/char/tpm/tpm_acpi.c | 104 drivers/char/tpm/tpm_bios.c | 556 --- drivers/char/tpm/tpm_eventlog.c | 419 + drivers/char/tpm/tpm_eventlog.h | 71 + 6 files changed, 596 insertions(+), 556 deletions(-) create mode 100644 drivers/char/tpm/tpm_acpi.c delete mode 100644 drivers/char/tpm/tpm_bios.c create mode 100644 drivers/char/tpm/tpm_eventlog.c create mode 100644 drivers/char/tpm/tpm_eventlog.h diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile index a9c3afc..beac52f6 100644 --- a/drivers/char/tpm/Makefile +++ b/drivers/char/tpm/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_TCG_TPM) += tpm.o ifdef CONFIG_ACPI obj-$(CONFIG_TCG_TPM) += tpm_bios.o + tpm_bios-objs += tpm_eventlog.o tpm_acpi.o endif obj-$(CONFIG_TCG_TIS) += tpm_tis.o obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o diff --git a/drivers/char/tpm/tpm.c b/drivers/char/tpm/tpm.c index d39b1f6..beb98c3 100644 --- a/drivers/char/tpm/tpm.c +++ b/drivers/char/tpm/tpm.c @@ -30,6 +30,7 @@ #include #include 
"tpm.h" +#include "tpm_eventlog.h" enum tpm_const { TPM_MINOR = 224,/* officially assigned */ diff --git a/drivers/char/tpm/tpm_acpi.c b/drivers/char/tpm/tpm_acpi.c new file mode 100644 index 000..a1bb5a18 --- /dev/null +++ b/drivers/char/tpm/tpm_acpi.c @@ -0,0 +1,104 @@ +/* + * Copyright (C) 2005 IBM Corporation + * + * Authors: + * Seiji Munetoh + * Stefan Berger + * Reiner Sailer + * Kylene Hall + * + * Maintained by: + * + * Access to the eventlog extended by the TCG BIOS of PC platform + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ +
Re: [patch RT 3/7] Disable RT_GROUP_SCHED in PREEMPT_RT_FULL
On Wed, 2012-07-11 at 22:05 +, Thomas Gleixner wrote: > plain text document attachment > (disable-rt_group_sched-in-preempt_rt_full.patch) > Strange CPU stalls have been observed in RT when RT_GROUP_SCHED > was configured. > > Disable it for now. > > Signed-off-by: Carsten Emde > Signed-off-by: Thomas Gleixner > > --- > init/Kconfig |1 + > 1 file changed, 1 insertion(+) > > Index: linux-3.4.4-rt13-64+/init/Kconfig > === > --- linux-3.4.4-rt13-64+.orig/init/Kconfig > +++ linux-3.4.4-rt13-64+/init/Kconfig > @@ -746,6 +746,7 @@ config RT_GROUP_SCHED > bool "Group scheduling for SCHED_RR/FIFO" > depends on EXPERIMENTAL > depends on CGROUP_SCHED > + depends on !PREEMPT_RT_FULL > default n > help > This feature lets you explicitly allocate real CPU bandwidth > > > > > I turn the thing off because it doesn't make any sense to me for -rt, and because it's busted. The below works around isolation bustage I encountered. Peter didn't like it (what's to like?) but it saves the day, so shall live on in non-rt kernels until I hopefully someday see RT_GROUP_SCHED being fed into a Bitwolf-9000 ;-) sched,rt: fix isolated CPUs leaving root_task_group indefinitely throttled Root task group bandwidth replentishment must service all CPUs regardless of where it was last started. Signed-off-by: Mike Galbraith --- kernel/sched/rt.c | 13 + 1 file changed, 13 insertions(+) --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -782,6 +782,19 @@ static int do_sched_rt_period_timer(stru const struct cpumask *span; span = sched_rt_period_mask(); +#ifdef CONFIG_RT_GROUP_SCHED + /* +* FIXME: isolated CPUs should really leave the root task group, +* whether they are isolcpus or were isolated via cpusets, lest +* the timer run on a CPU which does not service all runqueues, +* potentially leaving other CPUs indefinitely throttled. If +* isolation is really required, the user will turn the throttle +* off to kill the perturbations it causes anyway. 
Meanwhile, +* this maintains functionality for boot and/or troubleshooting. +*/ + if (rt_b == &root_task_group.rt_bandwidth) + span = cpu_online_mask; +#endif for_each_cpu(i, span) { int enqueue = 0; struct rt_rq *rt_rq = sched_rt_period_rt_rq(rt_b, i); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: AutoNUMA15
> Ok the problem is that you must not pin anything. If you hard pin
> AutoNUMA won't do anything on those processes.
>
> It is impossible to run faster than the raw hard pinning, impossible
> because AutoNUMA has also to migrate memory, hard pinning avoids all
> memory migrations.
>
> Thanks a lot, and looking forward to see how things goes when you
> remove the hard pins.
>

Andrea:

I continued testing specjbb2005 with your patch on 2c7535e100805d9, with the hard pin removed for the openjdk JVM, on my NHM EP machine (12GB memory, 16 LCPUs).

The following data uses each scenario's result on 3.5-rc2 as the 100% base.

                        3.5-rc2    3.5-rc2+autonuma
2 JVM, each 1GB mem     100%       100%
1 JVM with 2GB mem      100%       100%
2 JVM, each 4GB mem     100%       98%~100%
1 JVM with 4GB mem      100%       98%~100%

So my testing didn't find a benefit from the autonuma patch, and with the bigger memory sizes the patch introduces more variation and may cause a 2% performance drop.

My openjdk options are "-Xmx4g -Xms4g -Xincgc".

I am wondering whether specjbb can show your patch's advantage.
[PATCH v3] panel: Use pr_err(...) rather than printk(KERN_ERR ...)
This change is inspired by checkpatch. Signed-off-by: Toshiaki Yamane --- drivers/staging/panel/panel.c | 42 +--- 1 files changed, 18 insertions(+), 24 deletions(-) diff --git a/drivers/staging/panel/panel.c b/drivers/staging/panel/panel.c index 7365089..a6d71fd 100644 --- a/drivers/staging/panel/panel.c +++ b/drivers/staging/panel/panel.c @@ -34,6 +34,8 @@ * */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include @@ -1987,10 +1989,9 @@ static struct logical_input *panel_bind_key(char *name, char *press, struct logical_input *key; key = kzalloc(sizeof(struct logical_input), GFP_KERNEL); - if (!key) { - printk(KERN_ERR "panel: not enough memory\n"); + if (!key) return NULL; - } + if (!input_name2mask(name, >mask, >value, _mask_i, _mask_o)) { kfree(key); @@ -2030,10 +2031,9 @@ static struct logical_input *panel_bind_callback(char *name, struct logical_input *callback; callback = kmalloc(sizeof(struct logical_input), GFP_KERNEL); - if (!callback) { - printk(KERN_ERR "panel: not enough memory\n"); + if (!callback) return NULL; - } + memset(callback, 0, sizeof(struct logical_input)); if (!input_name2mask(name, >mask, >value, _mask_i, _mask_o)) @@ -2110,10 +2110,8 @@ static void panel_attach(struct parport *port) return; if (pprt) { - printk(KERN_ERR - "panel_attach(): port->number=%d parport=%d, " - "already registered !\n", - port->number, parport); + pr_err("%s: port->number=%d parport=%d, already registered !\n", + __func__, port->number, parport); return; } @@ -2122,16 +2120,14 @@ static void panel_attach(struct parport *port) /*PARPORT_DEV_EXCL */ 0, (void *)); if (pprt == NULL) { - pr_err("panel_attach(): port->number=%d parport=%d, " - "parport_register_device() failed\n", - port->number, parport); + pr_err("%s: port->number=%d parport=%d, parport_register_device() failed\n", + __func__, port->number, parport); return; } if (parport_claim(pprt)) { - printk(KERN_ERR - "Panel: could not claim access to parport%d. 
" - "Aborting.\n", parport); + pr_err("%s: could not claim access to parport%d. Aborting.\n", + __func__, parport); goto err_unreg_device; } @@ -2165,10 +2161,8 @@ static void panel_detach(struct parport *port) return; if (!pprt) { - printk(KERN_ERR - "panel_detach(): port->number=%d parport=%d, " - "nothing to unregister.\n", - port->number, parport); + pr_err("%s: port->number=%d parport=%d, nothing to unregister.\n", + __func__, port->number, parport); return; } @@ -2278,8 +2272,8 @@ int panel_init(void) init_in_progress = 1; if (parport_register_driver(_driver)) { - printk(KERN_ERR - "Panel: could not register with parport. Aborting.\n"); + pr_err("%s: could not register with parport. Aborting.\n", + __func__); return -EIO; } @@ -2291,8 +2285,8 @@ int panel_init(void) pprt = NULL; } parport_unregister_driver(_driver); - printk(KERN_ERR "Panel driver version " PANEL_VERSION - " disabled.\n"); + pr_err("%s: Panel driver version " PANEL_VERSION " disabled.\n", + __func__); return -ENODEV; } -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] mm: Warn about costly page allocation
On Thu, 12 Jul 2012, Minchan Kim wrote:

> Agreed and that's why I suggested following patch.
> It's not elegant but at least, it could attract interest of configuration
> people and they could find a regression during test phase.
> This description could be improved later by writing new documentation which
> includes more detailed story and method for capturing high order allocation
> by ftrace once we see regression report.
>
> At the moment, I would like to post this patch, simply.
> (Of course, I hope fluent native people will correct a sentence. :) )
>
> Any objections, Andrew, David?
>

There are other config options like CONFIG_SLOB that are used for a very small memory footprint on systems like this. We used to have CONFIG_EMBEDDED to suggest options like this but that has since been renamed as CONFIG_EXPERT and has become obscured.

If size is really the only difference, I would think that people who want the smallest kernel possible would be doing allnoconfig and then selectively enabling what they need, so defconfig isn't really relevant here. And it's very difficult for an admin to know whether or not they "care about high-order allocations."

I'd reconsider disabling compaction by default unless there are other considerations that haven't been mentioned.
RE: [PATCH 3/3] regulator: s2mps11: Use sec_reg_write rather than sec_reg_update when mask is 0xff
Hi, On Thursday, July 12, 2012 10:39 AM +0900, Axel Lin wrote: > > Signed-off-by: Axel Lin Acked-by: Sangbeom Kim Thanks, Sangbeom.
Re: [PATCH -mm v3] mm: have order > 0 compaction start off where it left
Hi Rik, On Wed, Jul 11, 2012 at 04:18:00PM -0400, Rik van Riel wrote: > This patch makes the comment for cc->wrapped longer, explaining > what is really going on. It also incorporates the comment fix > pointed out by Minchan. > > Additionally, Minchan found that, when no pages get isolated, > high_pte could be a value that is much lower than desired, s/high_pte/high_pfn > which might potentially cause compaction to skip a range of > pages. > > Only assign zone->compact_cache_free_pfn if we actually > isolated free pages for compaction. > > Split out the calculation to get the start of the last page > block in a zone into its own, commented function. > > Signed-off-by: Rik van Riel Acked-by: Minchan Kim > --- > include/linux/mmzone.h |2 +- > mm/compaction.c| 30 ++ > mm/internal.h |6 +- > 3 files changed, 28 insertions(+), 10 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index e629594..e957fa1 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -370,7 +370,7 @@ struct zone { > spinlock_t lock; > int all_unreclaimable; /* All pages pinned */ > #if defined CONFIG_COMPACTION || defined CONFIG_CMA > - /* pfn where the last order > 0 compaction isolated free pages */ > + /* pfn where the last incremental compaction isolated free pages */ > unsigned long compact_cached_free_pfn; > #endif > #ifdef CONFIG_MEMORY_HOTPLUG > diff --git a/mm/compaction.c b/mm/compaction.c > index 2668b77..3812c3e 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -472,10 +472,11 @@ static void isolate_freepages(struct zone *zone, >* looking for free pages, the search will restart here as >* page migration may have returned some pages to the allocator >*/ > - if (isolated) > + if (isolated) { > high_pfn = max(high_pfn, pfn); > - if (cc->order > 0) > - zone->compact_cached_free_pfn = high_pfn; > + if (cc->order > 0) > + zone->compact_cached_free_pfn = high_pfn; > + } > } > > /* split_free_page does not map the pages */ > @@ -569,6 
+570,21 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>  	return ISOLATE_SUCCESS;
>  }
>
> +/*
> + * Returns the start pfn of the laste page block in a zone.

s/laste/last/
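A minimal userspace sketch of the rule in the isolate_freepages() hunk quoted above: the cached restart point is only moved when free pages were actually isolated, so a scan that isolates nothing cannot drag it down. The names struct zone_model and record_isolation() are hypothetical and exist only for this illustration; compact_cached_free_pfn, high_pfn and the order > 0 test are taken from the quoted diff.

```c
#include <stddef.h>

/* Hypothetical stand-in for the one struct zone field the hunk touches. */
struct zone_model {
	unsigned long compact_cached_free_pfn;
};

/*
 * Model of the patched block: called once per scanned page block.
 * Returns the (possibly raised) high_pfn. The cached pfn is updated
 * only when isolated != 0 and the compaction is for order > 0.
 */
unsigned long record_isolation(struct zone_model *zone,
			       unsigned long high_pfn, unsigned long pfn,
			       unsigned long isolated, int order)
{
	if (isolated) {
		if (pfn > high_pfn)
			high_pfn = pfn;	/* high_pfn = max(high_pfn, pfn) */
		if (order > 0)
			zone->compact_cached_free_pfn = high_pfn;
	}
	return high_pfn;
}
```

With the pre-patch placement, the assignment sat outside the if (isolated) block, so a pass with isolated == 0 could store a stale, too-low high_pfn; that is the case the review comment in the thread describes.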
RE: [PATCH 2/3] regulator: s2mps11: Fix wrong setting for config.dev
Hi, On Thursday, July 12, 2012 10:38 AM +0900, Axel Lin wrote: > > Signed-off-by: Axel Lin Acked-by: Sangbeom Kim Thanks, Sangbeom.
RE: [PATCH 1/3] regulator: s2mps11: Fixup missing commas
Hi!

On Thursday, July 12, 2012 10:36 AM +0900, Axel Lin wrote:
> Signed-off-by: Axel Lin

The s2mps11 regulator patch is based on the mfd/for-next branch. On that branch, some regulator features such as set_voltage_time_sel had not been applied yet, so at first I did not add features like set_voltage_time_sel and vsel_mask. I added them just before sending the patch. Anyway, it's my mistake.

Acked-by: Sangbeom Kim

Thanks,
Sangbeom.
Re: 82571EB: Detected Hardware Unit Hang
On 07/12/12 02:51, Dave, Tushar N wrote:
>
> Joe,
>
> I see couple of errors in lspci output.
> Device capability status register shows UnCorrectable PCIe error. This means
> there is certainly something went wrong. The only way to recover from
> Uncorrectable errors is reset.
>
> DevSta: CorrErr- *UncorrErr+ FatalErr+ UnsuppReq+ AuxPwr+ TransPend-
>
> Also AER sections in lspci output shows PCIe completion timeout.
>
> Capabilities: [100 v1] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- *CmpltTO+ CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-
>
> I suggest you should load AER driver and check for any error messages in log.
> Also please check any error message reported by system in BIOS log. Are there
> any machine check errors?
>
> When did you notice this issue? have 82571 ever been working before on this
> server?
>
> One more thing, Cache line size 256 is little unusual( I never seen this
> value before, mostly it's 64). Does BIOS settings have been changed? Are you
> using default BIOS setting?
>

I checked the BIOS log and found the fault reported for the device. I changed "PCI-E Payload Size" from 256 (the default) to 128, and now the device works.

Comparing the lspci output, I found that the address and data in the MSI capability have changed:

Old:
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Address: fee21000  Data: 40cb
New:
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Address: fee24000  Data: 405c

Is this most likely a BIOS bug? Please comment.

Thanks,
Joe
Re: [PATCH RFC 0/2] kvm: Improving directed yield in PLE handler
On Wed, 2012-07-11 at 14:23 +0300, Avi Kivity wrote:
> On 07/11/2012 02:16 PM, Alexander Graf wrote:
> >>
> >>> yes the data structure itself seems based on the algorithm
> >>> and not on arch specific things. That should work. If we move that to common
> >>> code then s390 will use that scheme automatically for the cases were we call
> >>> kvm_vcpu_on_spin(). All others archs as well.
> >>
> >> ARM doesn't have an instruction for cpu_relax(), so it can't intercept
> >> it. Given ppc's dislike of overcommit,
> >
> > What dislike of overcommit?
>
> I understood ppc virtualization is more of the partitioning sort.
> Perhaps I misunderstood it. But the reliance on device assignment, the
> restrictions on scheduling, etc. all point to it.

It historically was but that has changed quite a bit.

Essentially the user can configure partitions to be more of the "all virtualized" kind or on the contrary more fixed partitions.

The hypervisor does shared processors and we have paravirt APIs to cede our time slice to the lock holder.

> >> and the way it implements cpu_relax() by adjusting hw thread priority,
> >
> > Yeah, I don't think we can intercept relaxing.
>
> ... and the lack of ability to intercept cpu_relax() ...
>
> > It's basically a nop-like instruction that gives hardware hints on its
> > current priorities.
>
> That's what x86 PAUSE does. But we can intercept it (and not just any
> execution - we can restrict intercept to tight loops executed more than
> a specific number of times).
>
> > That said, we can always add PV code.
>
> Sure, but that's defeated by advancements like self-tuning PLE exits.
> It's hard to get this right.
>

Cheers,
Ben.
Re: [PATCH RFC 0/2] kvm: Improving directed yield in PLE handler
> ARM doesn't have an instruction for cpu_relax(), so it can't intercept
> it. Given ppc's dislike of overcommit, and the way it implements
> cpu_relax() by adjusting hw thread priority, I'm guessing it doesn't
> intercept those either, but I'm copying the ppc people in case I'm
> wrong. So it's s390 and x86.

No but our spinlocks call __spin_yield() (or __rw_yield) which does some paravirt tricks already.

We check if the holder is currently running, and if not, we call the H_CONFER hypercall which can be used to "give" our time slice to the holder.

Our implementation of H_CONFER in KVM is currently a nop though.

Cheers,
Ben.
Re: [PATCH v2 -mm] memcg: prevent from OOM with too many dirty pages
On Wed, 11 Jul 2012 18:57:43 -0700 (PDT) Hugh Dickins wrote:

> --- 3.5-rc6-mm1/mm/vmscan.c	2012-07-11 14:42:13.668335884 -0700
> +++ linux/mm/vmscan.c	2012-07-11 16:01:20.712814127 -0700
> @@ -726,7 +726,8 @@ static unsigned long shrink_page_list(st
>  			 * writeback from reclaim and there is nothing else to
>  			 * reclaim.
>  			 */
> -			if (!global_reclaim(sc) && PageReclaim(page))
> +			if (!global_reclaim(sc) && PageReclaim(page) &&
> +					may_enter_fs)
>  				wait_on_page_writeback(page);
>  			else {
>  				nr_writeback++;

um, that may_enter_fs test got removed because nobody knew why it was there. Nobody knew why it was there because it was undocumented. Do you see where I'm going with this?
ARM: why smp_mb() is not needed in the "__mutex_fastpath_lock" and "__mutex_fastpath_unlock" functions
Hello,

Why is smp_mb() not needed in the __mutex_fastpath_lock() and __mutex_fastpath_unlock() functions in arch/arm/include/asm/mutex.h? I think a "dmb" instruction is necessary there.
linux-next: manual merge of the net-next tree with the infiniband tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in include/linux/mlx4/device.h between commit 396f2feb05d7 ("mlx4_core: Implement mechanism for reserved Q_Keys") from the infiniband tree and commit 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API") from the net-next tree. Just context changes. I fixed it up (see below) and can carry the fix as necessary. -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc include/linux/mlx4/device.h index 441caf1,6f0d133..000 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@@ -540,83 -542,10 +573,85 @@@ struct mlx4_dev u8 rev_id; charboard_id[MLX4_BOARD_ID_LEN]; int num_vfs; + u64 regid_promisc_array[MLX4_MAX_PORTS + 1]; + u64 regid_allmulti_array[MLX4_MAX_PORTS + 1]; }; +struct mlx4_eqe { + u8 reserved1; + u8 type; + u8 reserved2; + u8 subtype; + union { + u32 raw[6]; + struct { + __be32 cqn; + } __packed comp; + struct { + u16 reserved1; + __be16 token; + u32 reserved2; + u8 reserved3[3]; + u8 status; + __be64 out_param; + } __packed cmd; + struct { + __be32 qpn; + } __packed qp; + struct { + __be32 srqn; + } __packed srq; + struct { + __be32 cqn; + u32 reserved1; + u8 reserved2[3]; + u8 syndrome; + } __packed cq_err; + struct { + u32 reserved1[2]; + __be32 port; + } __packed port_change; + struct { + #define COMM_CHANNEL_BIT_ARRAY_SIZE 4 + u32 reserved; + u32 bit_vec[COMM_CHANNEL_BIT_ARRAY_SIZE]; + } __packed comm_channel_arm; + struct { + u8 port; + u8 reserved[3]; + __be64 mac; + } __packed mac_update; + struct { + __be32 slave_id; + } __packed flr_event; + struct { + __be16 current_temperature; + __be16 warning_threshold; + } __packed warming; + struct { + u8 reserved[3]; + u8 port; + union { + struct { + __be16 mstr_sm_lid; + __be16 port_lid; + __be32 changed_attr; + u8 reserved[3]; + u8 mstr_sm_sl; + __be64 gid_prefix; + } __packed port_info; + struct { + __be32 block_ptr; + __be32 tbl_entries_mask; + } __packed tbl_change_info; + } params; + } 
__packed port_mgmt_change; + } event; + u8 slave_id; + u8 reserved3[2]; + u8 owner; +} __packed; + struct mlx4_init_port_param { int set_guid0; int set_node_guid; @@@ -783,6 -793,8 +908,10 @@@ int mlx4_wol_write(struct mlx4_dev *dev int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx); void mlx4_counter_free(struct mlx4_dev *dev, u32 idx); +int mlx4_get_parav_qkey(struct mlx4_dev *dev, u32 qpn, u32 *qkey); + + int mlx4_flow_attach(struct mlx4_dev *dev, +struct mlx4_net_trans_rule *rule, u64 *reg_id); + int mlx4_flow_detach(struct mlx4_dev *dev, u64 reg_id); + #endif /* MLX4_DEVICE_H */
linux-next: manual merge of the net-next tree with the infiniband tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in drivers/net/ethernet/mellanox/mlx4/main.c between commit 6634961c14d3 ("mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them") from the infiniband tree and commit 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API") from the net-next tree. Just context changes (I think). I have fixed it up (see below) and can carry the fix as necessary. -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc drivers/net/ethernet/mellanox/mlx4/main.c index 5df3ac4,4264516..000 --- a/drivers/net/ethernet/mellanox/mlx4/main.c +++ b/drivers/net/ethernet/mellanox/mlx4/main.c @@@ -1232,10 -1231,26 +1258,29 @@@ static int mlx4_init_hca(struct mlx4_de goto err_stop_fw; } + if (mlx4_is_master(dev)) + mlx4_parav_master_pf_caps(dev); + + priv->fs_hash_mode = MLX4_FS_L2_HASH; + + switch (priv->fs_hash_mode) { + case MLX4_FS_L2_HASH: + init_hca.fs_hash_enable_bits = 0; + break; + + case MLX4_FS_L2_L3_L4_HASH: + /* Enable flow steering with +* udp unicast and tcp unicast +*/ + init_hca.fs_hash_enable_bits = + MLX4_FS_UDP_UC_EN | MLX4_FS_TCP_UC_EN; + break; + } + profile = default_profile; + if (dev->caps.steering_mode == + MLX4_STEERING_MODE_DEVICE_MANAGED) + profile.num_mcg = MLX4_FS_NUM_MCG; icm_size = mlx4_make_profile(dev, &profile, &dev_cap, &init_hca);
Re: [PATCH 5/6] ftrace/x86: Add separate function to save regs
(2012/07/12 1:28), Steven Rostedt wrote:
> On Wed, 2012-07-11 at 12:22 -0400, Steven Rostedt wrote:
>> On Tue, 2012-07-03 at 17:29 +0900, Masami Hiramatsu wrote:
>>
>>> +	/* Restore flags */
>>> +	pushq EFLAGS(%rsp)
>>> +	popfq
>>> +
>>> +	MCOUNT_RESTORE_FRAME
>>>
>>> Here, if MCOUNT_RESTORE_FRAME has skip too, I think you don't
>>> need to restore flags before restoring other registers, like
>>> below;
>>>
>>> MCOUNT_RESTORE_FRAME 8
>>> popfq
>>>
>>> And also, this will prevent to modify flags before return by
>>> addq in MCOUNT_RESTORE_FRAME.
>>
>> Ah, because the addq will modify flags :-/
>>
>> Grumble, I guess I should implement this, even though it will make it a
>> little more complex. I thought it was better to restore flags
>> explicitly, but that's not the case.
>>
>
> I know why I did this. Do you want kprobes to be able to modify flags?
> If so, then I need to add, before the restore:
>
> movq EFLAGS(%rsp), %rax
> movq %rax, SS(%rsp)

Yes, kprobes might be used for modifying flags, so, please :)

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com
Re: Deadlocks due to per-process plugging
On Wed, 2012-07-11 at 22:16 +0200, Jan Kara wrote: > On Wed 11-07-12 12:05:51, Jeff Moyer wrote: > > Jan Kara writes: > > > > > Hello, > > > > > > we've recently hit a deadlock in our QA runs which is caused by the > > > per-process plugging code. The problem is as follows: > > > process A process B (kjournald) > > > generic_file_aio_write() > > > blk_start_plug(); > > > ... > > > somewhere in here we allocate memory and > > > direct reclaim submits buffer X for IO > > > ... > > > ext3_write_begin() > > > ext3_journal_start() > > > we need more space in a journal > > > so we want to checkpoint old transactions, > > > we block waiting for kjournald to commit > > > a currently running transaction. > > > journal_commit_transaction() > > > wait for IO on buffer X > > > to complete as it is part > > > of the current transaction > > > > > > => deadlock since A waits for B and B waits for A to do unplug. > > > BTW: I don't think this is really ext3/ext4 specific. I think other > > > filesystems can get into problems as well when direct reclaim submits some > > > IO and the process subsequently blocks without submitting the IO. > > > > So, I thought schedule would do the flush. Checking the code: > > > > asmlinkage void __sched schedule(void) > > { > > struct task_struct *tsk = current; > > > > sched_submit_work(tsk); > > __schedule(); > > } > > > > And sched_submit_work looks like this: > > > > static inline void sched_submit_work(struct task_struct *tsk) > > { > > if (!tsk->state || tsk_is_pi_blocked(tsk)) > > return; > > /* > > * If we are going to sleep and we have plugged IO queued, > > * make sure to submit it to avoid deadlocks. > > */ > > if (blk_needs_flush_plug(tsk)) > > blk_schedule_flush_plug(tsk); > > } > > > > This eventually ends in a call to blk_run_queue_async(q) after > > submitting the I/O from the plug list. Right? So is the question > > really why doesn't the kblockd workqueue get scheduled? > Ah, I didn't know this. Thanks for the hint. 
> So in the kdump I have I can
> see requests queued in tsk->plug despite the process is sleeping in
> TASK_UNINTERRUPTIBLE state. So the only way how unplug could have been
> omitted is if tsk_is_pi_blocked() was true. Rummaging through the dump...
> indeed task has pi_blocked_on = 0x8802717d79c8. The dump is from an -rt
> kernel (I just didn't originally thought that makes any difference) so
> actually any mutex is rtmutex and thus tsk_is_pi_blocked() is true whenever
> we are sleeping on a mutex. So this seems like a bug in rtmutex code.
> Thomas, you seemed to have added that condition... Any idea how to avoid
> the deadlock?

Tsk tsk, I completely overlooked sched_submit_work().

-Mike
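A small userspace model of the sched_submit_work() check quoted in this thread may help show why the plug is never flushed in the -rt case. It is only a sketch: struct task_model and plug_flushed_before_sleep() are hypothetical names, and the three fields stand in for tsk->state, tsk_is_pi_blocked(tsk) and blk_needs_flush_plug(tsk) respectively.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of what the quoted check reads. */
struct task_model {
	long state;          /* 0 = runnable, non-zero = about to sleep */
	bool pi_blocked;     /* models tsk_is_pi_blocked(): blocked on an rtmutex */
	bool has_plugged_io; /* models blk_needs_flush_plug(): requests in tsk->plug */
};

/*
 * Mirrors the control flow of the quoted sched_submit_work():
 * returns true when the plugged requests would be submitted
 * before the task goes to sleep.
 */
bool plug_flushed_before_sleep(const struct task_model *t)
{
	/* Early return: runnable tasks and PI-blocked tasks skip the flush. */
	if (!t->state || t->pi_blocked)
		return false;

	return t->has_plugged_io;
}
```

On a kernel where every mutex is an rtmutex, a task sleeping on a mutex has pi_blocked set, so the function returns false even though has_plugged_io is true. That matches the dump described above: requests remain queued in tsk->plug while the task sleeps.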
Re: [PATCH] resource: make sure requested range intersects root range
On Wed, Jul 11, 2012 at 06:26:49PM +0300, Purdila, Octavian wrote: > On Wed, Jul 11, 2012 at 5:54 PM, Ram Pai wrote: > > On Wed, Jul 11, 2012 at 02:06:10PM +0300, Purdila, Octavian wrote: > >> On Wed, Jul 11, 2012 at 5:09 AM, Ram Pai wrote: > >> > >> > > >> > Wait.. I am not sure this will fix the problem entirely. The above check > >> > will handle the case where the range requested is entirey out of the > >> > root's range. But if the requested range overlapps that of the root > >> > range, we will still call __reserve_region_with_split() and end up with > >> > a recursion if there is a overflow. Wont we? > >> > > >> > >> Good catch. I will fix this as well as address Andrew's and Joe's > >> comments in a new patch. The only question is how to handle the > >> overlap case: > >> > >> (a) abort the whole request or > >> > >> (b) try to reserve the part that overlaps (and adjust the request to > >> avoid the overflow) > >> > >> I think (b) is more in line with the current implementation for > >> reservations. > > > > > > I prefer (b). following patch should handle that. > > > > diff --git a/kernel/resource.c b/kernel/resource.c > > index e1d2b8e..dd87fde 100644 > > --- a/kernel/resource.c > > +++ b/kernel/resource.c > > @@ -780,6 +780,10 @@ static void __init __reserve_region_with_split(struct > > resource *root, > > > > if (conflict->start > start) > > __reserve_region_with_split(root, start, conflict->start-1, > > name); > > + > > + if (conflict->end == parent->end ) > > + return; > > + > > if (conflict->end < end) > > __reserve_region_with_split(root, conflict->end+1, end, > > name); > > } > > > > I don't think this covers all cases, e.g. if root range starts > somewhere above 0 and the request is below the root start point. __reserve_region_with_split() is expected to reserve all available requested range within the root's range. Correct? If that is the case, the above patch will reserve the range from the start of the root's range to the request's end? 
In other words whatever is overlapping and available. No?

>
> What about something like below? It is maybe too verbose, but it
> should make it easier to find the offender.
>
> diff --git a/kernel/resource.c b/kernel/resource.c
> index e1d2b8e..0d71983 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -788,8 +788,29 @@ void __init reserve_region_with_split(struct resource *root,
>  		resource_size_t start, resource_size_t end,
>  		const char *name)
>  {
> +	int abort = 0;
> +
>  	write_lock(&resource_lock);
> -	__reserve_region_with_split(root, start, end, name);
> +	if (!(root->start >= start && root->end >= end)) {

This is checking if the request overlaps with the beginning of the root's range?

> +		pr_err("Requested range (0x%llx-0x%llx) not in root %pr\n",
> +		       (unsigned long long)start, (unsigned long long)end,
> +		       root);
> +		if (start > root->end || end < root->start) {

and here it is checking if the requested range has no overlap with the root's range, which will always be false.

> +			abort = 1;
> +			pr_err("Unable to fix request, aborting\n");
> +		} else {
> +			if (end > root->end)
> +				end = root->end;
> +			else if (start < root->start)
> +				start = root->start;
> +			pr_err("Request trimmed to (0x%llx-0x%llx)\n",
> +			       (unsigned long long)start,
> +			       (unsigned long long)end);

Yes it is too verbose :), and feels wrong.

> +		}
> +		dump_stack();
> +	}
> +	if (!abort)
> +		__reserve_region_with_split(root, start, end, name);
>  	write_unlock(&resource_lock);
>  }

I think your original patch with Andrew's modification and my above proposal should solve the problem.

RP
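Option (b) from this thread, reserving only the part of the request that overlaps the root range, comes down to an interval intersection. The following is a minimal userspace sketch of that step under stated assumptions; it is not either of the patches proposed above. struct range, range_addr_t and clamp_to_root() are hypothetical names, and both ranges are treated as inclusive, like the start/end pair in struct resource.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's resource_size_t. */
typedef uint64_t range_addr_t;

/* Inclusive address range [start, end]. */
struct range {
	range_addr_t start;
	range_addr_t end;
};

/*
 * Trim *req to the part that lies inside *root.
 * Returns false and leaves *req untouched when the request is
 * malformed (start > end) or does not intersect the root at all.
 */
bool clamp_to_root(const struct range *root, struct range *req)
{
	if (req->start > req->end)
		return false;
	if (req->start > root->end || req->end < root->start)
		return false;

	/* Trim each side independently: a request may stick out on both. */
	if (req->start < root->start)
		req->start = root->start;
	if (req->end > root->end)
		req->end = root->end;
	return true;
}
```

Note that the two trims are separate if statements rather than an if / else if pair, so a request that extends past both ends of the root is cut down on both sides.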
Re: [PATCH v2 -mm] memcg: prevent from OOM with too many dirty pages
Hi Michal, On Wed, 20 Jun 2012, Michal Hocko wrote: > Hi Andrew, > here is an updated version if it is easier for you to drop the previous > one. > changes since v1 > * added Mel's Reviewed-by > * updated changelog as per Andrew > * updated the condition to be optimized for no-memcg case I mentioned in Johannes's [03/11] thread a couple of days ago, that I was having a problem with your wait_on_page_writeback() in mmotm. It turns out that your original patch was fine, but you let dark angels whisper into your ear, to persuade you to remove the "&& may_enter_fs". Part of my load builds kernels on extN over loop over tmpfs: loop does mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)) because it knows it will deadlock, if the loop thread enters reclaim, and reclaim tries to write back a dirty page, one which needs the loop thread to perform the write. With the may_enter_fs check restored, all is well. I don't entirely like your patch: I think it would be much better to wait in the same place as the wait_iff_congested(), when the pages gathered have been sent for writing and unlocked and putback and freed; and I also wonder if it should go beyond the !global_reclaim case for swap pages, because they don't participate in dirty limiting. But those are things I should investigate later - I did write a patch like that before, when I was having some unexpected OOM trouble with a private kernel; but my OOMs then were because of something silly that I'd left out, and I'm not at present sure if we have a problem in this regard or not. 
The important thing is to get the may_enter_fs back into your patch:
I can't really Sign-off the below because it's yours, but

Acked-by: Hugh Dickins
---
 mm/vmscan.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- 3.5-rc6-mm1/mm/vmscan.c     2012-07-11 14:42:13.668335884 -0700
+++ linux/mm/vmscan.c   2012-07-11 16:01:20.712814127 -0700
@@ -726,7 +726,8 @@ static unsigned long shrink_page_list(st
                         * writeback from reclaim and there is nothing else to
                         * reclaim.
                         */
-                       if (!global_reclaim(sc) && PageReclaim(page))
+                       if (!global_reclaim(sc) && PageReclaim(page) &&
+                                       may_enter_fs)
                                wait_on_page_writeback(page);
                        else {
                                nr_writeback++;
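To illustrate why the extra condition avoids the loop-over-tmpfs deadlock described above, here is a small user-space model of the decision. It is only a sketch: the flag values `SK_GFP_IO` and `SK_GFP_FS` and both helper names are made up for this example, and the `may_enter_fs` rule (filesystem entry allowed, or a swap-cache page with I/O allowed) is written from memory of how mm/vmscan.c computed it at the time, so treat it as an assumption rather than a quote.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real __GFP_IO/__GFP_FS values differ. */
#define SK_GFP_IO 0x1u
#define SK_GFP_FS 0x2u

/* Assumed rule: reclaim may enter the fs, or may do swap I/O. */
static bool may_enter_fs(unsigned int gfp_mask, bool page_in_swapcache)
{
	return (gfp_mask & SK_GFP_FS) ||
	       (page_in_swapcache && (gfp_mask & SK_GFP_IO));
}

/* Condition from the hunk above: wait only in memcg reclaim, and
 * only when the caller's gfp mask allows blocking on the fs. */
static bool should_wait_on_writeback(bool global_reclaim, bool page_reclaim,
				     unsigned int gfp_mask,
				     bool page_in_swapcache)
{
	return !global_reclaim && page_reclaim &&
	       may_enter_fs(gfp_mask, page_in_swapcache);
}
```

With a mask that has both bits cleared, as loop sets up for its mapping, the function returns false, so reclaim from the loop thread would count the page as under writeback instead of sleeping on a write that only the loop thread itself can complete.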
linux-next: manual merge of the scsi tree with the libata tree
Hi James, Today's linux-next merge of the scsi tree got a conflict in include/scsi/scsi_device.h between commits 166a2967b45e ("libata: tell scsi layer device supports runtime power off") and a4120295a40a ("sr: support zero power ODD") from the libata tree and commit 2516034c2270 ("[SCSI] set to WCE if usb cache quirk is present") from the scsi tree. Just context changes. I fixed it up (see below) and can carry the fix as necessary. -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc include/scsi/scsi_device.h index cfd951b,7539f52..000 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@@ -153,8 -154,7 +154,9 @@@ struct scsi_device unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */ unsigned try_rc_10_first:1; /* Try READ_CAPACACITY_10 first */ unsigned is_visible:1; /* is the device visible in sysfs */ + unsigned can_power_off:1; /* Device supports runtime power off */ + unsigned wakeup_by_user:1; /* user wakes up the ODD */ + unsigned wce_default_on:1; /* Cache is ON by default */ DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */ struct list_head event_list;/* asserted events */ pgpYLI04jv54I.pgp Description: PGP signature
[PATCH 3/3] regulator: s2mps11: Use sec_reg_write rather than sec_reg_update when mask is 0xff
Signed-off-by: Axel Lin
---
 drivers/regulator/s2mps11.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c
index b3c2705..4669dc9 100644
--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -280,8 +280,7 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev)
                        ramp_reg |= get_ramp_delay(s2mps11->ramp_delay2) >> 6;
                if (s2mps11->buck3_ramp || s2mps11->buck4_ramp)
                        ramp_reg |= get_ramp_delay(s2mps11->ramp_delay34) >> 4;
-               sec_reg_update(iodev, S2MPS11_REG_RAMP,
-                               ramp_reg | ramp_enable, 0xff);
+               sec_reg_write(iodev, S2MPS11_REG_RAMP, ramp_reg | ramp_enable);
        }
 
        ramp_reg &= 0x00;
@@ -289,7 +288,7 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev)
        ramp_reg |= get_ramp_delay(s2mps11->ramp_delay16) >> 4;
        ramp_reg |= get_ramp_delay(s2mps11->ramp_delay7810) >> 2;
        ramp_reg |= get_ramp_delay(s2mps11->ramp_delay9);
-       sec_reg_update(iodev, S2MPS11_REG_RAMP_BUCK, ramp_reg, 0xff);
+       sec_reg_write(iodev, S2MPS11_REG_RAMP_BUCK, ramp_reg);
 
        for (i = 0; i < S2MPS11_REGULATOR_MAX; i++) {
 
-- 
1.7.9.5
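The reasoning behind the subject line can be checked with a few lines of arithmetic. The sketch below assumes that a masked register update merges the old and new values in the usual read-modify-write way; `reg_update_value()` is a hypothetical helper written for this note, not the actual sec_reg_update() implementation. With a mask of 0xff on an 8-bit register, nothing of the old value survives, so the preceding read is wasted and a plain write gives the same result.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value that a typical masked update ends up writing:
 * keep the old bits outside the mask, take the new bits inside it.
 * (Hypothetical helper for illustration only.)
 */
static uint8_t reg_update_value(uint8_t old, uint8_t val, uint8_t mask)
{
	return (uint8_t)((old & ~mask) | (val & mask));
}
```

For a partial mask such as 0x0f the old upper nibble is preserved, which is the case where an update call is still the right tool.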
[PATCH 2/3] regulator: s2mps11: Fix wrong setting for config.dev
Currently s2mps11->iodev, s2mps11->dev and config.dev point to NULL. This patch fixes the settings for config.dev. Current code does not need the *dev and *iodev of struct s2mps11_info, so remove them. Signed-off-by: Axel Lin --- drivers/regulator/s2mps11.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c index da8c3d1..b3c2705 100644 --- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -24,8 +24,6 @@ #include struct s2mps11_info { - struct device *dev; - struct sec_pmic_dev *iodev; struct regulator_dev **rdev; int ramp_delay2; @@ -260,8 +258,6 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev) } rdev = s2mps11->rdev; - config.dev = >dev; - config.regmap = iodev->regmap; platform_set_drvdata(pdev, s2mps11); s2mps11->ramp_delay2 = pdata->buck2_ramp_delay; @@ -284,7 +280,7 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev) ramp_reg |= get_ramp_delay(s2mps11->ramp_delay2) >> 6; if (s2mps11->buck3_ramp || s2mps11->buck4_ramp) ramp_reg |= get_ramp_delay(s2mps11->ramp_delay34) >> 4; - sec_reg_update(s2mps11->iodev, S2MPS11_REG_RAMP, + sec_reg_update(iodev, S2MPS11_REG_RAMP, ramp_reg | ramp_enable, 0xff); } @@ -293,11 +289,11 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev) ramp_reg |= get_ramp_delay(s2mps11->ramp_delay16) >> 4; ramp_reg |= get_ramp_delay(s2mps11->ramp_delay7810) >> 2; ramp_reg |= get_ramp_delay(s2mps11->ramp_delay9); - sec_reg_update(s2mps11->iodev, S2MPS11_REG_RAMP_BUCK, ramp_reg, 0xff); + sec_reg_update(iodev, S2MPS11_REG_RAMP_BUCK, ramp_reg, 0xff); for (i = 0; i < S2MPS11_REGULATOR_MAX; i++) { - config.dev = s2mps11->dev; + config.dev = >dev; config.regmap = iodev->regmap; config.init_data = pdata->regulators[i].initdata; config.driver_data = s2mps11; @@ -305,8 +301,8 @@ static __devinit int s2mps11_pmic_probe(struct platform_device *pdev) rdev[i] = regulator_register([i], ); if 
(IS_ERR(rdev[i])) { ret = PTR_ERR(rdev[i]); - dev_err(s2mps11->dev, "regulator init failed for %d\n", - i); + dev_err(>dev, "regulator init failed for %d\n", + i); rdev[i] = NULL; goto err; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/3] regulator: s2mps11: Fixup missing commas
Signed-off-by: Axel Lin --- drivers/regulator/s2mps11.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c index 514cf54..da8c3d1 100644 --- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -88,7 +88,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_LDO_STEP1,\ .n_voltages = S2MPS11_LDO_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_L1CTRL + num - 1, \ - .vsel_mask = S2MPS11_LDO_VSEL_MASK \ + .vsel_mask = S2MPS11_LDO_VSEL_MASK,\ .enable_reg = S2MPS11_REG_L1CTRL + num - 1, \ .enable_mask= S2MPS11_ENABLE_MASK \ } @@ -102,7 +102,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_LDO_STEP2,\ .n_voltages = S2MPS11_LDO_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_L1CTRL + num - 1, \ - .vsel_mask = S2MPS11_LDO_VSEL_MASK \ + .vsel_mask = S2MPS11_LDO_VSEL_MASK,\ .enable_reg = S2MPS11_REG_L1CTRL + num - 1, \ .enable_mask= S2MPS11_ENABLE_MASK \ } @@ -117,7 +117,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_BUCK_STEP1, \ .n_voltages = S2MPS11_BUCK_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_B1CTRL2 + (num - 1) * 2, \ - .vsel_mask = S2MPS11_BUCK_VSEL_MASK\ + .vsel_mask = S2MPS11_BUCK_VSEL_MASK, \ .enable_reg = S2MPS11_REG_B1CTRL1 + (num - 1) * 2, \ .enable_mask= S2MPS11_ENABLE_MASK \ } @@ -132,7 +132,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_BUCK_STEP1, \ .n_voltages = S2MPS11_BUCK_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_B5CTRL2, \ - .vsel_mask = S2MPS11_BUCK_VSEL_MASK\ + .vsel_mask = S2MPS11_BUCK_VSEL_MASK, \ .enable_reg = S2MPS11_REG_B5CTRL1, \ .enable_mask= S2MPS11_ENABLE_MASK \ } @@ -147,7 +147,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_BUCK_STEP1, \ .n_voltages = S2MPS11_BUCK_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_B6CTRL2 + (num - 6) * 2, \ - .vsel_mask = S2MPS11_BUCK_VSEL_MASK\ + .vsel_mask = S2MPS11_BUCK_VSEL_MASK, \ .enable_reg = S2MPS11_REG_B6CTRL1 + (num - 6) * 2, \ 
.enable_mask= S2MPS11_ENABLE_MASK \ } @@ -162,7 +162,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_BUCK_STEP3, \ .n_voltages = S2MPS11_BUCK_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_B9CTRL2, \ - .vsel_mask = S2MPS11_BUCK_VSEL_MASK\ + .vsel_mask = S2MPS11_BUCK_VSEL_MASK, \ .enable_reg = S2MPS11_REG_B9CTRL1, \ .enable_mask= S2MPS11_ENABLE_MASK \ } @@ -177,7 +177,7 @@ static struct regulator_ops s2mps11_buck_ops = { .uV_step= S2MPS11_BUCK_STEP2, \ .n_voltages = S2MPS11_BUCK_N_VOLTAGES, \ .vsel_reg = S2MPS11_REG_B9CTRL2, \ - .vsel_mask = S2MPS11_BUCK_VSEL_MASK\ + .vsel_mask = S2MPS11_BUCK_VSEL_MASK, \ .enable_reg = S2MPS11_REG_B9CTRL1, \ .enable_mask= S2MPS11_ENABLE_MASK \ } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/4] zsmalloc: add details to zs_map_object boiler plate
On Wed, Jul 11, 2012 at 09:15:43AM -0500, Seth Jennings wrote: > On 07/11/2012 02:42 AM, Minchan Kim wrote: > > On 07/11/2012 12:17 AM, Seth Jennings wrote: > >> On 07/09/2012 09:35 PM, Minchan Kim wrote: > >>> Maybe we need local_irq_save/restore in zs_[un]map_object path. > >> > >> I'd rather not disable interrupts since that will create > >> unnecessary interrupt latency for all users, even if they > > > > Agreed. > > Although we guide k[un]map atomic is so fast, it isn't necessary > > to force irq_[enable|disable]. Okay. > > > >> don't need interrupt protection. If a particular user uses > >> zs_map_object() in an interrupt path, it will be up to that > >> user to disable interrupts to ensure safety. > > > > Nope. It shouldn't do that. > > Any user in interrupt context can't assume that there isn't any other user > > using per-cpu buffer > > right before interrupt happens. > > > > The concern is that if such bug happens, it's very hard to find a bug. > > So, how about adding this? > > > > void zs_map_object(...) > > { > > BUG_ON(in_interrupt()); > > } > > I not completely following you, but I think I'm following > enough. Your point is that the per-cpu buffers are shared > by all zsmalloc users and one user doesn't know if another > user is doing a zs_map_object() in an interrupt path. And vise versa is yes. > > However, I think what you are suggesting is to disallow > mapping in interrupt context. This is a problem for zcache > as it already does mapping in interrupt context, namely for > page decompression in the page fault handler. I don't get it. Page fault handler isn't interrupt context. > > What do you think about making the per-cpu buffers local to > each zsmalloc pool? That way each user has their own per-cpu > buffers and don't step on each other's toes. Maybe, It could be a solution if you really need it in interrupt context. But the concern is it could hurt zsmalloc's goal which is memory space efficiency if your system has lots of CPUs. 
>
> Thanks,
> Seth
Re: [PATCH 09/16] sched: normalize tg load contributions against runnable time
On Fri, Jul 06, 13:52, Peter Zijlstra wrote:
> This then yields:
>
>   P(\Union_{i=1..n} u_i) ~= \Sum_{k=1..n} (-1)^(k-1) (n choose k) u^k
>
> Which unfortunately isn't a series I found a sane solution for, but
> numerically (see below) we can see it very quickly approaches 1 when
> n >> 1.

Isn't this series just 1 - (1 - u)^n? So yes, it converges quickly to 1
if u is a probability.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe
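For reference, the closed form suggested in the reply above follows directly from the binomial theorem, under the same assumption as the quoted series that every u_i equals u:

```latex
\begin{align*}
(1-u)^n &= \sum_{k=0}^{n} \binom{n}{k} (-u)^k
         = 1 - \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} u^k \\
\Longrightarrow\quad
\sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} u^k &= 1 - (1-u)^n
\end{align*}
```

For 0 < u <= 1 the term (1-u)^n shrinks geometrically in n, which is why the sum approaches 1 so quickly; with u = 0.5 and n = 10, for instance, it is 1 - 2^{-10}, roughly 0.999.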
Re: [PATCH 00/13] rbtree updates
On Wed, Jul 11, 2012 at 6:23 AM, Peter Zijlstra wrote:
> Looks nice.. How about something like the below on top.. I couldn't
> immediately find a sane reason for the grand-parent to always be red in
> the insertion case.

Do you mean the case you marked XXX ? it is actually parent that is
red, which we know because we tested that a few lines earlier.

> @@ -85,12 +104,27 @@ void rb_insert_color(struct rb_node *nod
>               } else if (rb_is_black(parent))
>                       break;
>
> +             /*
> +              * XXX
> +              */
>               gparent = rb_red_parent(parent);

See :)

>               if (parent == gparent->rb_left) {
>                       tmp = gparent->rb_right;
>                       if (tmp && rb_is_red(tmp)) {
> -                             /* Case 1 - color flips */
> +                             /*
> +                              * Case 1 - color flips
> +                              *
> +                              *       G            g
> +                              *      / \          / \
> +                              *     p   u  -->   P   U
> +                              *    /            /
> +                              *   n            N
> +                              *
> +                              * However, since g's parent might be red, and
> +                              * 4) does not allow this, we need to recurse
> +                              * at g.
> +                              */

I like these diagrams - I initially didn't think they'd work well, given
the need for colors etc, but I now see that it's workable.

In __rb_erase_color(), some of the cases are more complicated than you
drew however, because some node colors aren't known. This is what I
ended up with:

  *  5), then the longest possible path due to 4 is 2B.
  *
  *  We shall indicate color with case, where black nodes are uppercase and red
- *  nodes will be lowercase.
+ *  nodes will be lowercase. Unknown color nodes shall be drawn as red with
+ *  some accompanying text comment.
  */

+                             /*
+                              * Case 2 - sibling color flip
+                              * (p could be either color here)
+                              *
+                              *     p             p
+                              *    / \           / \
+                              *   N   S    -->  N   s
+                              *      / \           / \
+                              *     Sl  Sr        Sl  Sr
+                              *
+                              * This leaves us violating 5), so
+                              * recurse at p. If p is red, the
+                              * recursion will just flip it to black
+                              * and exit. If coming from Case 1,
+                              * p is known to be red.
+                              */

+                             /*
+                              * Case 3 - right rotate at sibling
+                              * (p could be either color here)
+                              *
+                              *     p             p
+                              *    / \           / \
+                              *   N   S    -->  N   Sl
+                              *      / \             \
+                              *     sl  Sr            s
+                              *                        \
+                              *                         Sr
+                              */

+                             /*
+                              * Case 4 - left rotate at parent + color flips
+                              * (p and sl could be either color here.
+                              *  After rotation, p becomes black, s acquires
+                              *  p's color, and sl keeps its color)
+                              *
+                              *     p               s
+                              *    / \             / \
+                              *   N   S     -->   P   Sr
+                              *      / \         / \
+                              *     sl  sr      N   sl
+                              */

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [RFC internal PATCH] mfd/mc13xxx: drop modifying driver's id_table in probe
On Wed, Jul 11, 2012 at 01:36:48PM +0200, Uwe Kleine-König wrote:
> This was introduced in commit
>
>       876989d (mfd: Add device tree probe support for mc13xxx)
>
> for spi and later while introducing support for i2c copied to the i2c
> driver.
>
> Modifying driver details is very strange, for example probing an
> mc13892 device (instantiated via dt) removes the driver's ability to
> handle (traditionally probed) mc13783 devices in this case.
> I'm not aware of any problems that make this hack necessary and if
> there were some, they'd have to be fixed in the spi/i2c core, not in
> a driver.
>
> Signed-off-by: Uwe Kleine-König

The code was added by me, and it turns out the change is completely
unnecessary. So,

Acked-by: Shawn Guo

Regards,
Shawn
Re: [PATCH 0/4] zsmalloc improvements
On Wed, Jul 11, 2012 at 09:00:30AM -0500, Seth Jennings wrote: > On 07/11/2012 02:03 AM, Minchan Kim wrote: > > On 07/03/2012 06:15 AM, Seth Jennings wrote: > >> zsmapbench measures the copy-based mapping at ~560 cycles for a > >> map/unmap operation on spanned object for both KVM guest and bare-metal, > >> while the page table mapping was ~1500 cycles on a VM and ~760 cycles > >> bare-metal. The cycles for the copy method will vary with > >> allocation size, however, it is still faster even for the largest > >> allocation that zsmalloc supports. > >> > >> The result is convenient though, as mempcy is very portable :) > > > > Today, I tested zsmapbench in my embedded board(ARM). > > tlb-flush is 30% faster than copy-based so it's always not win. > > I think it depends on CPU speed/cache size. > > > > zram is already very popular on embedded systems so I want to use > > it continuously without 30% big demage so I want to keep our old approach > > which supporting local tlb flush. > > > > Of course, in case of KVM guest, copy-based would be always bin win. > > So shouldn't we support both approach? It could make code very ugly > > but I think it has enough value. > > > > Any thought? > > Thanks for testing on ARM. > > I can add the pgtable assisted method back in, no problem. > The question is by which criteria are we going to choose > which method to use? By arch (i.e. ARM -> pgtable assist, > x86 -> copy, other archs -> ?)? I prefer your previous version __HAVE_LOCAL_FLUSH_TLB_KERNEL_RANGE. If you didn't implement that function for x86, it simply uses memcpy version while ARM can use tlb flush version if we add the definary. Of course, it would be better to select best choice by testing benchmark for all of architecture but that architecture would be changed in future, too so we need further testing periodically. And we will have no time then, too. 
For reducing the burden, we can detect it automatically while module is loading or booting but it tackles with booting time. :( So, let's put it aside as further works. At the moment, let's think simply two arch(x86, ARM) until other arch user doesn't raise a hand for volunteering. Yes. it could be a problem in future if other arch which support local flush want to use memcpy but IMHO, it's very hard to kill two bird(portability and performance) with one stone. :( > > Also, what changes did you make to zsmapbench to measure > elapsed time/cycles on ARM? Afaik, rdtscll() is not > supported on ARM. I used local_clock instead of arch dependent code and makes longer test time from 1 sec to 10 sec. > > Thanks, > Seth > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majord...@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: mailto:"d...@kvack.org;> em...@kvack.org -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 3.5-rc6 printk formatting problem during oom-kill.
On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote: > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones wrote: > > > I noticed that the format of the oom-killer output seems to have > > > changed, and > > > now it spews stuff like.. > > > > > > [49461.758070] lowmem_reserve[]: > > > [49461.758071] 0 > > > [49461.758071] 2643 > > > [49461.758071] 3878 > > > [49461.758072] 3878 > > > [49461.758072] > > > [49461.758072] Node 0 > > > > > Does the oom-killer code need modifying, or the printk code ? > > > I know there's been some regressions in this area recently, but this is > > > still > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > This likely fixes it: > > > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > Let me check if it does, and if I can reproduce it. > > It looks fine here with the above mentioned patch: Now that that patch is in Linus tree, I've hit what's probably a different case. Look at the modules list in this oops.. [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! 
[trinity-child1:24295] [10016.470008] rose<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] nf_conntrack_h323<4>[10016.470008] girbil_sir<4>[10016.470008] ath9k_common<4>[10016.470008] hdlcdrv<4>[10016.470008] tun<4>[10016.470008] spcp8x5<4>[10016.470008] rc_streamzap<4>[10016.470008] rc_medion_x10<4>[10016.470008] gspca_mr97310a<4>[10016.470008] hid_multitouch<4>[10016.470008] fam15h_power<4>[10016.470008] sym53c8xx<4>[10016.470008] gunze<4>[10016.470008] pata_ns87410<4>[10016.470008] snd_ymfpci<4>[10016.470008] michael_mic<4>[10016.470008] blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan] [10016.470008] irq event stamp: 82066 [10016.470008] hardirqs last enabled at (82065): [] restore_args+0x0/0x30 [10016.470008] hardirqs last disabled at (82066): [] apic_timer_interrupt+0x6a/0x80 [10016.470008] softirqs last enabled at (82064): [] __do_softirq+0x13c/0x3e0 [10016.470008] softirqs last disabled at (82055): [] call_softirq+0x1c/0x30 [10016.470008] CPU 1 [10016.470008] Modules linked in:<4>[10016.470008] unix_diag<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] nf_conntrack_h323<4>[10016.470008] girbil_sir<4>[10016.470008] ath9k_common<4>[10016.470008] hdlcdrv<4>[10016.470008] tun<4>[10016.470008] spcp8x5<4>[10016.470008] rc_streamzap<4>[10016.470008] rc_medion_x10<4>[10016.470008] gspca_mr97310a<4>[10016.470008] 
hid_multitouch<4>[10016.470008] fam15h_power<4>[10016.470008] sym53c8xx<4>[10016.470008] gunze<4>[10016.470008] pata_ns87410<4>[10016.470008] snd_ymfpci<4>[10016.470008] michael_mic<4>[10016.470008] blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan] [10016.470008] [10016.470008] Pid: 24295, comm: trinity-child1 Tainted: G C 3.5.0-rc6+ #80 Dell Inc. Precision WorkStation 490/0DT031 [10016.470008] RIP: 0010:[] [] match_held_lock+0x190/0x190 [10016.470008] RSP: 0018:88016bdf1c10 EFLAGS: 0202 [10016.470008] RAX: 0001 RBX: 81021d43 RCX: 000f [10016.470008] RDX: 88016bdf1fd8 RSI: 81c2fb60 RDI: 81c2fb60 [10016.470008] RBP: 88016bdf1c38 R08: 000a R09: [10016.470008] R10: 0001 R11: 0001 R12: 81021db9 [10016.470008] R13: 88016bdf1b88 R14: 81021d43 R15: 88016bdf1b78 [10016.470008] FS: 7fbad33ad700() GS:88022680() knlGS: [10016.470008] CS: 0010 DS: ES:
Re: [PATCH] misc/pch_phub: Enable UART clock setting by module parameter
On Wed, Jul 11, 2012 at 7:45 PM, Arnd Bergmann wrote:
> This looks like a rather nonscalable solution if you get to systems
> with lots of clocks.

This "clock" is an internal clock, not an external clock. The PacketHub
provides the clock to the UART module. Both the PacketHub and the UART
are in a single-chip LSI, the EG20T. So a selectable clock of 1.8432MHz,
48MHz, 64MHz or 192MHz is enough.

> Given that you are doing it for the uart clock, shouldn't that be
> set from the uart driver using an ioctl like other serial ports do?

PacketHub is not a serial driver but a special driver, so an ioctl
doesn't suit PacketHub.

> What would be the use case for an end user to override the module
> parameter? Is it about platform specific settings or policy?

Here is a use case. Currently, the UART works with 1.8432MHz. With this
clock, as you know, the maximum speed is 115k. If a user wants to use
4M speed, the user needs to modify pch_phub.c by hand. If this patch is
applied, a user can specify uart_clock via a module parameter and use
4M speed.

My reference driver for this patch is drivers/tty/serial/pch_uart.c.
This driver can set uart_clock via a module parameter (user_uartclk).

Thanks.
-- 
ROHM Co., Ltd.
tomoya
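The link between the UART input clock and the reachable speeds in the reply above can be made concrete with a short calculation. This sketch assumes a 16550-style UART that samples each bit 16 times, so the fastest rate is the input clock divided by 16; the helper names and the rounding choice are invented for illustration and are not taken from pch_phub.c or pch_uart.c.

```c
#include <assert.h>

/* Assumption: 16x oversampling, as in 16550-compatible UARTs. */
#define UART_OVERSAMPLING 16UL

/* Fastest baud rate for a given input clock (divisor of 1). */
static unsigned long uart_max_baud(unsigned long uartclk_hz)
{
	return uartclk_hz / UART_OVERSAMPLING;
}

/*
 * Divisor for a requested baud rate, rounded to nearest.
 * Returns 0 when the rate cannot be reached with this clock.
 */
static unsigned long uart_divisor(unsigned long uartclk_hz,
				  unsigned long baud)
{
	if (baud == 0 || baud > uart_max_baud(uartclk_hz))
		return 0;
	return (uartclk_hz + (UART_OVERSAMPLING / 2) * baud) /
	       (UART_OVERSAMPLING * baud);
}
```

Under that assumption 1.8432MHz tops out at 115200 baud, while 64MHz gives exactly 4M baud with a divisor of 1, which matches the numbers quoted in the mail.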
Re: [RFC PATCH -tip ] tracing: make a snapshot feature available from userspace.
On 07/11/2012 12:39 AM, Hiraku Toyooka wrote: > Hello, Rob, > > Thank you very much for your advice. > > (2012/07/05 10:01), Rob Landley wrote: >> On 07/04/2012 05:47 AM, Hiraku Toyooka wrote: >>> Hello, Steven, >>> >>> I've sent below RFC patch, but still have no responses. This patch can >>> be applied to current tip tree. >>> >>> If you have time, could you give any comment about this patch? >> >> My familiarity with ftrace is "I saw a presentation on it at a >> conference a couple years ago", so I'm not the guy to comment on the >> advisability of the patch itself. >> >> As for the documentation: seems reasonable, could use some english >> polishing. >> +Snapshot + +If CONFIG_TRACER_MAX_TRACE is set, the (generic) snapshot +feature is available in all tracers except for the special +tracers which use a snapshot inside themselves(such as "irqsoff" +or "wakeup"). >> >> This is confusing, I'm guessing irqsoff and wakeup already have >> persistent buffers without CONFIG_TRACER_MAX_TRACE? (So there is a >> generic snapshot feature, but some tracers have their own internal >> buffer and don't use the generic one?) >> > > No, CONFIG_TRACER_MAX_TRACE is originally for making the spare buffer > available. Both some special tracers (such as irqsoff) and generic > snapshot uses the common spare buffer. But purpose of each snapshot is > different. In the special tracers, snapshot is used for recording > information related to max latency of something (such as longest > interrupt-disabled area). This is automatically updated only when the > max latency is detected. In other words, those special tracers use the > same spare buffer in the different way. Thus, we can not enable generic > snapshot for those tracers. Ok, those tracers are not _compatible_ with snapshot. Good to know. >> Is the fact that some tracers don't use this feature an important part >> of the description of the feature? Is making them use common code a todo >> item, or just a comment? 
>> > > Anyway, the fact is not important here. > > >> How about: >> >> CONFIG_TRACER_MAX_TRACE makes a generic snapshot feature available to >> all tracers. (Some tracers, such as "irqsoff" or "wakeup", use their own >> internal snapshot implementation.) >> > > Thanks, but I think the following one is more suitable. > > (Some tracers, such as "irqsoff" or "wakeup", already use the snapshot > implementation internally) This implies that setting flag is a NOP for them, rather than "if you take a snapshot, they'll stomp all over the buffer". +This enables to preserve trace buffer at a particular point in +time without stopping tracing. When a snapshot is taken, ftrace +swaps the current buffer with a spare buffer which is prepared +in advance. This means that the tracing itself continues on the +spare buffer. >> >> Snapshots preserve a trace buffer at a particular point in time without >> stopping tracing; ftrace swaps the current buffer with a spare buffer, >> and tracking continues in the spare buffer. >> +Following debugfs files in "tracing" directory are related with +this feature. >> >> The following debugfs files in "tracing" are related to this feature: >> + snapshot_enabled: + +This is used to set or display whether the snapshot is +enabled. Echo 1 into this file to prepare a spare buffer +or 0 to shrink it. So, the memory for the spare buffer +will be consumed only when this knob is set. >> >> Write 1 to this file to allocate a snapshot buffer, 0 to free it. >> > > I'll fix them. > > >> (Query: do you have to free the buffer after taking a snapshot below?) >> > > No, we don't always need to free the buffer, although we can free it > when the snapshot becomes unnecessary. We can also reuse the buffer if > we'd like to take the next snapshot. > (I'll add this description.) Actually I was worried about the lifetime rules for the buffer (when does it need to be disposed of, and who is responsible for doing so?) 
but it looks like ftrace only allows one trace to be going on in the entire system at any given time, so all this context is kernel global anyway... + snapshot_pipe: + +This is used to take a snapshot and to read the output +of the snapshot. Echo 1 into this file to take a +snapshot. Reads from this file is the same as the +"trace_pipe" file (described above "The File System" +section), so that both reads from the snapshot and +tracing are executable in parallel. >> >> Echo 1 into this file to take a snapshot, then read the snapshot from >> the file in the same format as "trace_pipe" (described above in the >> section "The File System"). >> > > I'll fix that. > >> Design questions left for the reader: why are allocating a snapshot >> buffer and taking a snapshot separate actions? > > I'll add following description: > Allocating a spare buffer and taking
feature-removal-schedule entry from 2009
IRQF_SAMPLE_RANDOM is 3 years past its sell-by date in
feature-removal-schedule:

What:   IRQF_SAMPLE_RANDOM
Check:  IRQF_SAMPLE_RANDOM
When:   July 2009
Why:    Many of IRQF_SAMPLE_RANDOM users are technically bogus as entropy
        sources in the kernel's current entropy model. To resolve this, every
        input point to the kernel's entropy pool needs to better document the
        type of entropy source it actually is. This will be replaced with
        additional add_*_randomness functions in drivers/char/random.c
Who:    Robin Getz & Matt Mackall

There are 12 remaining uses under drivers/ and 14 more under arch/, the
rest of the hits look like infrastructure implementing it. Should I run
those files through bother-maintainer.pl and try to get people to stop
it, or is there a plan underway I don't know about?

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation. Pick one.
Re: [GIT PULL] DT clk binding support for 3.6
On 20120707-12:03, Rob Herring wrote: > Mike, > > Please pull DT clk binding and highbank clk support for 3.6. The only > real change from 3.5 pull request is returning error values rather than > NULL to align with the rest of the clk framework. There's been a little > discussion but otherwise has been quiet. > Hi Rob, I agree that these patches have seen enough time on the list. I did find one problem in testing: the highbank clock registration function writes to hb_clk->hw.init, which is marked const. I fixed it up in the patch below. That fix is now squashed into the final patch in your series on the clk-next branch. Let me know if you have any objections. Regards, Mike diff --git a/drivers/clk/clk-highbank.c b/drivers/clk/clk-highbank.c index 2f61065..52fecad 100644 --- a/drivers/clk/clk-highbank.c +++ b/drivers/clk/clk-highbank.c @@ -277,7 +277,7 @@ static __init struct clk *hb_clk_init(struct device_node *node, const struct clk struct hb_clk *hb_clk; const char *clk_name = node->name; const char *parent_name; - struct clk_init_data init_data; + struct clk_init_data init; int rc; rc = of_property_read_u32(node, "reg", ); @@ -292,13 +292,14 @@ static __init struct clk *hb_clk_init(struct device_node *node, const struct clk of_property_read_string(node, "clock-output-names", _name); - hb_clk->hw.init = _data; - hb_clk->hw.init->name = clk_name; - hb_clk->hw.init->num_parents = 1; + init.name = clk_name; + init.ops = ops; + init.flags = 0; parent_name = of_clk_get_parent_name(node, 0); - hb_clk->hw.init->parent_names = _name; - hb_clk->hw.init->ops = ops; - hb_clk->hw.init->flags = 0; + init.parent_names = _name; + init.num_parents = 1; + + hb_clk->hw.init = clk = clk_register(NULL, _clk->hw); if (WARN_ON(IS_ERR(clk))) { > Rob > > The following changes since commit 6887a4131da3adaab011613776d865f4bcfb5678: > > Linux 3.5-rc5 (2012-06-30 16:08:57 -0700) > > are available in the git repository at: > > git://sources.calxeda.com/kernel/linux.git clk-for-3.6 > > 
for you to fetch changes up to 39a8e38a03823c3acaec02c6d7c551e268cb2139: > > clk: add highbank clock support (2012-07-01 17:04:45 -0500) > > > Grant Likely (2): > clk: add DT clock binding support > clk: add DT fixed-clock binding support > > Rob Herring (2): > dt: add clock binding doc to primecell bindings > clk: add highbank clock support > > .../devicetree/bindings/arm/primecell.txt |6 + > .../devicetree/bindings/clock/calxeda.txt | 17 + > .../devicetree/bindings/clock/clock-bindings.txt | 117 +++ > .../devicetree/bindings/clock/fixed-clock.txt | 21 ++ > arch/arm/Kconfig |1 + > arch/arm/boot/dts/highbank.dts | 91 +- > arch/arm/mach-highbank/Makefile|2 +- > arch/arm/mach-highbank/clock.c | 62 > arch/arm/mach-highbank/highbank.c |7 + > drivers/clk/Makefile |1 + > drivers/clk/clk-fixed-rate.c | 23 ++ > drivers/clk/clk-highbank.c | 345 > > drivers/clk/clk.c | 140 > drivers/clk/clkdev.c | 77 + > include/linux/clk-provider.h | 16 + > include/linux/clk.h| 19 ++ > 16 files changed, 881 insertions(+), 64 deletions(-) > create mode 100644 Documentation/devicetree/bindings/clock/calxeda.txt > create mode 100644 > Documentation/devicetree/bindings/clock/clock-bindings.txt > create mode 100644 Documentation/devicetree/bindings/clock/fixed-clock.txt > delete mode 100644 arch/arm/mach-highbank/clock.c > create mode 100644 drivers/clk/clk-highbank.c > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
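The const problem Mike describes in the reply above can be illustrated outside the kernel. What follows is only a minimal sketch, not the driver code: struct init_data_sketch, struct hw_sketch and hw_sketch_setup() are hypothetical stand-ins invented for this illustration, modelling just the fields named in the diff. It assumes, as the message states, that the init member is a pointer to const data, so the fields are filled in through a separate writable structure first and the pointer is assigned last.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-ins (NOT the kernel definitions) for the init data
 * and hw structures discussed in the thread. Only the fields that appear
 * in the diff are modelled.
 */
struct init_data_sketch {
    const char *name;
    const char **parent_names;
    unsigned int num_parents;
    unsigned long flags;
};

struct hw_sketch {
    /* Pointer to const: the pointee cannot be written through it. */
    const struct init_data_sketch *init;
};

/*
 * Fill a caller-provided, writable init structure, then publish it
 * through the const pointer. A write such as hw->init->name = name
 * would be rejected by the compiler because *hw->init is const.
 */
static void hw_sketch_setup(struct hw_sketch *hw,
                            struct init_data_sketch *init,
                            const char *name,
                            const char **parent_name)
{
    init->name = name;
    init->flags = 0;
    init->parent_names = parent_name;
    init->num_parents = 1;

    /* Converting to a pointer-to-const is implicit and always allowed. */
    hw->init = init;
}
```

In this sketch the caller owns the init structure, so it has to stay valid for as long as anything reads it through hw->init.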
[PATCH 2/4] copyleft-next: more project name updates Copyleft.next->copyleft-next
From: "Luis R. Rodriguez" This reflects the present gitorious.org name and reflects better with other foo-next git trees out there. --- ABOUT | 10 +- COPYLEFT.next |6 +++--- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/ABOUT b/ABOUT index 0e8ce0b..791000f 100644 --- a/ABOUT +++ b/ABOUT @@ -1,15 +1,15 @@ -Copyleft.next +copyleft-next = "Exploring ideas for a modified copyleft license can't hurt." - Richard M. Stallman, July 2012 -Copyleft.next is an experimental "-ng"-type modification of the GNU +copyleft-next is an experimental "-ng"-type modification of the GNU General Public License, version 3. (It is *not* a fork.) Contributions of patches, ideas, and criticism are welcome. The goal of this effort is to develop an improved strong copyleft free software license. Needless to say, no one should actually *use* a development -version of Copyleft.next as an actual license. Anyone interested in +version of copyleft-next as an actual license. Anyone interested in actually *using* a strong copyleft license for code will wish to use one or more of the following licenses: GNU GPLv2, GNU GPLv3, GNU AGPLv3, and/or later versions of those licenses. For more information, @@ -21,7 +21,7 @@ with (e.g.) Red Hat, or any other corporate entity.** Contributors are expected to participate in their individual capacity. All communications with journalists shall be handled by the -Copyleft.next Marketing Committee, which does not exist yet and +copyleft-next Marketing Committee, which does not exist yet and probably won't exist for at least another year or three. For the avoidance of doubt, Simon Phipps is not considered a journalist. @@ -36,7 +36,7 @@ existing (and future) versions of the GNU GPL, which addresses one of the FSF's concerns about modified versions of the GNU GPL. 
The meta-license from the FSF stated in its FAQ shall be the license -of all versions of the Copyleft.next license text (to the extent that +of all versions of the copyleft-next license text (to the extent that such versions retain any copyrightable material from versions of the GNU GPL in which the FSF has asserted copyright). Based on the FSF FAQ, that meta-license may be stated as follows: diff --git a/COPYLEFT.next b/COPYLEFT.next index a741985..1afa01e 100644 --- a/COPYLEFT.next +++ b/COPYLEFT.next @@ -1,4 +1,4 @@ - Copyleft.next + copyleft-next 0. Definitions. @@ -392,9 +392,9 @@ survive such relicensing. 14. New Versions of this License. The initial License Steward is [?]. The License Steward may publish -new versions of Copyleft.next. Each version will be given a +new versions of copyleft-next. Each version will be given a distinguishing version number. You may distribute a Covered Work under -the terms of the version of Copyleft.next under which the Program is +the terms of the version of copyleft-next under which the Program is licensed, or under the terms of any subsequent version published by the License Steward. -- 1.7.10.rc1.22.gf5241
[PATCH 3/4] copyleft-next: rename the file COPYLEFT.next to copyleft-next
From: "Luis R. Rodriguez" Also update the CONTRIBUTING to reflect the new file name change. --- CONTRIBUTING |2 +- COPYLEFT.next => copyleft-next |0 2 files changed, 1 insertion(+), 1 deletion(-) rename COPYLEFT.next => copyleft-next (100%) diff --git a/CONTRIBUTING b/CONTRIBUTING index d06f5da..8f214b1 100644 --- a/CONTRIBUTING +++ b/CONTRIBUTING @@ -1,7 +1,7 @@ Contributions of any sort (text suggestions, ideas, feedback, criticism) from all interested individuals are welcome and encouraged. -All original contributions to Copyleft.next are dedicated to the +All original contributions to copyleft-next are dedicated to the public domain to the maximum extent permitted by applicable law, pursuant to CC0. See CC0 for further details. diff --git a/COPYLEFT.next b/copyleft-next similarity index 100% rename from COPYLEFT.next rename to copyleft-next -- 1.7.10.rc1.22.gf5241
[PATCH 4/4] copyleft-next: embrace the Signed-off-by practice
From: "Luis R. Rodriguez" The idea is taken from Linus Torvalds's subsurface project [0] README file. The Signed-off-by is widely used in public projects and we stand to gain by making its usage more prevalent. The meaning of the Signed-off-by is borrowed from the Linux kernel's. [0] git://github.com/torvalds/subsurface.git Signed-off-by: Luis R. Rodriguez --- CONTRIBUTING | 30 ++ 1 file changed, 30 insertions(+) diff --git a/CONTRIBUTING b/CONTRIBUTING index 8f214b1..966366c 100644 --- a/CONTRIBUTING +++ b/CONTRIBUTING @@ -5,6 +5,36 @@ All original contributions to copyleft-next are dedicated to the public domain to the maximum extent permitted by applicable law, pursuant to CC0. See CC0 for further details. +Please either send me signed-off patches or a pull request with +signed-off commits. If you don't sign off on them, I will not accept +them. This means adding a line that says "Signed-off-by: Name " +at the end of each commit, indicating that you wrote the code and have +the right to pass it on as an open source patch. + +See: http://gerrit.googlecode.com/svn/documentation/2.0/user-signedoffby.html + +Also, please write good git commit messages. A good commit message +looks like this: + + Header line: explaining the commit in one line + + Body of commit message is a few lines of text, explaining things + in more detail, possibly giving some background about the issue + being fixed, etc etc. + + The body of the commit message can be several paragraphs, and + please do proper word-wrap and keep columns shorter than about + 74 characters or so. That way "git log" will show things + nicely even when it's indented. + + Reported-by: whoever-reported-it + Signed-off-by: Your Name + +where that header line really should be meaningful, and really should be +just one line. That header line is what is shown by tools like gitk and +shortlog, and should summarize the change in one readable line of text, +independently of the longer explanation. 
+ Contributions from individual free/libre/open source software project participants, regardless of their views on copyleft, and regardless of their opinions on existing licenses such as the GNU GPLv2 and its -- 1.7.10.rc1.22.gf5241
[PATCH 1/4] copyleft-next: remove issue tracker references
From: "Luis R. Rodriguez" This uses github, let's not confuse the focus for development for now. --- CONTRIBUTING | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/CONTRIBUTING b/CONTRIBUTING index 1db3cd2..d06f5da 100644 --- a/CONTRIBUTING +++ b/CONTRIBUTING @@ -47,15 +47,4 @@ Public Source Locations === Gitorious (https://gitorious.org/copyleft-next) is now the centralized -location of source for this project. It will be mirrored at a GitHub -repo (currently https://github.com/richardfontana/Copyleft.next) for -the time being, and perhaps indefinitely. - - -Issue Tracking -== - -The issue tracker associated with -https://github.com/richardfontana/Copyleft.next shall be kept open for the -time being, but at some point it is likely to be disabled. - +location of source for this project. -- 1.7.10.rc1.22.gf5241
[PATCH 0/4] copyleft-next: first set of patches
From: "Luis R. Rodriguez" Fontana, Here is my first series of patches against the new copyleft-next.git project [0]. These patches consist of a few cosmetic changes along with the idea of embracing the usage of the Signed-off-by tag. I've decided to use lkml since there is no mailing list set up yet and I could not think of a more public mailing list I could use. Maybe we could just use it for now, what's a little more lkml spam? I've decided to Cc bkuhn, perhaps he doesn't want to be, but oh well, he knows a lot on these matters. [0] git://gitorious.org/copyleft-next/copyleft-next.git Luis R. Rodriguez (4): copyleft-next: remove issue tracker references copyleft-next: more project name updates Copyleft.next->copyleft-next copyleft-next: rename the file COPYLEFT.next to copyleft-next copyleft-next: embrace the Signed-off-by practice ABOUT | 10 - CONTRIBUTING | 45 COPYLEFT.next => copyleft-next |6 +++--- 3 files changed, 40 insertions(+), 21 deletions(-) rename COPYLEFT.next => copyleft-next (99%) -- 1.7.10.rc1.22.gf5241
Re: [PATCHv3 1/4] staging: OMAP4+: thermal: introduce bandgap temperature sensor
On Wed, Jul 11, 2012 at 11:41:06PM +0300, Eduardo Valentin wrote: > In the System Control Module, OMAP supplies a voltage reference > and a temperature sensor feature that are gathered in the band > gap voltage and temperature sensor (VBGAPTS) module. The band > gap provides current and voltage reference for its internal > circuits and other analog IP blocks. The analog-to-digital > converter (ADC) produces an output value that is proportional > to the silicon temperature. > > This patch provides a platform driver which expose this feature. > It is moduled as a MFD child of the System Control Module core > MFD driver. > > This driver provides only APIs to access the device properties, > like temperature, thresholds and update rate. > > Signed-off-by: Eduardo Valentin > Signed-off-by: J Keerthy This patch gives me the following build error: drivers/staging/omap-thermal/omap-bandgap.c: In function ‘omap_bandgap_build’: drivers/staging/omap-thermal/omap-bandgap.c:805:2: error: implicit declaration of function ‘of_match_device’ [-Werror=implicit-function-declaration] drivers/staging/omap-thermal/omap-bandgap.c:805:8: warning: assignment makes pointer from integer without a cast [enabled by default] So of course I can't accept it :( How hard is it to test that the patches build before sending them to me? ugh, greg k-h
Re: [GIT] selinux: fix regression
On Wed, 11 Jul 2012, Andrew Morton wrote: > The patch was authored by eparis, not me. I don't even know what it does (I > never looked). But it lets me log into my (old) Fedora test box, which > is a distinct improvement over mainline. Ok, it needs his signoff, then. Not sure why it doesn't already ? - James -- James Morris
Re: [PATCH 1/4] zsmalloc: remove x86 dependency
On 07/11/2012 05:42 PM, Nitin Gupta wrote: > On Wed, Jul 11, 2012 at 1:32 PM, Seth Jennings > wrote: >> On 07/11/2012 01:26 PM, Nitin Gupta wrote: >>> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On >>> zs_unmap_object() we would just do the reverse and restore objects as in >>> figure-1. >> >> Hey Nitin, thanks for the feedback. >> >> Correct me if I'm wrong, but it seems like you wouldn't be able to map >> ob2 while ob1 was mapped with this design. You'd need some sort of >> zspage level protection against concurrent object mappings. The >> code for that protection might cancel any benefit you would gain by >> doing it this way. >> > > Do you think blocking access of just one particular object (or > blocking an entire zspage, for simplicity) for a short time would be > an issue, apart from the complexity of implementing per zspage > locking? It would only need to prevent the mapping of the temporarily displaced object, but I said zspage because I don't know how we would do per-object locking. I actually don't know how we would do zspage locking either unless there is a lock in the struct page we can use. Either way, I think it is a complexity we'd be better off avoiding for now. I'm trying to get zsmalloc in shape to bring into mainline, so I'm really focusing on portability first and low hanging performance fruit second. This optimization would be more like top-of-the-tree performance fruit :-/ However, if you want to try it out, don't let me stop you :) Thanks, Seth
Re: [PATCH] dma-fence: dma-buf synchronization
On Wed, Jul 11, 2012 at 6:49 PM, Maarten Lankhorst wrote: > Op 12-07-12 00:29, Rob Clark schreef: >> From: Rob Clark >> >> A dma-fence can be attached to a buffer which is being filled or consumed >> by hw, to allow userspace to pass the buffer without waiting to another >> device. For example, userspace can call page_flip ioctl to display the >> next frame of graphics after kicking the GPU but while the GPU is still >> rendering. The display device sharing the buffer with the GPU would >> attach a callback to get notified when the GPU's rendering-complete IRQ >> fires, to update the scan-out address of the display, without having to >> wake up userspace. >> >> A dma-fence is transient, one-shot deal. It is allocated and attached >> to dma-buf's list of fences. When the one that attached it is done, >> with the pending operation, it can signal the fence removing it from the >> dma-buf's list of fences: >> >> + dma_buf_attach_fence() >> + dma_fence_signal() >> >> Other drivers can access the current fence on the dma-buf (if any), >> which increment's the fences refcnt: >> >> + dma_buf_get_fence() >> + dma_fence_put() >> >> The one pending on the fence can add an async callback (and optionally >> cancel it.. for example, to recover from GPU hangs): >> >> + dma_fence_add_callback() >> + dma_fence_cancel_callback() >> >> Or wait synchronously (optionally with timeout or from atomic context): >> >> + dma_fence_wait() > Waiting for an undefined time from atomic context is probably > not a good idea. However just checking non-blocking if the fence > has passed would be fine. yeah, the intention was to use short timeout or no-blocking if from atomic ctxt, or interruptible with whatever timeout if non-atomic (for example, to implement a CPU_PREP sort of ioctl) >> A default software-only implementation is provided, which can be used >> by drivers attaching a fence to a buffer when they have no other means >> for hw sync. 
But a memory backed fence is also envisioned, because it >> is common that GPU's can write to, or poll on some memory location for >> synchronization. For example: >> >> fence = dma_buf_get_fence(dmabuf); >> if (fence->ops == _dma_fence_ops) { >> dma_buf *fence_buf; >> mem_dma_fence_get_buf(fence, _buf, ); >> ... tell the hw the memory location to wait on ... >> } else { >> /* fall-back to sw sync */ >> dma_fence_add_callback(fence, my_cb); >> } > This will probably have to be done on dma-buf attach time instead, > so drivers that support both know if an interrupt needs to be inserted > in the command stream or not. probably a hint, ie. add a flags parameter to attach() would do the job? >> The memory location is itself backed by dma-buf, to simplify mapping >> to the device's address space, an idea borrowed from Maarten Lankhorst. >> >> NOTE: the memory location fence is not implemented yet, the above is >> just for explaining how it would work. >> >> On SoC platforms, if some other hw mechanism is provided for synchronizing >> between IP blocks, it could be supported as an alternate implementation >> with it's own fence ops in a similar way. >> >> The other non-sw implementations would wrap the add/cancel_callback and >> wait fence ops, so that they can keep track if a device not supporting >> hw sync is waiting on the fence, and in this case should arrange to > Standardizing an errno in case the device already signalled the fence > would be nice. I was just using EINVAL, but perhaps there is a better choice? >> call dma_fence_signal() at some point after the condition has changed, >> to notify other devices waiting on the fence. If there are no sw >> waiters, this can be skipped to avoid waking the CPU unnecessarily. > Can this be done inside interrupt context? I could insert some > semaphores into intel that would block execution, but I would > save a context switch if intel could release the command blocking > from inside irq context. 
yeah, it was the intention that signal() could be from irq handler directly (and that registered cb's can be called from atomic ctxt.. which is sufficient if they just have to bang a register or two, otherwise they can schedule a worker) >> The intention is to provide a userspace interface (presumably via eventfd) >> later, to be used in conjunction with dma-buf's mmap support for sw access >> to buffers (or for userspace apps that would prefer to do their own >> synchronization). > I'll have to look at this more in the morning but I see no barrier for > this being used with dmabufmgr right now. > > The fence lock should probably not be static but shared with the > dmabufmgr code, with _locked variants. > Oh and in your example code I noticed inconsistent use of spin_lock > and spin_lock_irqsave, do you intend it to be used in hardirq context? oh, whoops, I started w/ spin_lock() and then realized I wanted signal() from irq handlers and forgot to update all the other places where spin_lock() was used
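The one-shot behaviour discussed in this thread (attach callbacks, optionally cancel them, signal exactly once) can be sketched in plain userspace C. This is not the proposed dma-fence code: every fence_sketch_* name is hypothetical, the fixed-size callback table is an arbitrary simplification, and the sketch is single-threaded, so it leaves out the spin_lock()/spin_lock_irqsave() question raised above. Returning -EINVAL for an already-signalled fence simply mirrors the placeholder mentioned in the thread and is not a settled choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FENCE_SKETCH_MAX_CBS 8   /* arbitrary limit for this sketch */

struct fence_sketch;
typedef void (*fence_sketch_fn)(struct fence_sketch *f, void *data);

/* Hypothetical one-shot fence: a flag plus a small callback table. */
struct fence_sketch {
    bool signaled;
    size_t ncbs;
    struct {
        fence_sketch_fn fn;
        void *data;
    } cbs[FENCE_SKETCH_MAX_CBS];
};

static void fence_sketch_init(struct fence_sketch *f)
{
    f->signaled = false;
    f->ncbs = 0;
}

/*
 * Register a callback to run when the fence is signalled.
 * Returns 0 on success, -EINVAL if the fence has already been signalled
 * (the caller can proceed immediately), -ENOSPC if the table is full.
 */
static int fence_sketch_add_callback(struct fence_sketch *f,
                                     fence_sketch_fn fn, void *data)
{
    if (f->signaled)
        return -EINVAL;
    if (f->ncbs == FENCE_SKETCH_MAX_CBS)
        return -ENOSPC;
    f->cbs[f->ncbs].fn = fn;
    f->cbs[f->ncbs].data = data;
    f->ncbs++;
    return 0;
}

/*
 * Remove a pending callback (for example when recovering from a hang).
 * Returns 0 if it was removed, -ENOENT if no matching entry is pending.
 */
static int fence_sketch_cancel_callback(struct fence_sketch *f,
                                        fence_sketch_fn fn, void *data)
{
    for (size_t i = 0; i < f->ncbs; i++) {
        if (f->cbs[i].fn == fn && f->cbs[i].data == data) {
            f->cbs[i] = f->cbs[f->ncbs - 1];   /* fill the hole */
            f->ncbs--;
            return 0;
        }
    }
    return -ENOENT;
}

/*
 * Signal the fence once: run every pending callback, then drop them.
 * A second signal is rejected with -EINVAL, which makes it one-shot.
 */
static int fence_sketch_signal(struct fence_sketch *f)
{
    if (f->signaled)
        return -EINVAL;
    f->signaled = true;
    for (size_t i = 0; i < f->ncbs; i++)
        f->cbs[i].fn(f, f->cbs[i].data);
    f->ncbs = 0;
    return 0;
}

/* Example callback: counts invocations through an int passed as data. */
static void fence_sketch_count(struct fence_sketch *f, void *data)
{
    (void)f;
    ++*(int *)data;
}
```

A real implementation would need a lock that is safe to take from the signalling context, which is exactly the point under discussion above; the sketch only shows the state machine.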