Re: [PATCH] ARM: OMAP2+: am335x-bone*: add DT for BeagleBone Black
On Sep 10, 2013, at 12:45 AM, "Koen Kooi" wrote: > > Op 10 sep. 2013, om 01:42 heeft Joel Fernandes het volgende > geschreven: > >> On 09/09/2013 03:12 PM, Joel Fernandes wrote: >>> On 09/09/2013 03:00 PM, Koen Kooi wrote: Op 9 sep. 2013, om 21:50 heeft Joel Fernandes het volgende geschreven: > On 09/09/2013 01:51 PM, Joel Fernandes wrote: >> On 09/09/2013 01:43 PM, Koen Kooi wrote: >>> >>> Op 9 sep. 2013, om 20:27 heeft Joel Fernandes het >>> volgende geschreven: >>> On 09/09/2013 10:51 AM, Koen Kooi wrote: > > Op 9 sep. 2013, om 17:23 heeft Kevin Hilman het > volgende geschreven: > >> Koen Kooi writes: >> >>> The BeagleBone Black is basically a regular BeagleBone with eMMC >>> and HDMI added, >>> so create a common dtsi both can use. >>> >>> IMPORTANT: booting the existing am335x-bone.dts will blow up the >>> HDMI transceiver >>> after a dozen boots with an uSD card inserted because LDO will be >>> at 3.3V instead >>> of 1.8. >>> >>> MMC support for AM335x still isn't in, so only the LDO change has >>> been added. >>> >>> Cc: sta...@vger.kernel.org >>> >>> Signed-off-by: Koen Kooi >>> Tested-by: Tom Rini >>> Tested-by: Matt Porter >> >> I guess the subject should've included v5? > > Yes, I blame it on being monday :) The series you're posting will require rebasing on the current MMC DT series that is being discussed last couple of weeks on the mailing list which were waiting until now as DMA support was missing. Now that DMA support is pulled in, it is safe to apply those patches so I will be reposting them shortly. Please hold off any changes until those patches are posted. This will avoid unnecessary conflicts. >>> >>> Or you can rebase on top of this patch since it has no dependencies >>> *and* fixes blowing up boards. FWIW, git-rebase, git-cherry-pick and >>> git-am -3 track the rename in my patch just fine. >> >> That's fair enough, since Kevin Acked and Benoit is pulling it, I'm >> fine with >> rebasing on top of it and we avoid any merge conflicts. 
> > I noticed - there were still some comments from Felipe on the v4 series > of this > patch regarding RF cape and HDMI may be breaking it. How are you > addressing that? Capes will never go into the .dts and HDMI support needs some serious patching before it can get enabled in the DT. And the RF cape isn't being sold since it has no sw support. No need to worry about things in the 3.15/3.16 timeframe. Unless you want this LDO3 fix not to go in ASAP. Joel, is there anything relevant *right now* blocking this patch going in? If not, please test it and add your Tested-by: line. >>> >>> We don't merge things in hurry and focus is to do things the right way.. I >>> just >>> want to make sure that all possible comments have been addressed. >>> >>> Otherwise patch looks OK and hope everyone else thinks so too. I am dealing >>> with >>> some merge conflicts right now with my series on top of this though, but >>> they >>> should be easy enough to fix up. That's delaying my testing, but otherwise >>> as >>> such I don't have any objection to this patch (provided the conclusion is >>> that >>> all comments have been addressed..). Thanks! >> >> Koen, >> >> One note though, since I don't use HDMI (or BBB much for that matter), I was >> ok >> with taking a risk of upping the ldo3 regulator voltage to 3.3v on my board >> which I needed to do to get to the boot prompt. >> I applied my AM335x DMA and MMC patches and tried to boot with rootfs as >> MMC1. >> >> With 1.8v, I get the following during boot: >> [2.236043] mmc0: host doesn't support card's voltages >> [2.241659] mmc0: error -22 whilst initialising SD card >> >> That's strange because I do have an SDHC card. With 3.3v it works fine. >> >> I will add a note about this to my series. Since this more of an MMC issue >> than >> anything, and your patch series doesn't enable MMC, you can add my tested-by: >> >> Tested-by: Joel Fernandes >> >> Later on, the regulator voltage may need to be tweaked for MMC support. 
>
> See https://lkml.org/lkml/2013/9/6/95 and https://lkml.org/lkml/2013/9/6/183

OK, since you said in the above thread that you'll rebase the card detect and the regulator fixes, I'll let you do that and drop my hacked mmc1 patch for BBB from my series.

Regards,
-Joel
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 10:39 PM, Wei Ni wrote: On 09/10/2013 12:50 PM, Guenter Roeck wrote: On 09/09/2013 09:05 PM, Wei Ni wrote: On 09/10/2013 04:39 AM, Mark Brown wrote: * PGP Signed by an unknown key On Mon, Sep 09, 2013 at 09:17:35AM -0700, Guenter Roeck wrote: On Mon, Sep 09, 2013 at 05:02:37PM +0100, Mark Brown wrote: It does, though it gets complicated trying to use it for a case like this since you can't really tell if the regulator was powered on immediately before the device got probed by another device on the bus. Why not ? Just keep a timestamp. The support is a callback on state changes; we could keep a timestamp but there's still going to be race conditions around bootloaders. It's doable though. On a higher level, I wonder if such functionality should be added in the i2c subsystem and not in i2c client drivers. Has anyone thought about this ? I'm not sure what the subsystem would do for such delays? It's fairly common for things that need this to also want to do things like manipulate GPIOs as part of the power on sequence so the applicability is relatively limited, plus it's not even I2C specific, the same applies to other buses so it ought to be a driver core thing. Possibly. I just thought about i2c since it also takes care of basic devicetree bindings. Something along the line of if devicetree bindings for this device declare one or more regulators, enable those regulators before calling the driver probe function. That's definitely a driver core thing, not I2C - there's nothing specific to I2C in there at all, needing power is pretty generic. I have considered this before, something along the lines of what we have for pinctrl, but unfortunately the generic case isn't quite generic enough to make it easy. It'd need to be an explicit list of regulators (partly just to make it opt in and avoid breaking things) and you'd want to have a way of handling the different suspend/resume behaviour that devices want. There's a few patterns there. 
It's definitely something I think about from time to time and it would be useful to factor things out, the issue is getting a good enough model of what's going on. There was some work on a generic helper for power on sequences but it stalled since it wasn't accepted for the original purpose (LCD panel power ons IIRC). Too bad. I think it could be kept quite simple, though, by handling it through the regulator subsystem as suggested above. A generic binding for a per-regulator and per-device poweron delay should solve that and possibly even make it transparent to the actual driver code. Lots of things have a GPIO for reset too, and some want clocks too. For maximum usefulness this should be cross subsystem. I suspect the reset controller API may be able to handle some of it. The regulator power on delays are already handled transparently, by the time regulator_enable() returns the ramp should be finished. I think the regulator should encoded its own startup delay. Each individual device should handle its own requirements for delay after power is stable. The regulator_enable() will handle the delays for the regulator device. And adding the msleep(25) is for lm90 device. If without delay, sometimes the device can't work properly. If read lm90 register immediately after enabling regulator, the reading may be failed. I'm not sure if 25ms is the right value, I read the LM90 SPEC, the max of "SMBus Clock Low Time" is 25ms, so I supposed that it may need about 25ms to stable after power on. Problem is that you are always waiting, even if the same regulator was turned on already, and even if it is a dummy regulator. Imagine every driver doing that. Booting would take forever, just because of unnecessary delays all over the place. There has to be a better solution which does not include a mandatory and potentially unnecessary wait time in the driver. At a previous company we had a design with literally dozens of those chip. 
>> You really want to force such a boot delay on every user ?
>>
>> But essentially you don't even know if it is needed; you are just guessing.
>> That is not an acceptable reason to add such a delay, mandatory or not.
>
> I think the device needs time to stabilize after power on, but it's
> difficult to get an exact delay value, and this delay may also depend on
> the platform design, so how about adding an optional property in the DT
> node, such as "power-on-delay-ms" ?

Possibly, but that still doesn't solve the problem that you are going to wait even if the regulator was already turned on.

Simple example: A system with two sensors, both of which share the same regulator. Each of them will require a delay after turning on power, but only if it was just turned on and not if it was already active.

Guenter
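The two-sensor case above can be sketched in plain userspace C. This is only a model of the "keep a timestamp" idea raised earlier in the thread, not regulator framework code: struct supply, supply_enable() and supply_remaining_wait_ms() are hypothetical names, time is passed in explicitly as milliseconds, and the bootloader race Mark mentions (a supply that was already on before the kernel started) is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one supply shared by several devices. */
struct supply {
	bool enabled;
	uint64_t enabled_at_ms;	/* time of the last off -> on transition */
};

/* Enable the supply; the timestamp is only taken on a real off -> on edge. */
static void supply_enable(struct supply *s, uint64_t now_ms)
{
	if (!s->enabled) {
		s->enabled = true;
		s->enabled_at_ms = now_ms;
	}
}

/*
 * How long a device that needs delay_ms of stable power still has to wait.
 * Returns 0 once the supply has been on for at least delay_ms.
 */
static uint64_t supply_remaining_wait_ms(const struct supply *s,
					 uint64_t now_ms, uint64_t delay_ms)
{
	uint64_t elapsed;

	if (!s->enabled)
		return delay_ms;	/* not powered yet: full delay applies */

	elapsed = now_ms - s->enabled_at_ms;
	return elapsed >= delay_ms ? 0 : delay_ms - elapsed;
}
```

With this model the first sensor to probe waits the full 25 ms, while a second sensor probing 10 ms later on the same supply only waits the remaining 15 ms, and a sensor probing long after power-up does not wait at all.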
Re: [PATCH] ARM: OMAP2+: am335x-bone*: add DT for BeagleBone Black
Op 10 sep. 2013, om 01:42 heeft Joel Fernandes het volgende geschreven: > On 09/09/2013 03:12 PM, Joel Fernandes wrote: >> On 09/09/2013 03:00 PM, Koen Kooi wrote: >>> >>> Op 9 sep. 2013, om 21:50 heeft Joel Fernandes het volgende >>> geschreven: >>> On 09/09/2013 01:51 PM, Joel Fernandes wrote: > On 09/09/2013 01:43 PM, Koen Kooi wrote: >> >> Op 9 sep. 2013, om 20:27 heeft Joel Fernandes het >> volgende geschreven: >> >>> On 09/09/2013 10:51 AM, Koen Kooi wrote: Op 9 sep. 2013, om 17:23 heeft Kevin Hilman het volgende geschreven: > Koen Kooi writes: > >> The BeagleBone Black is basically a regular BeagleBone with eMMC and >> HDMI added, >> so create a common dtsi both can use. >> >> IMPORTANT: booting the existing am335x-bone.dts will blow up the >> HDMI transceiver >> after a dozen boots with an uSD card inserted because LDO will be at >> 3.3V instead >> of 1.8. >> >> MMC support for AM335x still isn't in, so only the LDO change has >> been added. >> >> Cc: sta...@vger.kernel.org >> >> Signed-off-by: Koen Kooi >> Tested-by: Tom Rini >> Tested-by: Matt Porter > > I guess the subject should've included v5? Yes, I blame it on being monday :) >>> >>> The series you're posting will require rebasing on the current MMC DT >>> series >>> that is being discussed last couple of weeks on the mailing list which >>> were >>> waiting until now as DMA support was missing. Now that DMA support is >>> pulled in, >>> it is safe to apply those patches so I will be reposting them shortly. >>> >>> Please hold off any changes until those patches are posted. This will >>> avoid >>> unnecessary conflicts. >> >> Or you can rebase on top of this patch since it has no dependencies >> *and* fixes blowing up boards. FWIW, git-rebase, git-cherry-pick and >> git-am -3 track the rename in my patch just fine. >> > > That's fair enough, since Kevin Acked and Benoit is pulling it, I'm fine > with > rebasing on top of it and we avoid any merge conflicts. 
> I noticed - there were still some comments from Felipe on the v4 series of this patch regarding RF cape and HDMI may be breaking it. How are you addressing that? >>> >>> Capes will never go into the .dts and HDMI support needs some serious >>> patching before it can get enabled in the DT. And the RF cape isn't being >>> sold since it has no sw support. No need to worry about things in the >>> 3.15/3.16 timeframe. Unless you want this LDO3 fix not to go in ASAP. >>> >>> Joel, is there anything relevant *right now* blocking this patch going in? >>> If not, please test it and add your Tested-by: line. >>> >> >> We don't merge things in hurry and focus is to do things the right way.. I >> just >> want to make sure that all possible comments have been addressed. >> >> Otherwise patch looks OK and hope everyone else thinks so too. I am dealing >> with >> some merge conflicts right now with my series on top of this though, but they >> should be easy enough to fix up. That's delaying my testing, but otherwise as >> such I don't have any objection to this patch (provided the conclusion is >> that >> all comments have been addressed..). Thanks! > > Koen, > > One note though, since I don't use HDMI (or BBB much for that matter), I was > ok > with taking a risk of upping the ldo3 regulator voltage to 3.3v on my board > which I needed to do to get to the boot prompt. > I applied my AM335x DMA and MMC patches and tried to boot with rootfs as MMC1. > > With 1.8v, I get the following during boot: > [2.236043] mmc0: host doesn't support card's voltages > [2.241659] mmc0: error -22 whilst initialising SD card > > That's strange because I do have an SDHC card. With 3.3v it works fine. > > I will add a note about this to my series. Since this more of an MMC issue > than > anything, and your patch series doesn't enable MMC, you can add my tested-by: > > Tested-by: Joel Fernandes > > Later on, the regulator voltage may need to be tweaked for MMC support. 
See https://lkml.org/lkml/2013/9/6/95 and https://lkml.org/lkml/2013/9/6/183
Re: [PATCH] intel-iommu: Quiesce devices before disabling IOMMU
(2013/09/09 18:07), David Woodhouse wrote:
> On Wed, 2013-08-21 at 16:15 +0900, Takao Indoh wrote:
>>
>> This causes problem on kdump. Devices are working in first kernel, and
>> after switching to second kernel and initializing IOMMU, many DMAR faults
>> occur and it causes problems like driver error or PCI SERR, at last
>> kdump fails. This patch fixes this problem.
>
> I'm not sure I'd call this a fix.
>
> If the driver is so broken that it cannot get the device working again
> after a fault, surely the driver needs to be fixed?

Yes, this problem may be solved by fixing the driver. In fact, the megaraid sas driver was recently fixed for this problem (see commit 6431f5d7). But I think the root cause of this problem is that the IOMMU is initialized while DMA is still in progress, and I want to solve the root cause rather than handle it in each driver; otherwise we have to fix a driver each time we find this kind of problem.

>
> If the system is suffering an IRQ storm because device doesn't give up
> after the first few faults, then we should switch off the fault
> *reporting* for that device so that its faults get ignored (until it
> next actually sets up a DMA mapping, or something).

In such a case, yes, limiting messages is enough.

>
> For the IOMMU code to reset individual devices, just because they still
> have an active DMA mapping even if they're not *doing* DMA, seems wrong.
> You'll even end up resetting devices just because they have an RMRR,
> won't you? (Although I wouldn't lose any sleep over that, I suppose. In
> fact it might be a *feature*... :)

Right, the current code resets devices which *may* be doing DMA. The ideal way would be to find the devices which are actually doing DMA and reset only them, but I don't know how we can do this; I think the current code is sufficient.
Thanks,
Takao Indoh
Re: [REPOST PATCH 3/4] slab: introduce byte sized index for the freelist of a slab
On Mon, Sep 09, 2013 at 02:44:03PM +, Christoph Lameter wrote:
> On Mon, 9 Sep 2013, Joonsoo Kim wrote:
>
> > 32 byte is not minimum object size, minimum *kmalloc* object size
> > in default configuration. There are some slabs that their object size is
> > less than 32 byte. If we have a 8 byte sized kmem_cache, it has 512 objects
> > in 4K page.
>
> As far as I can recall only SLUB supports 8 byte objects. SLABs minimum
> has always been 32 bytes.

No. There are many slabs whose object size is less than 32 bytes. And I can also create an 8 byte sized slab in my kernel with SLAB.

js1304@js1304-P5Q-DELUXE:~/Projects/remote_git/linux$ sudo cat /proc/slabinfo | awk '{if($4 < 32) print $0}'
slabinfo - version: 2.1
ecryptfs_file_cache        0      0     16  240    1 : tunables  120   60    8 : slabdata      0      0      0
jbd2_revoke_table_s        2    240     16  240    1 : tunables  120   60    8 : slabdata      1      1      0
journal_handle             0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
revoke_table               0      0     16  240    1 : tunables  120   60    8 : slabdata      0      0      0
scsi_data_buffer           0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
fsnotify_event_holder      0      0     24  163    1 : tunables  120   60    8 : slabdata      0      0      0
numa_policy                3    163     24  163    1 : tunables  120   60    8 : slabdata      1      1      0

>
> > Moreover, we can configure slab_max_order in boot time so that we can't know
> > how many object are in a certain slab in compile time. Therefore we can't
> > decide the size of the index in compile time.
>
> You can ignore the slab_max_order if necessary.
>
> > I think that byte and short int sized index support would be enough, but
> > it should be determined at runtime.
>
> On x86 f.e. it would add useless branching. The branches are never taken.
> You only need these if you do bad things to the system like requiring
> large contiguous allocs.

As I said before, since a module loaded at runtime may use an 8 byte sized slab, we can't determine the index size at compile time. Otherwise we would always have to use a short int sized index, and I think that is worse than adding a branch.
Thanks.
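The arithmetic behind the argument above can be written down as a small userspace C sketch. It is not code from the patch: objs_per_slab() and freelist_index_bytes() are hypothetical helpers, per-slab management overhead is ignored (which is why the slabinfo output shows 240 rather than 256 objects for 16 byte caches), and the sketch assumes indices run from 0 to N-1 with no value reserved as a sentinel.

```c
#include <stddef.h>

/* Objects that fit in one slab, ignoring per-slab management overhead. */
static size_t objs_per_slab(size_t slab_bytes, size_t obj_size)
{
	return slab_bytes / obj_size;
}

/*
 * Smallest index width, in bytes, that can number every object in a slab:
 * 1 byte covers up to 256 objects, 2 bytes cover up to 65536.
 * Returns 0 if neither width is enough.
 */
static size_t freelist_index_bytes(size_t nr_objs)
{
	if (nr_objs <= 256)
		return 1;
	if (nr_objs <= 65536)
		return 2;
	return 0;
}
```

For a 4K slab, 32 byte objects give 128 entries, so a byte sized index would do, while 8 byte objects give 512 entries and need the short int sized index. Since the object size of a cache created by a module is only known when the cache is created, the width has to be chosen at that point rather than at compile time.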
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On 09/10/2013 01:22 PM, Igor Gnatenko wrote: > On Tue, 2013-09-10 at 13:16 +0800, Aaron Lu wrote: >> On 09/10/2013 01:13 PM, Igor Gnatenko wrote: >>> On Tue, 2013-09-10 at 11:27 +0800, Aaron Lu wrote: On 09/09/2013 07:44 PM, Igor Gnatenko wrote: > On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote: >> diff --git a/drivers/gpu/drm/i915/i915_dma.c >> b/drivers/gpu/drm/i915/i915_dma.c >> index f466980..75fba17 100644 >> --- a/drivers/gpu/drm/i915/i915_dma.c >> +++ b/drivers/gpu/drm/i915/i915_dma.c >> @@ -1650,7 +1650,7 @@ int i915_driver_load(struct drm_device *dev, >> unsigned long flags) >> if (INTEL_INFO(dev)->num_pipes) { >> /* Must be done after probing outputs */ >> intel_opregion_init(dev); >> -acpi_video_register(); >> +__acpi_video_register(i915_take_over_backlight); >> } >> >> if (IS_GEN5(dev)) > > I can't compile: > > > DEBUG: drivers/gpu/drm/i915/i915_dma.c: In function 'i915_driver_load': > DEBUG: drivers/gpu/drm/i915/i915_dma.c:1661:3: error: implicit > declaration of function > '__acpi_video_register' [-Werror=implicit-function-declaration] > DEBUG:__acpi_video_register(i915_take_over_backlight); > DEBUG:^ > DEBUG: cc1: some warnings being treated as errors > DEBUG: make[4]: *** [drivers/gpu/drm/i915/i915_dma.o] Error 1 > DEBUG: make[3]: *** [drivers/gpu/drm/i915] Error 2 > DEBUG: make[2]: *** [drivers/gpu/drm] Error 2 > DEBUG: make[1]: *** [drivers/gpu] Error 2 > DEBUG: make: *** [drivers] Error 2 > The two patches are based on top of Rafael's linux-next tree. I just tried it again, no compile problem for me. I also tried on today Linus' master tree, as there are some updates from i915, two conflicts exist. I've just resolved them and will update it in next revision. If you want to try it now, please use: https://github.com/aaronlu/linux acpi_video_rework Thanks, Aaron >>> >>> Thanks. this patch fixes my problems w/ compilation. 
I've tested this
>>> two patches and after apply I have:
>>> $ tree /sys/class/backlight/
>>> /sys/class/backlight/
>>> |-- acpi_video0
>>> -> ../../devices/pci:00/:00:02.0/backlight/acpi_video0
>>> `-- intel_backlight
>>> -> ../../devices/pci:00/:00:02.0/drm/card0/card0-LVDS-1/intel_backlight
>>>
>>> 2 directories, 0 files
>>>
>>> I think it's didn't unregistered.. I may forget. I need to apply one of
>>> patch from Matthew ?
>>
>> You need to specify i915.take_over_backlight=1 in kernel cmdline, that
>> module option is set to false by default for now.
>>
>> Thanks for the test.
>>
>> -Aaron
>>
>>>
>>> Some strings from logs:
>>> DMI: LENOVO 23205NG/23205NG, BIOS G2ET92WW (2.52 ) 02/22/2013
>>> thinkpad_acpi: Standard ACPI backlight interface available, not loading
>>> native one
>>>
>>
>
> Thanks for quick answer. Yes. This option do unregister. Thanks. but for
> this patch-set I also need "[PATCH 2/3] ACPI / video: Always call
> acpi_video_init_brightness() on init" from Matthew (for notifications in
> DE).

That patch was reverted because it causes problems on other systems:
https://bugs.freedesktop.org/show_bug.cgi?id=68355

OTOH, the thinkpad-acpi module already has a call to _BCL, except that tpacpi_acpi_handle_locate failed to locate the video controller's handle:
https://bugzilla.kernel.org/show_bug.cgi?id=51231#c121

I'll see if I can figure out why.

Thanks,
Aaron
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/10/2013 12:50 PM, Guenter Roeck wrote: > On 09/09/2013 09:05 PM, Wei Ni wrote: >> On 09/10/2013 04:39 AM, Mark Brown wrote: >>> * PGP Signed by an unknown key >>> >>> On Mon, Sep 09, 2013 at 09:17:35AM -0700, Guenter Roeck wrote: On Mon, Sep 09, 2013 at 05:02:37PM +0100, Mark Brown wrote: >>> > It does, though it gets complicated trying to use it for a case like > this since you can't really tell if the regulator was powered on > immediately before the device got probed by another device on the bus. >>> Why not ? Just keep a timestamp. >>> >>> The support is a callback on state changes; we could keep a timestamp >>> but there's still going to be race conditions around bootloaders. It's >>> doable though. >>> >> On a higher level, I wonder if such functionality should be added in the >> i2c >> subsystem and not in i2c client drivers. Has anyone thought about this ? >>> > I'm not sure what the subsystem would do for such delays? It's fairly > common for things that need this to also want to do things like > manipulate GPIOs as part of the power on sequence so the applicability > is relatively limited, plus it's not even I2C specific, the same applies > to other buses so it ought to be a driver core thing. >>> Possibly. I just thought about i2c since it also takes care of basic devicetree bindings. Something along the line of if devicetree bindings for this device declare one or more regulators, enable those regulators before calling the driver probe function. >>> >>> That's definitely a driver core thing, not I2C - there's nothing >>> specific to I2C in there at all, needing power is pretty generic. I >>> have considered this before, something along the lines of what we have >>> for pinctrl, but unfortunately the generic case isn't quite generic >>> enough to make it easy. 
It'd need to be an explicit list of regulators >>> (partly just to make it opt in and avoid breaking things) and you'd want >>> to have a way of handling the different suspend/resume behaviour that >>> devices want. There's a few patterns there. >>> >>> It's definitely something I think about from time to time and it would >>> be useful to factor things out, the issue is getting a good enough model >>> of what's going on. >>> > There was some work on a generic helper for power on sequences but it > stalled since it wasn't accepted for the original purpose (LCD panel > power ons IIRC). >>> Too bad. I think it could be kept quite simple, though, by handling it through the regulator subsystem as suggested above. A generic binding for a per-regulator and per-device poweron delay should solve that and possibly even make it transparent to the actual driver code. >>> >>> Lots of things have a GPIO for reset too, and some want clocks too. For >>> maximum usefulness this should be cross subsystem. I suspect the reset >>> controller API may be able to handle some of it. >>> >>> The regulator power on delays are already handled transparently, by the >>> time regulator_enable() returns the ramp should be finished. >> >> I think the regulator should encoded its own startup delay. Each >> individual device should handle its own requirements for delay after >> power is stable. >> The regulator_enable() will handle the delays for the regulator device. >> And adding the msleep(25) is for lm90 device. If without delay, >> sometimes the device can't work properly. If read lm90 register >> immediately after enabling regulator, the reading may be failed. >> I'm not sure if 25ms is the right value, I read the LM90 SPEC, the max >> of "SMBus Clock Low Time" is 25ms, so I supposed that it may need about >> 25ms to stable after power on. >> > > Problem is that you are always waiting, even if the same regulator was > turned on already, and even if it is a dummy regulator. 
>
> Imagine every driver doing that. Booting would take forever, just because of
> unnecessary delays all over the place. There has to be a better solution
> which does not include a mandatory and potentially unnecessary wait time
> in the driver. At a previous company we had a design with literally dozens
> of those chip. You really want to force such a boot delay on every user ?
>
> But essentially you don't even know if it is needed; you are just guessing.
> That is not an acceptable reason to add such a delay, mandatory or not.

I think the device needs time to stabilize after power on, but it's difficult to get an exact delay value, and this delay may also depend on the platform design, so how about adding an optional property in the DT node, such as "power-on-delay-ms" ?

>
> Guenter
>
Re: Subject: [PATCH] md: avoid deadlock when raid5 array has unack badblocks during md_stop_writes.
On Tue, 10 Sep 2013 13:00:52 +0800 y b wrote:
> When raid5 hit a fresh badblock, this badblock will flagged as unack
> badblock until md_update_sb is called.
> But md_stop/reboot/md_set_readonly will avoid raid5d call md_update_sb
> in md_check_recovery, the badblock will always be unack, so raid5d
> thread enter a infinite loop and never can unregister sync_thread
> that cause deadlock.
>
> To solve this, before md_stop_writes call md_unregister_thread, set
> MD_STOPPING_WRITES on mddev->flags. In raid5.c analyse_stripe judge
> MD_STOPPING_WRITES bit on mddev->flags, if setted don't block rdev
> to wait md_update_sb. so raid5d thread can be finished.
> Signed-off-by: Bian Yu

Have you actually seen this deadlock happen? Because I don't think it can happen. By the time we get to md_stop or md_set_readonly all dirty buffers should have been flushed and there should be no pending writes, so nothing should be waiting for an unacked bad block.

If you have seen this happen, any details you can give about the exact state of the RAID5 when it deadlocked, the stack trace of any relevant processes etc would be very helpful.

Thanks,
NeilBrown

> ---
>  drivers/md/md.c    |    2 ++
>  drivers/md/md.h    |    3 +++
>  drivers/md/raid5.c |    3 ++-
>  3 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index adf4d7e..54ef71f 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5278,6 +5278,7 @@ static void md_clean(struct mddev *mddev)
>  static void __md_stop_writes(struct mddev *mddev)
>  {
>  	set_bit(MD_RECOVERY_FROZEN, >recovery);
> +	set_bit(MD_STOPPING_WRITES, >flags);
>  	if (mddev->sync_thread) {
>  		set_bit(MD_RECOVERY_INTR, >recovery);
>  		md_reap_sync_thread(mddev);
> @@ -5294,6 +5295,7 @@ static void __md_stop_writes(struct mddev *mddev)
>  		mddev->in_sync = 1;
>  		md_update_sb(mddev, 1);
>  	}
> +	clear_bit(MD_STOPPING_WRITES, >flags);
>  }
>
>  void md_stop_writes(struct mddev *mddev)
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 608050c..c998b82 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -214,6 +214,9 @@ struct mddev {
>  #define MD_STILL_CLOSED	4	/* If set, then array has not been opened since
>  				 * md_ioctl checked on it.
>  				 */
> +#define MD_STOPPING_WRITES 5	/* If set, raid5 shouldn't set unacknowledged
> +				 * badblock blocked in analyse_stripe to avoid infinite loop
> +				 */
>
>  	int		suspended;
>  	atomic_t	active_io;
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index f9972e2..ff1aecf 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3446,7 +3446,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
>  		if (rdev) {
>  			is_bad = is_badblock(rdev, sh->sector, STRIPE_SECTORS,
>  					     _bad, _sectors);
> -			if (s->blocked_rdev == NULL
> +			if (!test_bit(MD_STOPPING_WRITES, >mddev->flags)
> +			    && s->blocked_rdev == NULL
>  			    && (test_bit(Blocked, >flags)
>  				|| is_bad < 0)) {
>  				if (is_bad < 0)
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote:
> According to Matthew Garrett, "Windows 8 leaves backlight control up
> to individual graphics drivers rather than making ACPI calls itself.
> There's plenty of evidence to suggest that the Intel driver for
> Windows [8] doesn't use the ACPI interface, including the fact that
> it's broken on a bunch of machines when the OS claims to support
> Windows 8. The simplest thing to do appears to be to disable the
> ACPI backlight interface on these systems".
>
> There's a problem with that approach, however, because simply
> avoiding to register the ACPI backlight interface if the firmware
> calls _OSI for Windows 8 may not work in the following situations:
> (1) The ACPI backlight interface actually works on the given system
>     and the i915 driver is not loaded (e.g. another graphics driver
>     is used).
> (2) The ACPI backlight interface doesn't work on the given system,
>     but there is a vendor platform driver that will register its
>     own, equally broken, backlight interface if not prevented from
>     doing so by the ACPI subsystem.
> Therefore we need to allow the ACPI backlight interface to be
> registered until the i915 driver is loaded which then will unregister
> it if the firmware has called _OSI for Windows 8 (or will register
> the ACPI video driver without backlight support if not already
> present).
>
> For this reason, introduce an alternative function for registering
> ACPI video, __acpi_video_register(bool), that if true is passed,
> will check whether or not the ACPI video driver has already been
> registered and whether or not the backlight Windows 8 quirk has to
> be applied. If the quirk has to be applied, it will block the ACPI
> backlight support and either unregister the backlight interface if
> the ACPI video driver has already been registered, or register the
> ACPI video driver without the backlight interface otherwise. Make
> the i915 driver use __acpi_video_register() instead of
> acpi_video_register() in i915_driver_load(), and the param passed
> there is controlled by the i915 module level parameter
> i915_take_over_backlight, which is set to false by default.
>
> This change is evolved from earlier patches of Matthew Garrett,
> Chun-Yi Lee and Seth Forshee and is heavily based on two patches
> from Rafael:
> https://lkml.org/lkml/2013/7/17/720
> https://lkml.org/lkml/2013/7/24/806
>
> Signed-off-by: Aaron Lu

Tested-by: Igor Gnatenko

> ---
>  drivers/acpi/internal.h         |  2 ++
>  drivers/acpi/video.c            | 24
>  drivers/acpi/video_detect.c     | 15 ++-
>  drivers/gpu/drm/i915/i915_dma.c |  2 +-
>  drivers/gpu/drm/i915/i915_drv.c |  5 +
>  drivers/gpu/drm/i915/i915_drv.h |  1 +
>  include/acpi/video.h            |  9 +++--
>  include/linux/acpi.h            |  1 +
>  8 files changed, 47 insertions(+), 12 deletions(-)

-- 
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.0-3.fc20.x86_64

-- 
Igor Gnatenko
Fedora release 20 (Heisenbug)
Linux 3.11.0-1.fc20.x86_64
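For readers following along, the registration logic described in the quoted changelog can be modelled roughly as below. This is a userspace sketch written only from that description, not the code in the patch: struct video_state, model_video_register() and the firmware_wants_win8 flag (standing in for "the firmware has called _OSI for Windows 8") are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical snapshot of what the ACPI video side has registered. */
struct video_state {
	bool driver_registered;		/* ACPI video driver present */
	bool backlight_registered;	/* ACPI backlight interface present */
	bool backlight_blocked;		/* ACPI backlight support disabled */
};

/*
 * Model of the behaviour the changelog describes for
 * __acpi_video_register(bool): the quirk only applies when the caller asks
 * to take over the backlight AND the firmware expects Windows 8.
 */
static void model_video_register(struct video_state *st, bool take_over,
				 bool firmware_wants_win8)
{
	bool quirk = take_over && firmware_wants_win8;

	if (quirk)
		st->backlight_blocked = true;

	if (st->driver_registered) {
		/* Already registered: only drop the backlight if the quirk applies. */
		if (quirk)
			st->backlight_registered = false;
		return;
	}

	/* Not registered yet: register, with backlight unless it is blocked. */
	st->driver_registered = true;
	st->backlight_registered = !st->backlight_blocked;
}
```

With take_over left at false, which the changelog says is the default for i915_take_over_backlight, the model keeps the ACPI backlight registered. That would be consistent with both acpi_video0 and intel_backlight showing up under /sys/class/backlight until i915.take_over_backlight=1 is passed.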
Re: [PATCH 1/2] ACPI / video: seperate backlight control and event interface
On Mon, 2013-09-09 at 16:40 +0800, Aaron Lu wrote: > The backlight control and event delivery functionality provided by ACPI > video module is mixed together and registered all during video device > enumeration time. As a result, the two functionality are also removed > together on module unload time or by the acpi_video_unregister function. > The two functionalities are actually independent and one may be useful > while the other one may be broken, so it is desirable to seperate the > two functionalities such that it is clear and easy to disable one > functionality without affecting the other one. This patch does the > seperation and as a result, a new video_bus_head list is introduced to > store all registered video bus structure and a new function > acpi_video_unregister_backlight is introduced to unregister backlight > interfaces for all video devices belonging to stored video buses. > > Currently, there is no need to unregister ACPI video's event delivery > functionality alone so the function acpi_video_remove_notify_handler is > not introduced, it can be easily added when needed. > > Signed-off-by: Aaron Lu Tested-by: Igor Gnatenko > --- > drivers/acpi/video.c | 451 > ++- > include/acpi/video.h | 2 + > 2 files changed, 264 insertions(+), 189 deletions(-) -- Igor Gnatenko Fedora release 20 (Heisenbug) Linux 3.11.0-3.fc20.x86_64 -- Igor Gnatenko Fedora release 20 (Heisenbug) Linux 3.11.0-1.fc20.x86_64 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Tue, 2013-09-10 at 13:16 +0800, Aaron Lu wrote: > On 09/10/2013 01:13 PM, Igor Gnatenko wrote: > > On Tue, 2013-09-10 at 11:27 +0800, Aaron Lu wrote: > >> On 09/09/2013 07:44 PM, Igor Gnatenko wrote: > >>> On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote: > diff --git a/drivers/gpu/drm/i915/i915_dma.c > b/drivers/gpu/drm/i915/i915_dma.c > index f466980..75fba17 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1650,7 +1650,7 @@ int i915_driver_load(struct drm_device *dev, > unsigned long flags) > if (INTEL_INFO(dev)->num_pipes) { > /* Must be done after probing outputs */ > intel_opregion_init(dev); > -acpi_video_register(); > +__acpi_video_register(i915_take_over_backlight); > } > > if (IS_GEN5(dev)) > >>> > >>> I can't compile: > >>> > >>> > >>> DEBUG: drivers/gpu/drm/i915/i915_dma.c: In function 'i915_driver_load': > >>> DEBUG: drivers/gpu/drm/i915/i915_dma.c:1661:3: error: implicit > >>> declaration of function > >>> '__acpi_video_register' [-Werror=implicit-function-declaration] > >>> DEBUG:__acpi_video_register(i915_take_over_backlight); > >>> DEBUG:^ > >>> DEBUG: cc1: some warnings being treated as errors > >>> DEBUG: make[4]: *** [drivers/gpu/drm/i915/i915_dma.o] Error 1 > >>> DEBUG: make[3]: *** [drivers/gpu/drm/i915] Error 2 > >>> DEBUG: make[2]: *** [drivers/gpu/drm] Error 2 > >>> DEBUG: make[1]: *** [drivers/gpu] Error 2 > >>> DEBUG: make: *** [drivers] Error 2 > >>> > >> > >> The two patches are based on top of Rafael's linux-next tree. I just > >> tried it again, no compile problem for me. I also tried on today Linus' > >> master tree, as there are some updates from i915, two conflicts exist. > >> I've just resolved them and will update it in next revision. > >> If you want to try it now, please use: > >> https://github.com/aaronlu/linux acpi_video_rework > >> > >> Thanks, > >> Aaron > > > > Thanks. this patch fixes my problems w/ compilation. 
I've tested this > > two patches and after apply I have: > > $ tree /sys/class/backlight/ > > /sys/class/backlight/ > > |-- acpi_video0 > > -> ../../devices/pci:00/:00:02.0/backlight/acpi_video0 > > `-- intel_backlight > > -> > > ../../devices/pci:00/:00:02.0/drm/card0/card0-LVDS-1/intel_backlight > > > > 2 directories, 0 files > > > > I think it's didn't unregistered.. I may forget. I need to apply one of > > patch from Matthew ? > > You need to specify i915.take_over_backlight=1 in kernel cmdline, that > module option is set to false by default for now. > > Thanks for the test. > > -Aaron > > > > > Some strings from logs: > > DMI: LENOVO 23205NG/23205NG, BIOS G2ET92WW (2.52 ) 02/22/2013 > > thinkpad_acpi: Standard ACPI backlight interface available, not loading > > native one > > > Thanks for quick answer. Yes. This option do unregister. Thanks. but for this patch-set I also need "[PATCH 2/3] ACPI / video: Always call acpi_video_init_brightness() on init" from Matthew (for notifications in DE). -- Igor Gnatenko Fedora release 20 (Heisenbug) Linux 3.11.0-1.fc20.x86_64 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/3] staging: usbip: stub_main: correctly handle return value
ret == 0 means success, anything else is failure.

Signed-off-by: navin patidar
---
 drivers/staging/usbip/stub_main.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/usbip/stub_main.c b/drivers/staging/usbip/stub_main.c
index 33027cc..baf857f 100644
--- a/drivers/staging/usbip/stub_main.c
+++ b/drivers/staging/usbip/stub_main.c
@@ -255,14 +255,14 @@ static int __init usbip_host_init(void)
 	}
 
 	ret = usb_register(_driver);
-	if (ret < 0) {
+	if (ret) {
 		pr_err("usb_register failed %d\n", ret);
 		goto err_usb_register;
 	}
 
 	ret = driver_create_file(_driver.drvwrap.driver,
 				 _attr_match_busid);
-	if (ret < 0) {
+	if (ret) {
 		pr_err("driver_create_file failed\n");
 		goto err_create_file;
 	}
-- 
1.7.10.4
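[Archive note] The convention stated in the commit message (0 is success, anything else is failure) is easy to illustrate outside the kernel. The two helpers below are hypothetical and exist only to contrast the old and new checks; nothing here claims that usb_register() or driver_create_file() actually return positive values.

```c
#include <stdbool.h>

/* Old check from the patch context: only negative values count as failure. */
bool failed_old_check(int ret)
{
    return ret < 0;
}

/* New check, equivalent to `if (ret)`: any non-zero value is a failure. */
bool failed_new_check(int ret)
{
    return ret != 0;
}
```

The two checks agree for 0 and for negative errno-style values; they differ only if a callee ever returns a positive non-zero value, which the old check would have silently treated as success.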
[PATCH 2/3] staging: usbip: vhci_hcd: correctly handle return value
ret == 0 means success, anything else is failure.

Signed-off-by: navin patidar
---
 drivers/staging/usbip/vhci_hcd.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/usbip/vhci_hcd.c b/drivers/staging/usbip/vhci_hcd.c
index d7974cb..b3c9217 100644
--- a/drivers/staging/usbip/vhci_hcd.c
+++ b/drivers/staging/usbip/vhci_hcd.c
@@ -1146,11 +1146,11 @@ static int __init vhci_hcd_init(void)
 		return -ENODEV;
 
 	ret = platform_driver_register(_driver);
-	if (ret < 0)
+	if (ret)
 		goto err_driver_register;
 
 	ret = platform_device_register(_pdev);
-	if (ret < 0)
+	if (ret)
 		goto err_platform_device_register;
 
 	pr_info(DRIVER_DESC " v" USBIP_VERSION "\n");
-- 
1.7.10.4
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On 09/10/2013 01:13 PM, Igor Gnatenko wrote: > On Tue, 2013-09-10 at 11:27 +0800, Aaron Lu wrote: >> On 09/09/2013 07:44 PM, Igor Gnatenko wrote: >>> On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote: diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index f466980..75fba17 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1650,7 +1650,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) if (INTEL_INFO(dev)->num_pipes) { /* Must be done after probing outputs */ intel_opregion_init(dev); - acpi_video_register(); + __acpi_video_register(i915_take_over_backlight); } if (IS_GEN5(dev)) >>> >>> I can't compile: >>> >>> >>> DEBUG: drivers/gpu/drm/i915/i915_dma.c: In function 'i915_driver_load': >>> DEBUG: drivers/gpu/drm/i915/i915_dma.c:1661:3: error: implicit >>> declaration of function >>> '__acpi_video_register' [-Werror=implicit-function-declaration] >>> DEBUG:__acpi_video_register(i915_take_over_backlight); >>> DEBUG:^ >>> DEBUG: cc1: some warnings being treated as errors >>> DEBUG: make[4]: *** [drivers/gpu/drm/i915/i915_dma.o] Error 1 >>> DEBUG: make[3]: *** [drivers/gpu/drm/i915] Error 2 >>> DEBUG: make[2]: *** [drivers/gpu/drm] Error 2 >>> DEBUG: make[1]: *** [drivers/gpu] Error 2 >>> DEBUG: make: *** [drivers] Error 2 >>> >> >> The two patches are based on top of Rafael's linux-next tree. I just >> tried it again, no compile problem for me. I also tried on today Linus' >> master tree, as there are some updates from i915, two conflicts exist. >> I've just resolved them and will update it in next revision. >> If you want to try it now, please use: >> https://github.com/aaronlu/linux acpi_video_rework >> >> Thanks, >> Aaron > > Thanks. this patch fixes my problems w/ compilation. 
I've tested this > two patches and after apply I have: > $ tree /sys/class/backlight/ > /sys/class/backlight/ > |-- acpi_video0 > -> ../../devices/pci:00/:00:02.0/backlight/acpi_video0 > `-- intel_backlight > -> > ../../devices/pci:00/:00:02.0/drm/card0/card0-LVDS-1/intel_backlight > > 2 directories, 0 files > > I think it's didn't unregistered.. I may forget. I need to apply one of > patch from Matthew ? You need to specify i915.take_over_backlight=1 in kernel cmdline, that module option is set to false by default for now. Thanks for the test. -Aaron > > Some strings from logs: > DMI: LENOVO 23205NG/23205NG, BIOS G2ET92WW (2.52 ) 02/22/2013 > thinkpad_acpi: Standard ACPI backlight interface available, not loading > native one > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Tue, 2013-09-10 at 11:27 +0800, Aaron Lu wrote: > On 09/09/2013 07:44 PM, Igor Gnatenko wrote: > > On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote: > >> diff --git a/drivers/gpu/drm/i915/i915_dma.c > >> b/drivers/gpu/drm/i915/i915_dma.c > >> index f466980..75fba17 100644 > >> --- a/drivers/gpu/drm/i915/i915_dma.c > >> +++ b/drivers/gpu/drm/i915/i915_dma.c > >> @@ -1650,7 +1650,7 @@ int i915_driver_load(struct drm_device *dev, > >> unsigned long flags) > >>if (INTEL_INFO(dev)->num_pipes) { > >>/* Must be done after probing outputs */ > >>intel_opregion_init(dev); > >> - acpi_video_register(); > >> + __acpi_video_register(i915_take_over_backlight); > >>} > >> > >>if (IS_GEN5(dev)) > > > > I can't compile: > > > > > > DEBUG: drivers/gpu/drm/i915/i915_dma.c: In function 'i915_driver_load': > > DEBUG: drivers/gpu/drm/i915/i915_dma.c:1661:3: error: implicit > > declaration of function > > '__acpi_video_register' [-Werror=implicit-function-declaration] > > DEBUG:__acpi_video_register(i915_take_over_backlight); > > DEBUG:^ > > DEBUG: cc1: some warnings being treated as errors > > DEBUG: make[4]: *** [drivers/gpu/drm/i915/i915_dma.o] Error 1 > > DEBUG: make[3]: *** [drivers/gpu/drm/i915] Error 2 > > DEBUG: make[2]: *** [drivers/gpu/drm] Error 2 > > DEBUG: make[1]: *** [drivers/gpu] Error 2 > > DEBUG: make: *** [drivers] Error 2 > > > > The two patches are based on top of Rafael's linux-next tree. I just > tried it again, no compile problem for me. I also tried on today Linus' > master tree, as there are some updates from i915, two conflicts exist. > I've just resolved them and will update it in next revision. > If you want to try it now, please use: > https://github.com/aaronlu/linux acpi_video_rework > > Thanks, > Aaron Thanks. this patch fixes my problems w/ compilation. 
I've tested this two patches and after apply I have: $ tree /sys/class/backlight/ /sys/class/backlight/ |-- acpi_video0 -> ../../devices/pci:00/:00:02.0/backlight/acpi_video0 `-- intel_backlight -> ../../devices/pci:00/:00:02.0/drm/card0/card0-LVDS-1/intel_backlight 2 directories, 0 files I think it's didn't unregistered.. I may forget. I need to apply one of patch from Matthew ? Some strings from logs: DMI: LENOVO 23205NG/23205NG, BIOS G2ET92WW (2.52 ) 02/22/2013 thinkpad_acpi: Standard ACPI backlight interface available, not loading native one -- Igor Gnatenko Fedora release 20 (Heisenbug) Linux 3.11.0-1.fc20.x86_64 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/3] staging: usbip: vhci_hcd: remove check for dma
vhci_hcd is a virtual usb host controller, so no need to check for dma.

Signed-off-by: navin patidar
---
 drivers/staging/usbip/vhci_hcd.c |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/staging/usbip/vhci_hcd.c b/drivers/staging/usbip/vhci_hcd.c
index b3c9217..e810ad5 100644
--- a/drivers/staging/usbip/vhci_hcd.c
+++ b/drivers/staging/usbip/vhci_hcd.c
@@ -999,12 +999,6 @@ static int vhci_hcd_probe(struct platform_device *pdev)
 
 	usbip_dbg_vhci_hc("name %s id %d\n", pdev->name, pdev->id);
 
-	/* will be removed */
-	if (pdev->dev.dma_mask) {
-		dev_info(>dev, "vhci_hcd DMA not supported\n");
-		return -EINVAL;
-	}
-
 	/*
 	 * Allocate and initialize hcd.
 	 * Our private data is also allocated automatically.
-- 
1.7.10.4
linux-next: Tree for Sep 10
Hi all, Please do not add any code for v3.13 to your linux-next included branches until after v3.12-rc1 is released. Changes since 20130909: The vfs tree lost its build failure. The dmaengine tree gained a conflict against the slave-dma tree. The akpm tree gained conflicts against the vfs and Linus' trees. I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" as mentioned in the FAQ on the wiki (see below). You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the final fixups (if any), it is also built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc, sparc64 and arm defconfig. These builds also have CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and CONFIG_DEBUG_INFO disabled when necessary. Below is a summary of the state of the merge. We are up to 222 trees (counting Linus' and 30 trees of patches pending for Linus' tree), more are welcome (even if they are currently empty). Thanks to those who have contributed, and to those who haven't, please do. Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. 
There is a wiki covering stuff to do with linux-next at http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel. -- Cheers, Stephen Rothwells...@canb.auug.org.au $ git checkout master $ git reset --hard stable Merging origin/master (6404141 Merge tag 'late-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc) Merging fixes/master (fa8218d Merge tag 'regmap-v3.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap) Merging kbuild-current/rc-fixes (ad81f05 Linux 3.11-rc1) Merging arc-current/for-curr (07b9b65 ARC: fix new Section mismatches in build (post __cpuinit cleanup)) Merging arm-current/fixes (e1f0203 Merge branch 'security-fixes' into fixes) Merging m68k-current/for-linus (5549005 m68k/atari: ARAnyM - Always use physical addresses in NatFeat calls) Merging metag-fixes/fixes (3b2f64d Linux 3.11-rc2) Merging powerpc-merge/merge (d220980 powerpc/hvsi: Increase handshake timeout from 200ms to 400ms.) Merging sparc/master (4de9ad9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile) Merging net/master (c19d65c bnx2x: Fix configuration of doorbell block) Merging ipsec/master (302a50b xfrm: Fix potential null pointer dereference in xdst_queue_output) Merging sound-current/for-linus (83f7215 ALSA: hda - Add Toshiba Satellite C870 to MSI blacklist) Merging pci-current/for-linus (a923874 Merge tag 'pci-v3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci) Merging wireless/master (f4e1a4d rt2800: change initialization sequence to fix system freeze) Merging driver-core.current/driver-core-linus (816434e Merge branch 'x86-spinlocks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip) Merging tty.current/tty-linus (c095ba7 Linux 3.11-rc4) Merging usb.current/usb-linus (d347404 USB: OHCI: fix build error related to ohci_suspend/resume) Merging staging.current/staging-linus (d8dfad3 Linux 3.11-rc7) Merging char-misc.current/char-misc-linus (b36f4be Linux 3.11-rc6) 
Merging input-current/for-linus (c7dc657 Input: evdev - add EVIOCREVOKE ioctl) Merging md-current/for-linus (f94c0b6 md/raid5: fix interaction of 'replace' and 'recovery'.) Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix) Merging crypto-current/master (77dbd7a crypto: api - Fix race condition in larval lookup) Merging ide/master (64110c1 ide: sgiioc4: Staticize ioc4_ide_attach_one()) Merging dwmw2/master (5950f08 pcmcia: remove RPX board stuff) Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions) Merging devicetree-current/devicetree/merge (cf9e236 of/irq: init struct resource to 0 in of_irq_to_resource()) Merging rr-fixes/fixes (6c2580c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32) Merging mfd-fixes/master (5649d8f mfd: ab8500-sysctrl: Let sysctrl driver work without pdata
[GIT PULL] slave-dmaengine update for 3.12
Hey Linus, Back from the longish weekend here, so time to send you the pull request. Okay we have fairly longish pull this time, and NO recent rebase which got you annoyed last time :( This pull brings: - Andy's DW driver updates - Guennadi's sh driver updates - Pl08x driver fixes from Tomasz & Alban - Improvements to mmp_pdma by Daniel - TI EDMA fixes by Joel - New drivers: - Hisilicon k3dma driver - Renesas rcar dma driver - New API for publishing slave driver capablities - Various fixes across the subsystem by Andy, Jingoo, Sachin etc... The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f: Linus Torvalds (1): Linux 3.11-rc4 are available in the git repository at: git://git.infradead.org/users/vkoul/slave-dma.git for-linus Alban Bedel (2): dmaengine: PL08x: Fix reading the byte count in cctl dmaengine: PL08x: Add cyclic transfer support Andy Shevchenko (20): imx-sdma: remove useless variable mxs-dma: remove useless variable edma: no need to assign residue to 0 explicitly ep93xx_dma: remove useless use of lock fsldma: remove useless use of lock mmp_pdma: remove useless use of lock mpc512x_dma: remove useless use of lock pch_dma: remove useless use of lock tegra20-apb-dma: remove useless use of lock ipu_idmac: re-use dma_cookie_status() mmp_tdma: set cookies as well when asked for tx status txx9dmac: return DMA_SUCCESS immediately from device_tx_status() acpi-dma: fix sparse warning dma: dw: append MODULE_DEVICE_TABLE for ACPI case dma: dw: improve comparison with ~0 dma: dw: allow shared interrupts dma: dw: return DMA_SUCCESS immediately from device_tx_status() dma: dw: return DMA_PAUSED only if cookie status is DMA_IN_PROGRESS acpi-dma, doc: append managed function to the list acpi-dma: remove ugly conversion Barry Song (1): dmaengine: sirf: add PM entries for sleep and runtime Ben Hutchings (1): pch_dma: Add MODULE_DEVICE_TABLE Chanho Park (1): dma: pl330: split off common code to give back descriptors Dan Carpenter (1): dmaengine: 
ste_dma40: off by one in d40_of_probe() Daniel Mack (13): dma: mmp_pdma: factor out DRCMR register calculation dma: mmp_pdma: refactor unlocking path in lookup_phy() dma: mmp_pdma: fix maximum transfer length dma: mmp_pdma: add filter function dma: mmp_pdma: make the controller a DMA provider dma: mmp_pdma: print the number of channels at probe time dma: mmp_pdma: remove duplicate assignment dma: mmp_pdma: add support for byte-aligned transfers dma: mmp_pdma: only complete one transaction from dma_do_tasklet() dma: mmp_pdma: don't clear DCMD_ENDIRQEN at end of pending chain dma: mmp_pdma: add support for cyclic DMA descriptors dma: mmp_pdma: set DMA_PRIVATE dma: dmagengine: fix function names in comments Fabio Estevam (2): dma: ste_dma: Fix warning when CONFIG_ARM_LPAE=y dma: imx-sdma: Staticize sdma_driver_data structures Guennadi Liakhovetski (11): DMA: shdma: fix CHCLR register address calculation DMA: shdma: switch all __iomem pointers to void DMA: shdma: support the new CHCLR register layout DMA: shdma: switch to managed resource allocation DMA: sudmac: fix compiler warning DMA: shdma: make a pointer const DMA: shdma: switch DT mode to use configuration data from a match table DMA: shdma: remove private and unused defines from a global header DMA: shdma: add a header with common for ARM SoCs defines DMA: shdma: add r8a73a4 DMAC data to the device ID table DMA: shdma: fix a bad merge - remove free_irq() Huang Shijie (1): dma: imx-sdma: remove the unused completion Jingoo Han (7): dma: mmp_pdma: Staticize mmp_pdma_alloc_descriptor() dma: mv_xor: use NULL instead of 0 dma: sirf: use NULL instead of 0 dma: use dev_get_platdata() dma: ipu: remove unnecessary platform_set_drvdata() dma: sh: remove unnecessary platform_set_drvdata() dma: k3dma: use devm_ioremap_resource() instead of devm_request_and_ioremap() Joel Fernandes (6): dma: edma: Setup parameters to DMA MAX_NR_SG at a time dma: edma: Write out and handle MAX_NR_SG at a given time ARM: edma: Add function to 
manually trigger an EDMA channel dma: edma: Find missed events and issue them dma: edma: Leave linked to Null slot instead of DUMMY slot dma: edma: Remove limits on number of slots Julia Lawall (2): dma: mmp: simplify use of devm_ioremap_resource dma: replace devm_request_and_ioremap by devm_ioremap_resource Lars-Peter Clausen (2): dma: pl330: Implement device_slave_caps dma: pl330: Fix handling of TERMINATE_ALL while processing completed descriptors Lothar Waßmann (1): dma: of: make error message more meaningful by adding
Subject: [PATCH] md: avoid deadlock when raid5 array has unack badblocks during md_stop_writes.
When raid5 hits a fresh badblock, the badblock is flagged as
unacknowledged until md_update_sb is called. But
md_stop/reboot/md_set_readonly prevent raid5d from calling md_update_sb
in md_check_recovery, so the badblock stays unacknowledged forever. The
raid5d thread then enters an infinite loop and the sync_thread can never
be unregistered, which causes a deadlock.

To solve this, set MD_STOPPING_WRITES in mddev->flags before
md_stop_writes calls md_unregister_thread. In raid5.c, analyse_stripe
tests the MD_STOPPING_WRITES bit in mddev->flags; if it is set, the rdev
is not blocked to wait for md_update_sb, so the raid5d thread can finish.

Signed-off-by: Bian Yu
---
 drivers/md/md.c    |    2 ++
 drivers/md/md.h    |    3 +++
 drivers/md/raid5.c |    3 ++-
 3 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index adf4d7e..54ef71f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5278,6 +5278,7 @@ static void md_clean(struct mddev *mddev)
 static void __md_stop_writes(struct mddev *mddev)
 {
 	set_bit(MD_RECOVERY_FROZEN, >recovery);
+	set_bit(MD_STOPPING_WRITES, >flags);
 	if (mddev->sync_thread) {
 		set_bit(MD_RECOVERY_INTR, >recovery);
 		md_reap_sync_thread(mddev);
@@ -5294,6 +5295,7 @@ static void __md_stop_writes(struct mddev *mddev)
 		mddev->in_sync = 1;
 		md_update_sb(mddev, 1);
 	}
+	clear_bit(MD_STOPPING_WRITES, >flags);
 }
 
 void md_stop_writes(struct mddev *mddev)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 608050c..c998b82 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -214,6 +214,9 @@ struct mddev {
 #define MD_STILL_CLOSED	4	/* If set, then array has not been opened since
 				 * md_ioctl checked on it.
 				 */
+#define MD_STOPPING_WRITES 5	/* If set, raid5 shouldn't set unacknowledged
+				 * badblock blocked in analyse_stripe to avoid infinite loop
+				 */
 
 	int				suspended;
 	atomic_t			active_io;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f9972e2..ff1aecf 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3446,7 +3446,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 		if (rdev) {
 			is_bad = is_badblock(rdev, sh->sector, STRIPE_SECTORS,
 					     _bad, _sectors);
-			if (s->blocked_rdev == NULL
+			if (!test_bit(MD_STOPPING_WRITES, >mddev->flags)
+			    && s->blocked_rdev == NULL
 			    && (test_bit(Blocked, >flags)
 				|| is_bad < 0)) {
 				if (is_bad < 0)
-- 
1.7.1
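[Archive note] The condition the patch adds to analyse_stripe() can be restated as a small user-space predicate. This is a sketch under assumptions: flag_is_set() is a stand-in for the kernel's test_bit(), should_block_rdev() and its boolean parameters are hypothetical names, and only the bit number 5 is taken from the patch.

```c
#include <stdbool.h>

#define MD_STOPPING_WRITES 5  /* bit number as defined in the patch */

/* User-space stand-in for test_bit(); not the kernel implementation. */
static bool flag_is_set(unsigned long flags, int bit)
{
    return (flags >> bit) & 1UL;
}

/*
 * Mirrors the patched condition: an rdev is only recorded as blocking
 * when writes are not being stopped, no blocking rdev has been recorded
 * yet, and the rdev is either Blocked or has an unacknowledged bad block
 * (is_bad < 0).
 */
bool should_block_rdev(unsigned long mddev_flags,
                       bool blocked_rdev_already_set,
                       bool rdev_blocked,
                       int is_bad)
{
    return !flag_is_set(mddev_flags, MD_STOPPING_WRITES)
        && !blocked_rdev_already_set
        && (rdev_blocked || is_bad < 0);
}
```

With the stopping flag set, an unacknowledged bad block no longer causes the stripe to wait on the rdev, which is the path the commit message says kept raid5d looping.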
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 09:05 PM, Wei Ni wrote: On 09/10/2013 04:39 AM, Mark Brown wrote: * PGP Signed by an unknown key On Mon, Sep 09, 2013 at 09:17:35AM -0700, Guenter Roeck wrote: On Mon, Sep 09, 2013 at 05:02:37PM +0100, Mark Brown wrote: It does, though it gets complicated trying to use it for a case like this since you can't really tell if the regulator was powered on immediately before the device got probed by another device on the bus. Why not ? Just keep a timestamp. The support is a callback on state changes; we could keep a timestamp but there's still going to be race conditions around bootloaders. It's doable though. On a higher level, I wonder if such functionality should be added in the i2c subsystem and not in i2c client drivers. Has anyone thought about this ? I'm not sure what the subsystem would do for such delays? It's fairly common for things that need this to also want to do things like manipulate GPIOs as part of the power on sequence so the applicability is relatively limited, plus it's not even I2C specific, the same applies to other buses so it ought to be a driver core thing. Possibly. I just thought about i2c since it also takes care of basic devicetree bindings. Something along the line of if devicetree bindings for this device declare one or more regulators, enable those regulators before calling the driver probe function. That's definitely a driver core thing, not I2C - there's nothing specific to I2C in there at all, needing power is pretty generic. I have considered this before, something along the lines of what we have for pinctrl, but unfortunately the generic case isn't quite generic enough to make it easy. It'd need to be an explicit list of regulators (partly just to make it opt in and avoid breaking things) and you'd want to have a way of handling the different suspend/resume behaviour that devices want. There's a few patterns there. 
It's definitely something I think about from time to time and it would be useful to factor things out, the issue is getting a good enough model of what's going on. There was some work on a generic helper for power on sequences but it stalled since it wasn't accepted for the original purpose (LCD panel power ons IIRC). Too bad. I think it could be kept quite simple, though, by handling it through the regulator subsystem as suggested above. A generic binding for a per-regulator and per-device poweron delay should solve that and possibly even make it transparent to the actual driver code. Lots of things have a GPIO for reset too, and some want clocks too. For maximum usefulness this should be cross subsystem. I suspect the reset controller API may be able to handle some of it. The regulator power on delays are already handled transparently, by the time regulator_enable() returns the ramp should be finished. I think the regulator should encoded its own startup delay. Each individual device should handle its own requirements for delay after power is stable. The regulator_enable() will handle the delays for the regulator device. And adding the msleep(25) is for lm90 device. If without delay, sometimes the device can't work properly. If read lm90 register immediately after enabling regulator, the reading may be failed. I'm not sure if 25ms is the right value, I read the LM90 SPEC, the max of "SMBus Clock Low Time" is 25ms, so I supposed that it may need about 25ms to stable after power on. Problem is that you are always waiting, even if the same regulator was turned on already, and even if it is a dummy regulator. Imagine every driver doing that. Booting would take forever, just because of unnecessary delays all over the place. There has to be a better solution which does not include a mandatory and potentially unnecessary wait time in the driver. At a previous company we had a design with literally dozens of those chip. 
You really want to force such a boot delay on every user ? But essentially you don't even know if it is needed; you are just guessing. That is not an acceptable reason to add such a delay, mandatory or not. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
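[Archive note] The "just keep a timestamp" idea raised earlier in this message can be sketched as a helper that computes how much of a settle time is still outstanding. Everything here is hypothetical: remaining_settle_ms() is not a regulator or hwmon API, the millisecond clock values are supplied by the caller, and the 25 ms figure is only the value discussed in the thread, not a verified requirement.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the current time, the time the supply was
 * switched on, and the settle time the device needs, return how long the
 * driver would still have to wait. Returns 0 if the supply has been on
 * long enough already.
 */
uint64_t remaining_settle_ms(uint64_t now_ms, uint64_t enabled_at_ms,
                             uint64_t settle_ms)
{
    /* Treat a timestamp in the future as "just enabled". */
    uint64_t elapsed = now_ms >= enabled_at_ms ? now_ms - enabled_at_ms : 0;

    return elapsed >= settle_ms ? 0 : settle_ms - elapsed;
}
```

A driver using something like this would only sleep for the returned value, so a supply that was already on well before probe would cost no boot delay. As noted in the thread, a supply left on by the bootloader has no recorded timestamp, so that race remains.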
Re: [performance regression, bisected] scheduler: should_we_balance() kills filesystem performance
On Tue, Sep 10, 2013 at 02:02:54PM +1000, Dave Chinner wrote:
> Hi folks,
>
> I just updated my performance test VM to the current 3.12-git
> tree after the XFS dev branch was merged. The first test I ran
> which was a 16-way concurrent fsmark test to create lots of files
> gave me a number about 30% lower than I expected - ~180k files/s
> when I was expecting somewhere around 250k files/s.
>
> I did a bisect, and the bisect landed on this commit:
>
> commit 23f0d2093c789e612185180c468fa09063834e87
> Author: Joonsoo Kim
> Date:   Tue Aug 6 17:36:42 2013 +0900
>
>     sched: Factor out code to should_we_balance()
>
>     Now checking whether this cpu is appropriate to balance or not
>     is embedded into update_sg_lb_stats() and this checking has no direct
>     relationship to this function. There is not enough reason to place
>     this checking at update_sg_lb_stats(), except saving one iteration
>     for sched_group_cpus.
>
> Now, I couldn't revert that patch by itself, but I reverted the
> series of about 10 scheduler patches in that series total from a
> current TOT and the regression went away. Hence I'm pretty confident
> that this is the patch causing the issue as I've verified it in
> more than one way and the difference between "good" and "bad" was
> significantly greater than the variance of the test (1.5-2 stddev
> difference).
>
> In more detail:
>
>                     v4 filesystem    v5 filesystem
> 3.11+xfsdev:        220k files/s     225k files/s
> 3.12-git            180k files/s     185k files/s
> 3.12-git-revert     245k files/s     247k files/s
>
> The test vm is a 16p/16GB RAM VM, with a sparse 100TB filesystem
> image sitting on a 4-way RAID0 SSD array formatted with XFS and the
> image file is accessed by virtio+direct IO. The fsmark command line
> is:
>
> time ./fs_mark -D 1 -S0 -n 10 -s 0 -L 32 \
>         -d /mnt/scratch/0 -d /mnt/scratch/1 \
>         -d /mnt/scratch/2 -d /mnt/scratch/3 \
>         -d /mnt/scratch/4 -d /mnt/scratch/5 \
>         -d /mnt/scratch/6 -d /mnt/scratch/7 \
>         -d /mnt/scratch/8 -d /mnt/scratch/9 \
>         -d /mnt/scratch/10 -d /mnt/scratch/11 \
>         -d /mnt/scratch/12 -d /mnt/scratch/13 \
>         -d /mnt/scratch/14 -d /mnt/scratch/15 \
>         | tee >(stats --trim-outliers | tail -1 1>&2)
>
> The workload on XFS runs to almost being CPU bound - the effect of
> the above patch was that there was a lot of idle time left in the
> system. The workload consumed the same amount of user and system
> CPU, just instantaneous CPU usage was reduced by 20-30% and the
> elapsed time was increased by 20-30%.

Hello, Dave.

I looked at this patch again and found one mistake.
If we find that we are the appropriate cpu for balancing,
should_we_balance() should return 1, but the current code doesn't do so.
This corresponds with your observation that a lot of idle time is left.

Could you re-test your benchmark with the patch below?

Thanks.

--->8---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7f0a5e6..9b3fe1c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5151,7 +5151,7 @@ static int should_we_balance(struct lb_env *env)
 	 * First idle cpu or the first cpu(busiest) in this sched group
 	 * is eligible for doing load balancing at this and above domains.
 	 */
-	return balance_cpu != env->dst_cpu;
+	return balance_cpu == env->dst_cpu;
 }
 
 /*
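[Archive note] The one-character fix in this message is easiest to see with the two variants side by side. The sketch below is user-space C under assumptions: struct lb_env_sketch is a cut-down stand-in for the kernel's struct lb_env with only the dst_cpu field, balance_cpu is passed in as a parameter instead of being computed from the sched group, and the _buggy/_fixed function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for struct lb_env; only the field used here. */
struct lb_env_sketch {
    int dst_cpu;  /* CPU on which load balancing is being attempted */
};

/* Before the fix: every CPU except the designated one says "balance". */
bool should_we_balance_buggy(const struct lb_env_sketch *env, int balance_cpu)
{
    return balance_cpu != env->dst_cpu;
}

/* After the fix: only the designated balance CPU says "balance". */
bool should_we_balance_fixed(const struct lb_env_sketch *env, int balance_cpu)
{
    return balance_cpu == env->dst_cpu;
}
```

In the buggy variant the CPU that was selected to do the balancing is exactly the one that declines, which is consistent with the idle time reported in the quoted benchmark.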
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 09:13 PM, Stephen Warren wrote:
> On 09/09/2013 09:53 PM, Guenter Roeck wrote:
>> On 09/09/2013 08:40 PM, Stephen Warren wrote:
>>> On 09/09/2013 09:36 PM, Guenter Roeck wrote:
> ...
>>>> My understanding is that by adding regulator support you essentially
>>>> committed to adding regulators (if necessary dummy ones) for this driver
>>>> to all those platforms. This is quite similar to other drivers in the
>>>> same situation. Once you start along that route, you'll have to go it
>>>> all the way.
>>>
>>> By using regulator_get_optional(), the regulator should be optional,
>>> hence you only have to add it to platforms that need it.
>>>
>>
>> Earlier comments suggest that this is not the intended use case for
>> regulator_get_optional().
>
> Isn't the issue only whether the optional aspect of the regulator is
> implemented by:
>
> a) regulator_get_optional() returning failure, then the driver having to
> check for that and either using or not-using the regulator.
>
> b) regulator_get_optional() returning a dummy regulator automatically
> when none is specified in DT or the regulator lookup table, and hence
> the driver can always call regulator_enable/disable on the returned value.
>

I don't know. The regulator folks would have to answer that.

Guenter
[PATCH] [RFC] seqcount: Add lockdep functionality to seqcount/seqlock structures
Currently seqlocks and seqcounts don't support lockdep. After running across a seqcount related deadlock in the timekeeping code, I used a less-refined and more focused variant of this patch to narrow down the cause of the issue. This is a first-pass attempt to properly enable lockdep functionality on seqlocks and seqcounts. Due to seqlocks/seqcounts having slightly different possible semantics than standard locks (ie: reader->reader and reader->writer recursion is fine, but writer->reader is not), this implementation is probably not as exact as I'd like (currently using a hack by only spot checking readers), and may be overly strict in some cases. I've handled one case where there were nested seqlock writers, and there may be more edge cases, as while I've gotten it to run cleanly, depending on config it's reporting issues that I'm not sure if they are flaws in the implementation or actual bugs. But I wanted to send this out for some initial thoughts as until today I hadn't looked at much of the lockdep infrastructure. So I'm sure there are improvements that could be made. Comments and feedback would be appreciated!
Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: John Stultz --- arch/x86/vdso/vclock_gettime.c | 24 +- fs/dcache.c| 4 +-- fs/fs_struct.c | 2 +- include/linux/init_task.h | 8 ++--- include/linux/lockdep.h| 15 + include/linux/seqlock.h| 72 -- mm/filemap_xip.c | 2 +- 7 files changed, 108 insertions(+), 19 deletions(-) diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c index c74436e..1797387 100644 --- a/arch/x86/vdso/vclock_gettime.c +++ b/arch/x86/vdso/vclock_gettime.c @@ -178,13 +178,15 @@ notrace static int __always_inline do_realtime(struct timespec *ts) ts->tv_nsec = 0; do { - seq = read_seqcount_begin(>seq); + seq = __read_seqcount_begin(>seq); + smp_rmb(); mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->wall_time_sec; ns = gtod->wall_time_snsec; ns += vgetsns(); ns >>= gtod->clock.shift; - } while (unlikely(read_seqcount_retry(>seq, seq))); + smp_rmb(); + } while (unlikely(__read_seqcount_retry(>seq, seq))); timespec_add_ns(ts, ns); return mode; @@ -198,13 +200,15 @@ notrace static int do_monotonic(struct timespec *ts) ts->tv_nsec = 0; do { - seq = read_seqcount_begin(>seq); + seq = __read_seqcount_begin(>seq); + smp_rmb(); mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->monotonic_time_sec; ns = gtod->monotonic_time_snsec; ns += vgetsns(); ns >>= gtod->clock.shift; - } while (unlikely(read_seqcount_retry(>seq, seq))); + smp_rmb(); + } while (unlikely(__read_seqcount_retry(>seq, seq))); timespec_add_ns(ts, ns); return mode; @@ -214,10 +218,12 @@ notrace static int do_realtime_coarse(struct timespec *ts) { unsigned long seq; do { - seq = read_seqcount_begin(>seq); + seq = __read_seqcount_begin(>seq); + smp_rmb(); ts->tv_sec = gtod->wall_time_coarse.tv_sec; ts->tv_nsec = gtod->wall_time_coarse.tv_nsec; - } while (unlikely(read_seqcount_retry(>seq, seq))); + smp_rmb(); + } while (unlikely(__read_seqcount_retry(>seq, seq))); return 0; } @@ -225,10 +231,12 @@ notrace static int 
do_monotonic_coarse(struct timespec *ts) { unsigned long seq; do { - seq = read_seqcount_begin(>seq); + seq = __read_seqcount_begin(>seq); + smp_rmb(); ts->tv_sec = gtod->monotonic_time_coarse.tv_sec; ts->tv_nsec = gtod->monotonic_time_coarse.tv_nsec; - } while (unlikely(read_seqcount_retry(>seq, seq))); + smp_rmb(); + } while (unlikely(__read_seqcount_retry(>seq, seq))); return 0; } diff --git a/fs/dcache.c b/fs/dcache.c index 96655f4..9f97a88 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2259,7 +2259,7 @@ static void __d_move(struct dentry * dentry, struct dentry * target) dentry_lock_for_move(dentry, target); write_seqcount_begin(>d_seq); - write_seqcount_begin(>d_seq); + write_seqcount_begin_nested(>d_seq, DENTRY_D_LOCK_NESTED); /* __d_drop does write_seqcount_barrier, but they're OK to nest. */ @@ -2391,7 +2391,7 @@ static void __d_materialise_dentry(struct dentry *dentry, struct dentry *anon) dentry_lock_for_move(anon, dentry); write_seqcount_begin(>d_seq); - write_seqcount_begin(>d_seq); +
linux-next: manual merge of the akpm tree with the vfs tree
Hi Andrew, Today's linux-next merge of the akpm tree got a conflict in fs/super.c between commit d040790391f2 ("prune_super(): sb->s_op is never NULL") from the vfs tree and commit "fs: convert inode and dentry shrinking to be node aware" from the akpm tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc fs/super.c index cd3c2cd,efeabe8..000 --- a/fs/super.c +++ b/fs/super.c @@@ -75,11 -75,11 +75,11 @@@ static unsigned long super_cache_scan(s if (!grab_super_passive(sb)) return SHRINK_STOP; - if (sb->s_op && sb->s_op->nr_cached_objects) + if (sb->s_op->nr_cached_objects) - fs_objects = sb->s_op->nr_cached_objects(sb); + fs_objects = sb->s_op->nr_cached_objects(sb, sc->nid); - inodes = list_lru_count(>s_inode_lru); - dentries = list_lru_count(>s_dentry_lru); + inodes = list_lru_count_node(>s_inode_lru, sc->nid); + dentries = list_lru_count_node(>s_dentry_lru, sc->nid); total_objects = dentries + inodes + fs_objects + 1; /* proportion the scan between the caches */ pgpM9F0WVONSa.pgp Description: PGP signature
linux-next: manual merge of the akpm tree with Linus' tree
Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in fs/dcache.c
between commit 8aab6a27332b ("vfs: reorganize dput() memory accesses")
from Linus' tree and commit "dcache: convert to use new lru list
infrastructure" from the akpm tree.

/me mutters about development happening during the merge window -
especially when Andrew is absent.

I have no idea if this will be correct, but I just used the version from
the akpm tree (effectively reverting parts of commit 8aab6a27332b) and
can carry the fix as necessary (no action is required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH v3 2/2] Documentation: dt: hwmon: add OF document for LM90
On 09/10/2013 12:35 PM, Wei Ni wrote: > On 09/09/2013 06:57 PM, Ramkumar Ramachandra wrote: >> Wei Ni wrote: >>> diff --git a/Documentation/devicetree/bindings/hwmon/lm90.txt >>> b/Documentation/devicetree/bindings/hwmon/lm90.txt >>> new file mode 100644 >>> index 000..5570875 >>> --- /dev/null >>> +++ b/Documentation/devicetree/bindings/hwmon/lm90.txt >> >> While at it, please update and rename ads1015.txt. > > This series is for lm90 compatible devices, it seems ads1015 is not > register-compatible with lm90. > >> >>> @@ -0,0 +1,44 @@ >>> +* LM90 series thermometer. >>> + >>> +Required node properties: >>> +- compatible: manufacture and chip name, one of >>> + ",adm1032" >>> + ",adt7461" >>> + ",adt7461a" >>> + ",g781" >>> + ",lm90" >>> + ",lm86" >>> + ",lm89" >>> + ",lm99" >> >> More versions of this of different ages are required. > > In here, the chip names, such as "lm99", come from lm90.c id_table, the > i2c subsystem want to use it to load driver, so we can add any other names. Sorry, it's typo, it should be "can't add any other names". > >> >>> + ",max6646" >>> + ",max6647" >>> + ",max6649" >>> + ",max6657" >>> + ",max6658" >>> + ",max6659" >>> + ",max6680" >>> + ",max6681" >>> + ",max6695" >>> + ",max6696" >> >> SSDNow devices are required. > What's the SSDNow device? It seems the lm90.c doesn't support it. > > Thanks. > Wei. > >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-tegra" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] Documentation: dt: hwmon: add OF document for LM90
On 09/09/2013 06:57 PM, Ramkumar Ramachandra wrote: > Wei Ni wrote: >> diff --git a/Documentation/devicetree/bindings/hwmon/lm90.txt >> b/Documentation/devicetree/bindings/hwmon/lm90.txt >> new file mode 100644 >> index 000..5570875 >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/hwmon/lm90.txt > > While at it, please update and rename ads1015.txt. This series is for lm90 compatible devices, it seems ads1015 is not register-compatible with lm90. > >> @@ -0,0 +1,44 @@ >> +* LM90 series thermometer. >> + >> +Required node properties: >> +- compatible: manufacture and chip name, one of >> + ",adm1032" >> + ",adt7461" >> + ",adt7461a" >> + ",g781" >> + ",lm90" >> + ",lm86" >> + ",lm89" >> + ",lm99" > > More versions of this of different ages are required. In here, the chip names, such as "lm99", come from lm90.c id_table, the i2c subsystem want to use it to load driver, so we can add any other names. > >> + ",max6646" >> + ",max6647" >> + ",max6649" >> + ",max6657" >> + ",max6658" >> + ",max6659" >> + ",max6680" >> + ",max6681" >> + ",max6695" >> + ",max6696" > > SSDNow devices are required. What's the SSDNow device? It seems the lm90.c doesn't support it. Thanks. Wei. > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] Documentation: dt: hwmon: add OF document for LM90
On 09/10/2013 06:23 AM, Guenter Roeck wrote: > On Mon, Sep 09, 2013 at 04:15:57PM -0600, Stephen Warren wrote: >> On 09/09/2013 04:29 AM, Wei Ni wrote: >>> Add OF document for LM90 in Documentation/devicetree/. >> >>> diff --git a/Documentation/devicetree/bindings/hwmon/lm90.txt >>> b/Documentation/devicetree/bindings/hwmon/lm90.txt >> >>> +Optional properties: >> >>> +- interrupts: Contains a single interrupt specifier which >>> describes the + LM90 pin6 output. >> >> Does the pin also have a standard name you could include here, rather >> than (or in addition to) just the pin number? >> > This is the "ALERT" pin (or maybe "-ALERT" to reflect that it is low-active). Thanks, Guenter, I will update it. > > Guenter > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] Documentation: dt: hwmon: add OF document for LM90
On 09/10/2013 06:14 AM, Stephen Warren wrote: > On 09/09/2013 04:52 AM, Guenter Roeck wrote: >> On 09/09/2013 03:29 AM, Wei Ni wrote: >>> Add OF document for LM90 in Documentation/devicetree/. > >>> diff --git a/Documentation/devicetree/bindings/hwmon/lm90.txt > >>> +* LM90 series thermometer. >>> + >>> +Required node properties: >>> +- compatible: manufacture and chip name, one of >> >> manufacturer >> >>> +",adm1032" > ... >>> +",sa56004" >> >> Shouldn't the manufacturers be listed explicitly ? > > Yes, and they must all be present in > Documentation/devicetree/bindings/vendor-prefixes.txt. Oh, got it, I will update it, thanks. > >>> +Example LM90 node: >>> + >>> +temp-sensor { >>> +compatible = "onnn,nct1008"; >>> +reg = <0x4c>; >>> +vcc-supply = <_ldo6_reg>; >>> +interrupt-parent = <>; >> >> List above as optional property. > > I believe it's common not to document this, since it's implicitly > supported as part of any node that is an interrupt source. > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
linux-next: manual merge of the akpm tree with Linus' tree
Hi Andrew, Today's linux-next merge of the akpm tree got a conflict in fs/dcache.c between commits 8aab6a27332b ("vfs: reorganize dput() memory accesses") and 0d98439ea3c6 ("vfs: use lockred "dead" flag to mark unrecoverably dead dentries") from Linus' tree and commit "dcache: remove dentries from LRU before putting on dispose list" from the akpm tree. I fixed it up (hopefully - see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc fs/dcache.c index a4cc2eb,43a1c0e..000 --- a/fs/dcache.c +++ b/fs/dcache.c @@@ -374,7 -344,6 +374,7 @@@ static void dentry_lru_add(struct dentr static void __dentry_lru_del(struct dentry *dentry) { list_del_init(>d_lru); - dentry->d_flags &= ~(DCACHE_SHRINK_LIST | DCACHE_LRU_LIST); ++ dentry->d_flags &= ~DCACHE_LRU_LIST; dentry->d_sb->s_nr_dentry_unused--; this_cpu_dec(nr_dentry_unused); } @@@ -393,14 -372,15 +403,16 @@@ static void dentry_lru_del(struct dentr static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list) { + BUG_ON(dentry->d_flags & DCACHE_SHRINK_LIST); + spin_lock(>d_sb->s_dentry_lru_lock); if (list_empty(>d_lru)) { + dentry->d_flags |= DCACHE_LRU_LIST; list_add_tail(>d_lru, list); - dentry->d_sb->s_nr_dentry_unused++; - this_cpu_inc(nr_dentry_unused); } else { list_move_tail(>d_lru, list); + dentry->d_sb->s_nr_dentry_unused--; + this_cpu_dec(nr_dentry_unused); } spin_unlock(>d_sb->s_dentry_lru_lock); } @@@ -498,7 -478,8 +510,8 @@@ EXPORT_SYMBOL(d_drop) * If ref is non-zero, then decrement the refcount too. * Returns dentry requiring refcount drop, or NULL if we're done. 
*/ - static inline struct dentry *dentry_kill(struct dentry *dentry) + static inline struct dentry * -dentry_kill(struct dentry *dentry, int ref, int unlock_on_failure) ++dentry_kill(struct dentry *dentry, int unlock_on_failure) __releases(dentry->d_lock) { struct inode *inode; @@@ -591,7 -573,7 +606,7 @@@ repeat return; kill_it: - dentry = dentry_kill(dentry); - dentry = dentry_kill(dentry, 1, 1); ++ dentry = dentry_kill(dentry, 1); if (dentry) goto repeat; } @@@ -816,7 -798,7 +831,7 @@@ static struct dentry * try_prune_one_de { struct dentry *parent; - parent = dentry_kill(dentry); - parent = dentry_kill(dentry, 0, 0); ++ parent = dentry_kill(dentry, 0); /* * If dentry_kill returns NULL, we have nothing more to do. * if it returns the same dentry, trylocks failed. In either @@@ -836,9 -818,10 +851,10 @@@ dentry = parent; while (dentry) { if (lockref_put_or_lock(>d_lockref)) - return; - dentry = dentry_kill(dentry); + return NULL; - dentry = dentry_kill(dentry, 1, 1); ++ dentry = dentry_kill(dentry, 1); } + return NULL; } static void shrink_dentry_list(struct list_head *list) pgpbBbUvZa07K.pgp Description: PGP signature
[PULL REQUEST] md update for v3.12
The following changes since commit d8dfad3876e438b759da3c833d62fb8b2267:

  Linux 3.11-rc7 (2013-08-25 17:43:22 -0700)

are available in the git repository at:

  git://neil.brown.name/md/ tags/md/3.12

for you to fetch changes up to bfc90cb0936f5b972706625f38f72c7cb726c20a:

  raid5: only wakeup necessary threads (2013-09-02 10:31:29 +1000)

md update for v3.12

Headline item is multithreading for RAID5 so that more IO/sec
can be supported on fast (SSD) devices.
Also TILE-Gx SIMD support for RAID6 calculations and an
assortment of bug fixes.

Dave Jones (1):
      md: Fix apparent cut-and-paste error in super_90_validate

Ken Steele (1):
      RAID: add tilegx SIMD implementation of raid6

Max Filippov (1):
      raid6/test: replace echo -e with printf

NeilBrown (6):
      md: don't call md_allow_write in get_bitmap_file.
      md: fix safe_mode buglet.
      md: Don't test all of mddev->flags at once.
      md: avoid deadlock when dirty buffers during md_stop.
      md/raid5: use seqcount to protect access to shape in make_request.
      md/raid5: flush out all pending requests before proceeding with reshape.

Shaohua Li (5):
      raid5: make release_stripe lockless
      raid5: fix stripe release order
      raid5: offload stripe handle to workqueue
      raid5: sysfs entry to control worker thread number
      raid5: only wakeup necessary threads

 drivers/md/md.c         |  54 +---
 drivers/md/md.h         |   8 +-
 drivers/md/raid5.c      | 362 +---
 drivers/md/raid5.h      |  22 +++
 include/linux/raid/pq.h |   1 +
 lib/raid6/Makefile      |   6 +
 lib/raid6/algos.c       |   3 +
 lib/raid6/test/Makefile |   9 +-
 lib/raid6/tilegx.uc     |  86
 9 files changed, 510 insertions(+), 41 deletions(-)
 create mode 100644 lib/raid6/tilegx.uc
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 09:53 PM, Guenter Roeck wrote:
> On 09/09/2013 08:40 PM, Stephen Warren wrote:
>> On 09/09/2013 09:36 PM, Guenter Roeck wrote:
...
>>> My understanding is that by adding regulator support you essentially
>>> committed to adding regulators (if necessary dummy ones) for this driver
>>> to all those platforms. This is quite similar to other drivers in the
>>> same situation. Once you start along that route, you'll have to go it
>>> all the way.
>>
>> By using regulator_get_optional(), the regulator should be optional,
>> hence you only have to add it to platforms that need it.
>>
>
> Earlier comments suggest that this is not the intended use case for
> regulator_get_optional().

Isn't the issue only whether the optional aspect of the regulator is
implemented by:

a) regulator_get_optional() returning failure, then the driver having to
check for that and either using or not-using the regulator.

b) regulator_get_optional() returning a dummy regulator automatically
when none is specified in DT or the regulator lookup table, and hence
the driver can always call regulator_enable/disable on the returned value.
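The two options (a) and (b) can be contrasted in a small userspace model. This is only a sketch under stated assumptions: every `model_*` and `probe_option_*` name is hypothetical and stands in for the regulator core and a driver's probe path. It does not reproduce the kernel's regulator API, and `-ENODEV` is just an assumed choice for "no supply described".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a supply; not the kernel's struct regulator. */
struct model_regulator {
	int enable_count;
};

/* Stands in for a dummy supply handed out by the core in option (b). */
static struct model_regulator model_dummy;

/* Option (a): the lookup fails when the platform describes no supply. */
static int model_get_optional(struct model_regulator *described,
			      struct model_regulator **out)
{
	if (!described) {
		*out = NULL;
		return -ENODEV;
	}
	*out = described;
	return 0;
}

/* Option (b): a missing supply is silently replaced by the dummy. */
static struct model_regulator *model_get_with_dummy(struct model_regulator *described)
{
	return described ? described : &model_dummy;
}

static int model_enable(struct model_regulator *reg)
{
	reg->enable_count++;
	return 0;
}

/* Probe under option (a): the driver has to branch on the lookup result. */
static int probe_option_a(struct model_regulator *described)
{
	struct model_regulator *reg;
	int err = model_get_optional(described, &reg);

	if (err == -ENODEV)
		return 0;	/* no supply described: assume always powered */
	if (err)
		return err;	/* any other error is real and is passed up */
	return model_enable(reg);
}

/* Probe under option (b): no branching, enable is always safe to call. */
static int probe_option_b(struct model_regulator *described)
{
	return model_enable(model_get_with_dummy(described));
}
```

The model mainly shows where the "is there a supply?" decision lives: in the driver for (a), in the lookup for (b). Which of the two the regulator core intends is exactly the open question in this thread.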
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/10/2013 11:53 AM, Guenter Roeck wrote: > On 09/09/2013 08:40 PM, Stephen Warren wrote: >> On 09/09/2013 09:36 PM, Guenter Roeck wrote: >>> On 09/09/2013 08:22 PM, Wei Ni wrote: On 09/09/2013 11:50 PM, Guenter Roeck wrote: > On Mon, Sep 09, 2013 at 02:50:22PM +0100, Mark Brown wrote: >> On Mon, Sep 09, 2013 at 04:34:43AM -0700, Guenter Roeck wrote: >>> On 09/09/2013 04:12 AM, Mark Brown wrote: On Mon, Sep 09, 2013 at 06:29:11PM +0800, Wei Ni wrote: >> This doesn't look good, it is going to ignore actual errors - I *really* doubt that vcc is optional, it looks like it's the main power supply for the device. You should use normal regulator_get(), _optional() is for supplies which could physically not be provided in a system (eg, if the device can generate them internally if required). >> >>> Then he'll have to make sure that all devicetree files in the system >>> contain references to this regulator. >> >> Or get the patches applied on top of the code that'll be going in this >> cycle implementing get_optional() properly - when that's done the >> default will be to provide a dummy supply for regulator_get(). If you >> ack the patch I'd be happy to carry it. >> > Jean will have to ack it. > I think it's better to use get_optional(), and ignore the errors except -EPROBE_DEFER. Because many platform may always power on this device, and will not provide regulator for it, so if we get errors from regulator subsystem and return it directly, then the probe() can't be implemented, this driver can't work properly, even though it can work without regulator support. Mark, do you mean you have patches for regulator_get_optional() and regulator_get()? >>> >>> My understanding is that by adding regulator support you essentially >>> committed to adding regulators (if necessary dummy ones) for this driver >>> to all those platforms. This is quite similar to other drivers in the >>> same situation. Once you start along that route, you'll have to go it >>> all the way. 
>>
>> By using regulator_get_optional(), the regulator should be optional,
>> hence you only have to add it to platforms that need it.
>>
>
> Earlier comments suggest that this is not the intended use case for
> regulator_get_optional().

So I just need to use regulator_get() instead, is that right?

>
> Guenter
>
Re: linux-next: manual merge of the akpm tree with Linus' tree
[ Just adding Dave Chinner to the cc list] On Tue, 10 Sep 2013 14:09:23 +1000 Stephen Rothwell wrote: > > Hi Andrew, > > Today's linux-next merge of the akpm tree got a conflict in fs/dcache.c > between commit 8aab6a27332b ("vfs: reorganize dput() memory accesses") > from Linus' tree and commit "dentry: move to per-sb LRU locks" from the > akpm tree. > > I fixed it up (I think - see below) and can carry the fix as necessary > (no action is required). > > -- > Cheers, > Stephen Rothwells...@canb.auug.org.au > > diff --cc fs/dcache.c > index 664554e,6e212bd..000 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@@ -362,9 -332,8 +361,9 @@@ static void dentry_unlink_inode(struct >*/ > static void dentry_lru_add(struct dentry *dentry) > { > -if (list_empty(>d_lru)) { > +if (unlikely(!(dentry->d_flags & DCACHE_LRU_LIST))) { > - spin_lock(_lru_lock); > + spin_lock(>d_sb->s_dentry_lru_lock); > +dentry->d_flags |= DCACHE_LRU_LIST; > list_add(>d_lru, >d_sb->s_dentry_lru); > dentry->d_sb->s_nr_dentry_unused++; > this_cpu_inc(nr_dentry_unused); > @@@ -394,9 -363,8 +393,9 @@@ static void dentry_lru_del(struct dentr > > static void dentry_lru_move_list(struct dentry *dentry, struct list_head > *list) > { > - spin_lock(_lru_lock); > + spin_lock(>d_sb->s_dentry_lru_lock); > if (list_empty(>d_lru)) { > +dentry->d_flags |= DCACHE_LRU_LIST; > list_add_tail(>d_lru, list); > dentry->d_sb->s_nr_dentry_unused++; > this_cpu_inc(nr_dentry_unused); -- Cheers, Stephen Rothwells...@canb.auug.org.au pgpWKVa1msSuL.pgp Description: PGP signature
linux-next: manual merge of the akpm tree with Linus' tree
Hi Andrew, Today's linux-next merge of the akpm tree got a conflict in fs/dcache.c between commit 8aab6a27332b ("vfs: reorganize dput() memory accesses") from Linus' tree and commit "dentry: move to per-sb LRU locks" from the akpm tree. I fixed it up (I think - see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc fs/dcache.c index 664554e,6e212bd..000 --- a/fs/dcache.c +++ b/fs/dcache.c @@@ -362,9 -332,8 +361,9 @@@ static void dentry_unlink_inode(struct */ static void dentry_lru_add(struct dentry *dentry) { - if (list_empty(>d_lru)) { + if (unlikely(!(dentry->d_flags & DCACHE_LRU_LIST))) { - spin_lock(_lru_lock); + spin_lock(>d_sb->s_dentry_lru_lock); + dentry->d_flags |= DCACHE_LRU_LIST; list_add(>d_lru, >d_sb->s_dentry_lru); dentry->d_sb->s_nr_dentry_unused++; this_cpu_inc(nr_dentry_unused); @@@ -394,9 -363,8 +393,9 @@@ static void dentry_lru_del(struct dentr static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list) { - spin_lock(_lru_lock); + spin_lock(>d_sb->s_dentry_lru_lock); if (list_empty(>d_lru)) { + dentry->d_flags |= DCACHE_LRU_LIST; list_add_tail(>d_lru, list); dentry->d_sb->s_nr_dentry_unused++; this_cpu_inc(nr_dentry_unused); pgpWW1PQOk_DL.pgp Description: PGP signature
Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?
On Mon, 2013-09-09 at 14:49 +, Christoph Lameter wrote:

> It's just that PREEMPT kernels are
> not in use and AFAICT the full preempt stuff requires significant developer
> support and complicates the code without much benefit.

The openSUSE desktop kernel is PREEMPT, and presumably has users.

I use VOLUNTARY for my desktop, having zero tight latency constraints,
and not being into inflicting needless pain on poor defenseless boxen,
but lots of folks do use PREEMPT, and some of them may even need it.

-Mike
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/10/2013 04:39 AM, Mark Brown wrote: > * PGP Signed by an unknown key > > On Mon, Sep 09, 2013 at 09:17:35AM -0700, Guenter Roeck wrote: >> On Mon, Sep 09, 2013 at 05:02:37PM +0100, Mark Brown wrote: > >>> It does, though it gets complicated trying to use it for a case like >>> this since you can't really tell if the regulator was powered on >>> immediately before the device got probed by another device on the bus. > >> Why not ? Just keep a timestamp. > > The support is a callback on state changes; we could keep a timestamp > but there's still going to be race conditions around bootloaders. It's > doable though. > On a higher level, I wonder if such functionality should be added in the i2c subsystem and not in i2c client drivers. Has anyone thought about this ? > >>> I'm not sure what the subsystem would do for such delays? It's fairly >>> common for things that need this to also want to do things like >>> manipulate GPIOs as part of the power on sequence so the applicability >>> is relatively limited, plus it's not even I2C specific, the same applies >>> to other buses so it ought to be a driver core thing. > >> Possibly. I just thought about i2c since it also takes care of basic >> devicetree bindings. Something along the line of >> if devicetree bindings for this device declare one or more >> regulators, enable those regulators before calling the driver >> probe function. > > That's definitely a driver core thing, not I2C - there's nothing > specific to I2C in there at all, needing power is pretty generic. I > have considered this before, something along the lines of what we have > for pinctrl, but unfortunately the generic case isn't quite generic > enough to make it easy. It'd need to be an explicit list of regulators > (partly just to make it opt in and avoid breaking things) and you'd want > to have a way of handling the different suspend/resume behaviour that > devices want. There's a few patterns there. 
>
> It's definitely something I think about from time to time and it would
> be useful to factor things out, the issue is getting a good enough model
> of what's going on.
>
>>> There was some work on a generic helper for power on sequences but it
>>> stalled since it wasn't accepted for the original purpose (LCD panel
>>> power ons IIRC).
>
>> Too bad. I think it could be kept quite simple, though, by handling it
>> through the regulator subsystem as suggested above. A generic binding
>> for a per-regulator and per-device poweron delay should solve that
>> and possibly even make it transparent to the actual driver code.
>
> Lots of things have a GPIO for reset too, and some want clocks too. For
> maximum usefulness this should be cross subsystem. I suspect the reset
> controller API may be able to handle some of it.
>
> The regulator power on delays are already handled transparently, by the
> time regulator_enable() returns the ramp should be finished.

I think the regulator should encode its own startup delay. Each
individual device should handle its own requirements for delay after
power is stable.

regulator_enable() will handle the delays for the regulator device, and
the added msleep(25) is for the lm90 device. Without the delay, the
device sometimes can't work properly: if the lm90 registers are read
immediately after enabling the regulator, the read may fail.

I'm not sure if 25ms is the right value. I read the LM90 spec, and the
max of "SMBus Clock Low Time" is 25ms, so I supposed that it may need
about 25ms to stabilize after power on.

Thanks.
Wei.
Re: [112/121] m32r: consistently use "suffix-$(...)"
Acked-by: Hirokazu Takata Sorry, it is my old mistake that still remained in the m32r kernel. Please apply this patch. Thanks, -- Takata From: Ben Hutchings Subject: [112/121] m32r: consistently use "suffix-$(...)" Date: Sun, 08 Sep 2013 03:52:01 +0100 > 3.2.51-rc1 review patch. If anyone has any objections, please let me know. > > -- > > From: Geert Uytterhoeven > > commit df12aef6a19bb2d69859a94936bda0e6ccaf3327 upstream. > > Commit a556bec9955c ("m32r: fix arch/m32r/boot/compressed/Makefile") > changed "$(suffix_y)" to "$(suffix-y)", but didn't update any location > where "suffix_y" is set, causing: > > make[5]: *** No rule to make target > `arch/m32r/boot/compressed/vmlinux.bin.', needed by > `arch/m32r/boot/compressed/piggy.o'. Stop. > make[4]: *** [arch/m32r/boot/compressed/vmlinux] Error 2 > make[3]: *** [zImage] Error 2 > > Correct the other locations to fix this. > > Signed-off-by: Geert Uytterhoeven > Cc: Hirokazu Takata > Signed-off-by: Andrew Morton > Signed-off-by: Linus Torvalds > Signed-off-by: Ben Hutchings > --- > arch/m32r/boot/compressed/Makefile | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/m32r/boot/compressed/Makefile > b/arch/m32r/boot/compressed/Makefile > index 177716b..01729c2 100644 > --- a/arch/m32r/boot/compressed/Makefile > +++ b/arch/m32r/boot/compressed/Makefile > @@ -43,9 +43,9 @@ endif > > OBJCOPYFLAGS += -R .empty_zero_page > > -suffix_$(CONFIG_KERNEL_GZIP) = gz > -suffix_$(CONFIG_KERNEL_BZIP2)= bz2 > -suffix_$(CONFIG_KERNEL_LZMA) = lzma > +suffix-$(CONFIG_KERNEL_GZIP) = gz > +suffix-$(CONFIG_KERNEL_BZIP2)= bz2 > +suffix-$(CONFIG_KERNEL_LZMA) = lzma > > $(obj)/piggy.o: $(obj)/vmlinux.scr $(obj)/vmlinux.bin.$(suffix-y) FORCE > $(call if_changed,ld) > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[performance regression, bisected] scheduler: should_we_balance() kills filesystem performance
Hi folks, I just updated my performance test VM to the current 3.12-git tree after the XFS dev branch was merged. The first test I ran which was a 16-way concurrent fsmark test to create lots of files gave me a number about 30% lower than I expected - ~180k files/s when I was expecting somewhere around 250k files/s. I did a bisect, and the bisect landed on this commit: commit 23f0d2093c789e612185180c468fa09063834e87 Author: Joonsoo Kim Date: Tue Aug 6 17:36:42 2013 +0900 sched: Factor out code to should_we_balance() Now checking whether this cpu is appropriate to balance or not is embedded into update_sg_lb_stats() and this checking has no direct relationship to this function. There is not enough reason to place this checking at update_sg_lb_stats(), except saving one iteration for sched_group_cpus. Now, i couldn't revert that patch by itself, but I reverted the series of about 10 scheduler patches in that series total from a current TOT and the regression went away. Hence I'm pretty confident that the this is the patch causing the issue as i've verified it in more than one way and the difference between "good" and "bad" was signficantlt greater than the variance of the test (1.5-2 stddev difference). In more detail: v4 filesystem v5 filesystem 3.11+xfsdev:220k files/s225k files/s 3.12-git180k files/s185k files/s 3.12-git-revert 245k files/s247k files/s The test vm is a 16p/16GB RAM VM, with a sparse 100TB filesystem image sitting on a 4-way RAID0 SSD array formatted with XFS and the image file is accessed by virtio+direct IO. 
The fsmark command line is:

    time ./fs_mark -D 1 -S0 -n 10 -s 0 -L 32 \
            -d /mnt/scratch/0 -d /mnt/scratch/1 \
            -d /mnt/scratch/2 -d /mnt/scratch/3 \
            -d /mnt/scratch/4 -d /mnt/scratch/5 \
            -d /mnt/scratch/6 -d /mnt/scratch/7 \
            -d /mnt/scratch/8 -d /mnt/scratch/9 \
            -d /mnt/scratch/10 -d /mnt/scratch/11 \
            -d /mnt/scratch/12 -d /mnt/scratch/13 \
            -d /mnt/scratch/14 -d /mnt/scratch/15 \
            | tee >(stats --trim-outliers | tail -1 1>&2)

The workload on XFS runs to almost being CPU bound - the effect of the above patch was that there was a lot of idle time left in the system. The workload consumed the same amount of user and system CPU, just instantaneous CPU usage was reduced by 20-30% and the elapsed time was increased by 20-30%.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
On 09/09/2013 08:40 PM, George Spelvin wrote:
> I'm really wondering about only trying once before taking the write lock.
> Yes, using the lsbit is a cute hack, but are we using it for its cuteness
> rather than its effectiveness?
>
> Renames happen occasionally. If that causes all the current pathname
> translations to fall back to the write lock, that is fairly heavy.
> Worse, all of those translations will (unnecessarily) bump the write
> seqcount, triggering *other* translations to fall back to the write-lock
> path.
>
> One patch to fix this would be to have the fallback read algorithm take
> sl->lock but *not* touch sl->seqcount, so it wouldn't break concurrent
> readers.

Actually, a follow-up patch that I am planning to do is to introduce a read_seqlock() primitive in seqlock.h that does exactly that. Then the write_seqlock() in this patch will be modified to read_seqlock().

-Longman
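The idea discussed above (a fallback reader that takes the lock but leaves the sequence count alone) can be sketched in userspace C. This is only a rough model under stated assumptions: it is not the kernel's seqlock_t, every demo_* name is hypothetical, and it uses default sequentially consistent C11 atomics instead of the barriers a real implementation would choose.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of a sequence lock. This is not the
 * kernel's seqlock_t; every demo_* name is made up for illustration.
 */
struct demo_seqlock {
	atomic_uint seq;   /* even: no write in progress, odd: writer active */
	atomic_flag lock;  /* spinlock shared by writers and locked readers */
};

static void demo_seqlock_init(struct demo_seqlock *sl)
{
	atomic_init(&sl->seq, 0);
	atomic_flag_clear(&sl->lock);
}

static void demo_spin_lock(struct demo_seqlock *sl)
{
	while (atomic_flag_test_and_set(&sl->lock))
		; /* busy-wait until the holder clears the flag */
}

static void demo_spin_unlock(struct demo_seqlock *sl)
{
	atomic_flag_clear(&sl->lock);
}

/* Writer side: take the lock, then make the sequence odd. */
static void demo_write_lock(struct demo_seqlock *sl)
{
	demo_spin_lock(sl);
	atomic_fetch_add(&sl->seq, 1);
}

/* Writer side: make the sequence even again, then drop the lock. */
static void demo_write_unlock(struct demo_seqlock *sl)
{
	assert(atomic_load(&sl->seq) & 1); /* must pair with demo_write_lock() */
	atomic_fetch_add(&sl->seq, 1);
	demo_spin_unlock(sl);
}

/* Lockless reader: wait for an even sequence and remember it. */
static unsigned int demo_read_begin(struct demo_seqlock *sl)
{
	unsigned int start;

	while ((start = atomic_load(&sl->seq)) & 1)
		; /* a writer is active, sample again */
	return start;
}

/* Lockless reader: true if a writer ran since demo_read_begin(). */
static bool demo_read_retry(struct demo_seqlock *sl, unsigned int start)
{
	return atomic_load(&sl->seq) != start;
}

/*
 * Fallback reader: excludes writers but leaves the sequence untouched,
 * so lockless readers that are already running are not invalidated.
 */
static void demo_read_lock_excl(struct demo_seqlock *sl)
{
	demo_spin_lock(sl);
}

static void demo_read_unlock_excl(struct demo_seqlock *sl)
{
	demo_spin_unlock(sl);
}
```

In this model a lockless reader that sampled the sequence before a demo_read_lock_excl()/demo_read_unlock_excl() pair still passes demo_read_retry(), whereas a demo_write_lock()/demo_write_unlock() pair forces it to retry. That difference is the cost George Spelvin points out when the fallback path uses the write lock.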
Re: [PATCH] depmod: warn on invalid devname specification
Hi Tom, On Mon, Sep 9, 2013 at 3:01 PM, Tom Gundersen wrote: > During the last merge window (3.12) a couple of modules gained devname > aliases, but without the necessary major and minor information. These were > then silently ignored when generating modules.devname. > > Complain loudly to avoid such errors sneaking in undetected in the future: > > depmod: ERROR: Module 'zram' has devname (zram) but lacks major and minor > information. Ignoring. > depmod: ERROR: Module 'uhid' has devname (uhid) but lacks major and minor > information. Ignoring. > > Cc: Kay Sievers > Cc: Lucas De Marchi > --- > tools/depmod.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/tools/depmod.c b/tools/depmod.c > index 985cf3a..5855b2a 100644 > --- a/tools/depmod.c > +++ b/tools/depmod.c > @@ -2120,11 +2120,18 @@ static int output_devname(struct depmod *depmod, FILE > *out) > minor = min; > } > > - if (type != '\0' && devname != NULL) { > + if (type != '\0' && devname != NULL) > + break; > + } > + > + if (devname != NULL) { > + if (type != '\0') > fprintf(out, "%s %s %c%u:%u\n", mod->modname, > devname, type, major, minor); > - break; > - } > + else > + ERR("Module '%s' has devname (%s) but " > + "lacks major and minor information. " > + "Ignoring.\n", mod->modname, devname); > } > } > > -- Patch has been applied. Thanks. Lucas De Marchi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/10/2013 07:36 AM, Michael Ellerman wrote:
> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This patchset
>> also enables SW based branch filtering support for PPC64 platforms which have
>> branch stack sampling support. With this new enablement, the branch filter support
>> for PPC64 platforms have been extended to include all these combinations discussed
>> below with a sample test application program.
>
> ...
>
>> Mixed filters
>> -------------
>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>> Error:
>> The perf.data file has no samples!
>>
>> NOTE: As expected. The HW filters all the branches which are calls and SW tries to find return
>> branches in that given set. Both the filters are mutually exclusive, so obviously no samples
>> found in the end profile.
>
> The semantics of multiple filters is not clear to me. It could be an OR,
> or an AND. You have implemented AND, does that match existing behaviour
> on x86 for example?

I believe it does match. X86 code drops the branch records (originally captured in the LBR) while applying the SW filters.

Regards
Anshuman
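The AND semantics described above, where a record must survive the hardware filter and then the software filter, can be illustrated with a small self-contained C sketch. The types and names here (demo_branch, demo_filter_and, the DEMO_BR_* flags) are hypothetical and are not taken from the perf code; the sketch only models the filtering order under that assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical branch classification flags, for illustration only. */
enum demo_branch_kind {
	DEMO_BR_CALL   = 1 << 0,
	DEMO_BR_RETURN = 1 << 1,
	DEMO_BR_COND   = 1 << 2,
};

/* Hypothetical captured branch record. */
struct demo_branch {
	unsigned long from;
	unsigned long to;
	unsigned int kind; /* exactly one DEMO_BR_* flag */
};

/*
 * Apply a hardware-stage mask and then a software-stage mask.
 * A record is kept only if it passes both stages (logical AND).
 * Surviving records are compacted to the front of recs[] and
 * their count is returned.
 */
static size_t demo_filter_and(struct demo_branch *recs, size_t n,
			      unsigned int hw_mask, unsigned int sw_mask)
{
	size_t kept = 0;

	assert(recs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!(recs[i].kind & hw_mask))
			continue; /* never captured by the hardware stage */
		if (!(recs[i].kind & sw_mask))
			continue; /* dropped by the software stage */
		recs[kept++] = recs[i];
	}
	return kept;
}
```

With a hardware mask of DEMO_BR_CALL and a software mask of DEMO_BR_RETURN nothing can pass both stages, which mirrors the "no samples" result reported for `-j any_call,any_ret`.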
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 08:40 PM, Stephen Warren wrote: On 09/09/2013 09:36 PM, Guenter Roeck wrote: On 09/09/2013 08:22 PM, Wei Ni wrote: On 09/09/2013 11:50 PM, Guenter Roeck wrote: On Mon, Sep 09, 2013 at 02:50:22PM +0100, Mark Brown wrote: On Mon, Sep 09, 2013 at 04:34:43AM -0700, Guenter Roeck wrote: On 09/09/2013 04:12 AM, Mark Brown wrote: On Mon, Sep 09, 2013 at 06:29:11PM +0800, Wei Ni wrote: This doesn't look good, it is going to ignore actual errors - I *really* doubt that vcc is optional, it looks like it's the main power supply for the device. You should use normal regulator_get(), _optional() is for supplies which could physically not be provided in a system (eg, if the device can generate them internally if required). Then he'll have to make sure that all devicetree files in the system contain references to this regulator. Or get the patches applied on top of the code that'll be going in this cycle implementing get_optional() properly - when that's done the default will be to provide a dummy supply for regulator_get(). If you ack the patch I'd be happy to carry it. Jean will have to ack it. I think it's better to use get_optional(), and ignore the errors except -EPROBE_DEFER. Because many platform may always power on this device, and will not provide regulator for it, so if we get errors from regulator subsystem and return it directly, then the probe() can't be implemented, this driver can't work properly, even though it can work without regulator support. Mark, do you mean you have patches for regulator_get_optional() and regulator_get()? My understanding is that by adding regulator support you essentially committed to adding regulators (if necessary dummy ones) for this driver to all those platforms. This is quite similar to other drivers in the same situation. Once you start along that route, you'll have to go it all the way. By using regulator_get_optional(), the regulator should be optional, hence you only have to add it to platforms that need it. 
Earlier comments suggest that this is not the intended use case for regulator_get_optional().

Guenter
Re: [PATCH] ACPI: Move acpi_bus_get_device() from bus.c to scan.c
On 07/27/2013 09:24 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki
> Subject: ACPI: Move acpi_bus_get_device() from bus.c to scan.c
>
> Move acpi_bus_get_device() from bus.c to scan.c which allows
> acpi_bus_data_handler() to become static and clean up the latter.
>
> Signed-off-by: Rafael J. Wysocki
> ---
>  drivers/acpi/bus.c      | 21 -
>  drivers/acpi/scan.c     | 30 ++
>  include/acpi/acpi_bus.h |  1 -
>  3 files changed, 22 insertions(+), 30 deletions(-)
>
> Index: linux-pm/drivers/acpi/bus.c
> ===
> --- linux-pm.orig/drivers/acpi/bus.c
> +++ linux-pm/drivers/acpi/bus.c
> @@ -89,27 +89,6 @@ static struct dmi_system_id dsdt_dmi_tab
>                            Device Management
[cut]
> -EXPORT_SYMBOL(acpi_bus_get_device);
> -
>  acpi_status acpi_bus_get_status_handle(acpi_handle handle,
>                                         unsigned long long *sta)
>  {
> Index: linux-pm/drivers/acpi/scan.c
> ===
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -970,6 +970,28 @@ struct bus_type acpi_bus_type = {
>  	.uevent = acpi_device_uevent,
>  };
[cut]
> +}
> +EXPORT_SYMBOL_GPL(acpi_bus_get_device);
> +

Was it intentional to change the EXPORT_SYMBOL to EXPORT_SYMBOL_GPL here? While I understand that it is completely unsupported anyway, this change does break at least the latest version of the proprietary nvidia graphics driver.

--
Jonathan Callen
Re: [PATCH 00/12] One more attempt at useful kernel lockdown
On Mon, 2013-09-09 at 20:09 -0700, David Lang wrote:
> On Tue, 10 Sep 2013, Matthew Garrett wrote:
> > Someone adds a new "install_evil()" syscall and adds a disable bit. If I
> > don't disable it, I'm now vulnerable. Please pay attention to earlier
> > discussion.
>
> so instead they add install_evil() and don't have it be disabled by your big
> switch.

And that's a bug, so we fix it.

> > Describe the security case for disabling PCI BAR access but permitting
> > i/o port access.
>
> Simple, I have some device that lives on those I/O ports that I want to be able
> to use.

That's a great argument for permitting i/o port access. But in that case, why are you disabling BAR access? You can use the i/o port access to reprogram the device in question into a range you can modify, which allows you to avoid the BAR restriction.

> > Because then someone disables selinux on the kernel command line.
>
> If they can modify the command line, they can remove your command line switch to
> turn this on.

Not in the secure boot case.

> If you really care about this, why are you using a bootloader that lets you
> modify the kernel command line in the first place?

And then the user just modifies the configuration file instead.

> In any case, even if you make it impossible to change from the command line, you
> won't prevent people from changing your system (unless you take control
> completely away from them with TPM or similar)

That's fine. People are free to modify their own systems.

> remember that the system integrity checking of the original Tivo was defeated by
> someone doing a binary patch to bypass a routine that they didn't understand that
> took a lot of time, so they did a binary patch to the bios to speed up boot and
> discovered that what they did was disable the system integrity checking.

That's why modern systems require signed firmware updates.

> users who own the systems are going to modify them and bypass any restrictions
> you want to impose.

The idea isn't to produce something that's impossible for the owner of a system to disable. The idea is to produce something that can't be used to circumvent the security policy that the system owner has chosen. Could you please give me the benefit of the doubt and assume that I'm not completely unaware of how computers work?

--
Matthew Garrett
[PATCH] slub: Fix calculation of cpu slabs
/sys/kernel/slab/:t-048 # cat cpu_slabs
231 N0=16 N1=215
/sys/kernel/slab/:t-048 # cat slabs
145 N0=36 N1=109

See, the number of slabs is smaller than that of cpu slabs.

The bug was introduced by commit 49e2258586b423684f03c278149ab46d8f8b6700 ("slub: per cpu cache for partial pages").

We should use page->pages instead of page->pobjects when calculating the number of cpu partial slabs. This also fixes the mapping of slabs and nodes.

As there's no variable storing the number of total/active objects in cpu partial slabs, and we don't have user interfaces requiring those statistics, I just add WARN_ON for those cases.

Cc: # 3.2+
Signed-off-by: Li Zefan
---
 mm/slub.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index e3ba1f2..6ea461d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4300,7 +4300,13 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 			page = ACCESS_ONCE(c->partial);
 			if (page) {
-				x = page->pobjects;
+				node = page_to_nid(page);
+				if (flags & SO_TOTAL)
+					WARN_ON_ONCE(1);
+				else if (flags & SO_OBJECTS)
+					WARN_ON_ONCE(1);
+				else
+					x = page->pages;
 				total += x;
 				nodes[node] += x;
 			}
--
1.8.0.2
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 09:36 PM, Guenter Roeck wrote: > On 09/09/2013 08:22 PM, Wei Ni wrote: >> On 09/09/2013 11:50 PM, Guenter Roeck wrote: >>> On Mon, Sep 09, 2013 at 02:50:22PM +0100, Mark Brown wrote: On Mon, Sep 09, 2013 at 04:34:43AM -0700, Guenter Roeck wrote: > On 09/09/2013 04:12 AM, Mark Brown wrote: >> On Mon, Sep 09, 2013 at 06:29:11PM +0800, Wei Ni wrote: >> This doesn't look good, it is going to ignore actual errors - I >> *really* >> doubt that vcc is optional, it looks like it's the main power >> supply for >> the device. You should use normal regulator_get(), _optional() is >> for >> supplies which could physically not be provided in a system (eg, >> if the >> device can generate them internally if required). > Then he'll have to make sure that all devicetree files in the system > contain references to this regulator. Or get the patches applied on top of the code that'll be going in this cycle implementing get_optional() properly - when that's done the default will be to provide a dummy supply for regulator_get(). If you ack the patch I'd be happy to carry it. >>> Jean will have to ack it. >>> >> I think it's better to use get_optional(), and ignore the errors except >> -EPROBE_DEFER. Because many platform may always power on this device, >> and will not provide regulator for it, so if we get errors from >> regulator subsystem and return it directly, then the probe() can't be >> implemented, this driver can't work properly, even though it can work >> without regulator support. >> Mark, do you mean you have patches for regulator_get_optional() and >> regulator_get()? >> > > My understanding is that by adding regulator support you essentially > committed to adding regulators (if necessary dummy ones) for this driver > to all those platforms. This is quite similar to other drivers in the > same situation. Once you start along that route, you'll have to go it > all the way. 
By using regulator_get_optional(), the regulator should be optional, hence you only have to add it to platforms that need it.
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 08:22 PM, Wei Ni wrote: On 09/09/2013 11:50 PM, Guenter Roeck wrote: On Mon, Sep 09, 2013 at 02:50:22PM +0100, Mark Brown wrote: On Mon, Sep 09, 2013 at 04:34:43AM -0700, Guenter Roeck wrote: On 09/09/2013 04:12 AM, Mark Brown wrote: On Mon, Sep 09, 2013 at 06:29:11PM +0800, Wei Ni wrote: This doesn't look good, it is going to ignore actual errors - I *really* doubt that vcc is optional, it looks like it's the main power supply for the device. You should use normal regulator_get(), _optional() is for supplies which could physically not be provided in a system (eg, if the device can generate them internally if required). Then he'll have to make sure that all devicetree files in the system contain references to this regulator. Or get the patches applied on top of the code that'll be going in this cycle implementing get_optional() properly - when that's done the default will be to provide a dummy supply for regulator_get(). If you ack the patch I'd be happy to carry it. Jean will have to ack it. I think it's better to use get_optional(), and ignore the errors except -EPROBE_DEFER. Because many platform may always power on this device, and will not provide regulator for it, so if we get errors from regulator subsystem and return it directly, then the probe() can't be implemented, this driver can't work properly, even though it can work without regulator support. Mark, do you mean you have patches for regulator_get_optional() and regulator_get()? My understanding is that by adding regulator support you essentially committed to adding regulators (if necessary dummy ones) for this driver to all those platforms. This is quite similar to other drivers in the same situation. Once you start along that route, you'll have to go it all the way. 
Guenter
Re: [PATCH] cpu/mem hotplug: Add try_online_node() for cpu_up()
(2013/09/10 12:31), Yasuaki Ishimatsu wrote: > (2013/09/10 9:24), Toshi Kani wrote: >> cpu_up() has #ifdef CONFIG_MEMORY_HOTPLUG code blocks, which >> call mem_online_node() to put its node online if offlined and >> then call build_all_zonelists() to initialize the zone list. >> These steps are specific to memory hotplug, and should be >> managed in mm/memory_hotplug.c. lock_memory_hotplug() should >> also be held for the whole steps. >> >> For this reason, this patch replaces mem_online_node() with >> try_online_node(), which performs the whole steps with >> lock_memory_hotplug() held. try_online_node() is named after >> try_offline_node() as they have similar purpose. >> >> There is no functional change in this patch. >> >> Signed-off-by: Toshi Kani >> --- >>include/linux/memory_hotplug.h |8 +++- >>kernel/cpu.c | 29 +++-- >>mm/memory_hotplug.c| 15 +-- >>3 files changed, 23 insertions(+), 29 deletions(-) >> >> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h >> index dd38e62..22203c2 100644 >> --- a/include/linux/memory_hotplug.h >> +++ b/include/linux/memory_hotplug.h >> @@ -94,6 +94,8 @@ extern void __online_page_set_limits(struct page *page); >>extern void __online_page_increment_counters(struct page *page); >>extern void __online_page_free(struct page *page); >> >> +extern int try_online_node(int nid); >> + >>#ifdef CONFIG_MEMORY_HOTREMOVE >>extern bool is_pageblock_removable_nolock(struct page *page); >>extern int arch_remove_memory(u64 start, u64 size); >> @@ -225,6 +227,11 @@ static inline void >> register_page_bootmem_info_node(struct pglist_data *pgdat) >>{ >>} >> >> +static inline int try_online_node(int nid) >> +{ >> +return 0; >> +} >> + >>static inline void lock_memory_hotplug(void) {} >>static inline void unlock_memory_hotplug(void) {} >> >> @@ -256,7 +263,6 @@ static inline void remove_memory(int nid, u64 start, u64 >> size) {} >> >>extern int walk_memory_range(unsigned long start_pfn, unsigned long >> end_pfn, >> void 
*arg, int (*func)(struct memory_block *, void *)); >> -extern int mem_online_node(int nid); >>extern int add_memory(int nid, u64 start, u64 size); >>extern int arch_add_memory(int nid, u64 start, u64 size); >>extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); >> diff --git a/kernel/cpu.c b/kernel/cpu.c >> index d7f07a2..c10b285 100644 >> --- a/kernel/cpu.c >> +++ b/kernel/cpu.c >> @@ -420,11 +420,6 @@ int cpu_up(unsigned int cpu) >>{ >> int err = 0; >> >> -#ifdef CONFIG_MEMORY_HOTPLUG >> -int nid; >> -pg_data_t *pgdat; >> -#endif >> - >> if (!cpu_possible(cpu)) { >> printk(KERN_ERR "can't online cpu %d because it is not " >> "configured as may-hotadd at boot time\n", cpu); >> @@ -435,27 +430,9 @@ int cpu_up(unsigned int cpu) >> return -EINVAL; >> } >> >> -#ifdef CONFIG_MEMORY_HOTPLUG >> -nid = cpu_to_node(cpu); >> -if (!node_online(nid)) { >> -err = mem_online_node(nid); >> -if (err) >> -return err; >> -} >> - >> -pgdat = NODE_DATA(nid); >> -if (!pgdat) { > >> -printk(KERN_ERR >> -"Can't online cpu %d due to NULL pgdat\n", cpu); > > Please move this comments into try_online_node() too. In this case, please use pr_err() instead of printk(). Thanks, Yasuaki Ishimatsu > > Thanks, > Yasuaki Ishimatsu > >> -return -ENOMEM; >> -} >> - >> -if (pgdat->node_zonelists->_zonerefs->zone == NULL) { >> -mutex_lock(_mutex); >> -build_all_zonelists(NULL, NULL); >> -mutex_unlock(_mutex); >> -} >> -#endif >> +err = try_online_node(cpu_to_node(cpu)); >> +if (err) >> +return err; >> >> cpu_maps_update_begin(); >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index ed85fe3..c326bdf 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -1044,14 +1044,19 @@ static void rollback_node_hotadd(int nid, pg_data_t >> *pgdat) >>} >> >> >> -/* >> +/** >> + * try_online_node - online a node if offlined >> + * >> * called by cpu_up() to online a node without onlined memory. 
>> */ >> -int mem_online_node(int nid) >> +int try_online_node(int nid) >>{ >> pg_data_t *pgdat; >> int ret; >> >> +if (node_online(nid)) >> +return 0; >> + >> lock_memory_hotplug(); >> pgdat = hotadd_new_pgdat(nid, 0); >> if (!pgdat) { >> @@ -1062,6 +1067,12 @@ int mem_online_node(int nid) >> ret = register_one_node(nid); >> BUG_ON(ret); >> >> +if (pgdat->node_zonelists->_zonerefs->zone == NULL) { >> +mutex_lock(_mutex); >> +build_all_zonelists(NULL, NULL); >> +
Re: [PATCH] cpu/mem hotplug: Add try_online_node() for cpu_up()
(2013/09/10 9:24), Toshi Kani wrote: > cpu_up() has #ifdef CONFIG_MEMORY_HOTPLUG code blocks, which > call mem_online_node() to put its node online if offlined and > then call build_all_zonelists() to initialize the zone list. > These steps are specific to memory hotplug, and should be > managed in mm/memory_hotplug.c. lock_memory_hotplug() should > also be held for the whole steps. > > For this reason, this patch replaces mem_online_node() with > try_online_node(), which performs the whole steps with > lock_memory_hotplug() held. try_online_node() is named after > try_offline_node() as they have similar purpose. > > There is no functional change in this patch. > > Signed-off-by: Toshi Kani > --- > include/linux/memory_hotplug.h |8 +++- > kernel/cpu.c | 29 +++-- > mm/memory_hotplug.c| 15 +-- > 3 files changed, 23 insertions(+), 29 deletions(-) > > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h > index dd38e62..22203c2 100644 > --- a/include/linux/memory_hotplug.h > +++ b/include/linux/memory_hotplug.h > @@ -94,6 +94,8 @@ extern void __online_page_set_limits(struct page *page); > extern void __online_page_increment_counters(struct page *page); > extern void __online_page_free(struct page *page); > > +extern int try_online_node(int nid); > + > #ifdef CONFIG_MEMORY_HOTREMOVE > extern bool is_pageblock_removable_nolock(struct page *page); > extern int arch_remove_memory(u64 start, u64 size); > @@ -225,6 +227,11 @@ static inline void > register_page_bootmem_info_node(struct pglist_data *pgdat) > { > } > > +static inline int try_online_node(int nid) > +{ > + return 0; > +} > + > static inline void lock_memory_hotplug(void) {} > static inline void unlock_memory_hotplug(void) {} > > @@ -256,7 +263,6 @@ static inline void remove_memory(int nid, u64 start, u64 > size) {} > > extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn, > void *arg, int (*func)(struct memory_block *, void *)); > -extern int mem_online_node(int 
nid); > extern int add_memory(int nid, u64 start, u64 size); > extern int arch_add_memory(int nid, u64 start, u64 size); > extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); > diff --git a/kernel/cpu.c b/kernel/cpu.c > index d7f07a2..c10b285 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -420,11 +420,6 @@ int cpu_up(unsigned int cpu) > { > int err = 0; > > -#ifdef CONFIG_MEMORY_HOTPLUG > - int nid; > - pg_data_t *pgdat; > -#endif > - > if (!cpu_possible(cpu)) { > printk(KERN_ERR "can't online cpu %d because it is not " > "configured as may-hotadd at boot time\n", cpu); > @@ -435,27 +430,9 @@ int cpu_up(unsigned int cpu) > return -EINVAL; > } > > -#ifdef CONFIG_MEMORY_HOTPLUG > - nid = cpu_to_node(cpu); > - if (!node_online(nid)) { > - err = mem_online_node(nid); > - if (err) > - return err; > - } > - > - pgdat = NODE_DATA(nid); > - if (!pgdat) { > - printk(KERN_ERR > - "Can't online cpu %d due to NULL pgdat\n", cpu); Please move this comments into try_online_node() too. Thanks, Yasuaki Ishimatsu > - return -ENOMEM; > - } > - > - if (pgdat->node_zonelists->_zonerefs->zone == NULL) { > - mutex_lock(_mutex); > - build_all_zonelists(NULL, NULL); > - mutex_unlock(_mutex); > - } > -#endif > + err = try_online_node(cpu_to_node(cpu)); > + if (err) > + return err; > > cpu_maps_update_begin(); > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index ed85fe3..c326bdf 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1044,14 +1044,19 @@ static void rollback_node_hotadd(int nid, pg_data_t > *pgdat) > } > > > -/* > +/** > + * try_online_node - online a node if offlined > + * >* called by cpu_up() to online a node without onlined memory. 
>*/ > -int mem_online_node(int nid) > +int try_online_node(int nid) > { > pg_data_t *pgdat; > int ret; > > + if (node_online(nid)) > + return 0; > + > lock_memory_hotplug(); > pgdat = hotadd_new_pgdat(nid, 0); > if (!pgdat) { > @@ -1062,6 +1067,12 @@ int mem_online_node(int nid) > ret = register_one_node(nid); > BUG_ON(ret); > > + if (pgdat->node_zonelists->_zonerefs->zone == NULL) { > + mutex_lock(_mutex); > + build_all_zonelists(NULL, NULL); > + mutex_unlock(_mutex); > + } > + > out: > unlock_memory_hotplug(); > return ret; > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majord...@kvack.org. For more info on Linux MM, > see:
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On 09/09/2013 07:44 PM, Igor Gnatenko wrote:
> On Mon, 2013-09-09 at 16:42 +0800, Aaron Lu wrote:
>> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
>> index f466980..75fba17 100644
>> --- a/drivers/gpu/drm/i915/i915_dma.c
>> +++ b/drivers/gpu/drm/i915/i915_dma.c
>> @@ -1650,7 +1650,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>>  	if (INTEL_INFO(dev)->num_pipes) {
>>  		/* Must be done after probing outputs */
>>  		intel_opregion_init(dev);
>> -		acpi_video_register();
>> +		__acpi_video_register(i915_take_over_backlight);
>>  	}
>>
>>  	if (IS_GEN5(dev))
>
> I can't compile:
>
> DEBUG: drivers/gpu/drm/i915/i915_dma.c: In function 'i915_driver_load':
> DEBUG: drivers/gpu/drm/i915/i915_dma.c:1661:3: error: implicit declaration of function '__acpi_video_register' [-Werror=implicit-function-declaration]
> DEBUG:    __acpi_video_register(i915_take_over_backlight);
> DEBUG:    ^
> DEBUG: cc1: some warnings being treated as errors
> DEBUG: make[4]: *** [drivers/gpu/drm/i915/i915_dma.o] Error 1
> DEBUG: make[3]: *** [drivers/gpu/drm/i915] Error 2
> DEBUG: make[2]: *** [drivers/gpu/drm] Error 2
> DEBUG: make[1]: *** [drivers/gpu] Error 2
> DEBUG: make: *** [drivers] Error 2
>

The two patches are based on top of Rafael's linux-next tree. I just tried it again and it compiles without problems for me.

I also tried today's Linus master tree. Because of some i915 updates there, two conflicts exist. I've resolved them and will include the result in the next revision. If you want to try it now, please use:

https://github.com/aaronlu/linux acpi_video_rework

Thanks,
Aaron
Re: [PATCH v3 1/2] hwmon: (lm90) Add power control
On 09/09/2013 11:50 PM, Guenter Roeck wrote:
> On Mon, Sep 09, 2013 at 02:50:22PM +0100, Mark Brown wrote:
>> On Mon, Sep 09, 2013 at 04:34:43AM -0700, Guenter Roeck wrote:
>>> On 09/09/2013 04:12 AM, Mark Brown wrote:
>>>> On Mon, Sep 09, 2013 at 06:29:11PM +0800, Wei Ni wrote:
>>
>>>> This doesn't look good, it is going to ignore actual errors - I *really*
>>>> doubt that vcc is optional, it looks like it's the main power supply for
>>>> the device. You should use normal regulator_get(), _optional() is for
>>>> supplies which could physically not be provided in a system (eg, if the
>>>> device can generate them internally if required).
>>
>>> Then he'll have to make sure that all devicetree files in the system
>>> contain references to this regulator.
>>
>> Or get the patches applied on top of the code that'll be going in this
>> cycle implementing get_optional() properly - when that's done the
>> default will be to provide a dummy supply for regulator_get(). If you
>> ack the patch I'd be happy to carry it.
>>
> Jean will have to ack it.
>

I think it's better to use get_optional() and ignore all errors except -EPROBE_DEFER. Many platforms keep this device powered at all times and will not provide a regulator for it. If we returned every error from the regulator subsystem directly, probe() would fail on those platforms and the driver would not work, even though it can work without regulator support.

Mark, do you mean you have patches for regulator_get_optional() and regulator_get()?

Wei.
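The error policy Wei proposes above (treat the supply as absent on any lookup error except -EPROBE_DEFER) can be sketched as a userspace model. This is not driver code: the demo_* helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention, the lookup result is passed in rather than obtained from the regulator API, and 517 is assumed to be the kernel-internal value of EPROBE_DEFER. Note that Mark Brown's objection in the thread is precisely that this policy hides real errors.

```c
#include <assert.h>
#include <errno.h>   /* ENODEV and friends for callers of this sketch */
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the kernel-internal EPROBE_DEFER errno. */
#define DEMO_EPROBE_DEFER 517
/* Largest errno that can be encoded in a pointer in this model. */
#define DEMO_MAX_ERRNO    4095

/* Hypothetical stand-in for struct regulator. */
struct demo_regulator {
	int enabled;
};

/* Userspace imitation of ERR_PTR(): encode a negative errno as a pointer. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Userspace imitation of PTR_ERR(): recover the errno from the pointer. */
static long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Userspace imitation of IS_ERR(): true for the top 4095 addresses. */
static int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/*
 * Decide what probe() should do with the result of an optional
 * supply lookup:
 *   - valid pointer: use the regulator, return 0
 *   - -EPROBE_DEFER: propagate it so probing is retried later
 *   - any other error: continue without a regulator, return 0
 */
static int demo_resolve_optional_supply(void *lookup_result,
					struct demo_regulator **out)
{
	assert(out != NULL);
	*out = NULL;

	if (!demo_is_err(lookup_result)) {
		*out = lookup_result;
		return 0;
	}

	if (demo_ptr_err(lookup_result) == -DEMO_EPROBE_DEFER)
		return -DEMO_EPROBE_DEFER;

	return 0; /* supply treated as absent; the error is swallowed */
}
```

The last branch is the contested part: a genuine failure such as -ENODEV is indistinguishable from "no regulator on this board", so the device is assumed to be always powered.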
[PATCH 2/2] PCI/ACPI: Convert ACPI PCI Hot Plug core debug function to dynamic debug function
From: Lan Tianyu This patch is to use pr_debug/info/warn/err to replace acpiphp debug functions and remove module's debug param. Signed-off-by: Lan Tianyu --- drivers/pci/hotplug/acpiphp.h | 10 -- drivers/pci/hotplug/acpiphp_core.c | 35 +-- drivers/pci/hotplug/acpiphp_glue.c | 23 --- 3 files changed, 29 insertions(+), 39 deletions(-) diff --git a/drivers/pci/hotplug/acpiphp.h b/drivers/pci/hotplug/acpiphp.h index f4e0289..26100f5 100644 --- a/drivers/pci/hotplug/acpiphp.h +++ b/drivers/pci/hotplug/acpiphp.h @@ -39,16 +39,6 @@ #include #include -#define dbg(format, arg...)\ - do {\ - if (acpiphp_debug) \ - printk(KERN_DEBUG "%s: " format,\ - MY_NAME , ## arg); \ - } while (0) -#define err(format, arg...) printk(KERN_ERR "%s: " format, MY_NAME , ## arg) -#define info(format, arg...) printk(KERN_INFO "%s: " format, MY_NAME , ## arg) -#define warn(format, arg...) printk(KERN_WARNING "%s: " format, MY_NAME , ## arg) - struct acpiphp_context; struct acpiphp_bridge; struct acpiphp_slot; diff --git a/drivers/pci/hotplug/acpiphp_core.c b/drivers/pci/hotplug/acpiphp_core.c index bf2203e..a56ae79 100644 --- a/drivers/pci/hotplug/acpiphp_core.c +++ b/drivers/pci/hotplug/acpiphp_core.c @@ -31,6 +31,8 @@ * */ +#define pr_fmt(fmt) "acpiphp: " fmt + #include #include #include @@ -43,12 +45,9 @@ #include #include "acpiphp.h" -#define MY_NAME"acpiphp" - /* name size which is used for entries in pcihpfs */ #define SLOT_NAME_SIZE 21 /* {_SUN} */ -bool acpiphp_debug; bool acpiphp_disabled; /* local variables */ @@ -61,9 +60,7 @@ static struct acpiphp_attention_info *attention_info; MODULE_AUTHOR(DRIVER_AUTHOR); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_LICENSE("GPL"); -MODULE_PARM_DESC(debug, "Debugging mode enabled or not"); MODULE_PARM_DESC(disable, "disable acpiphp driver"); -module_param_named(debug, acpiphp_debug, bool, 0644); module_param_named(disable, acpiphp_disabled, bool, 0444); /* export the attention callback registration methods */ @@ -139,7 +136,7 @@ static int 
enable_slot(struct hotplug_slot *hotplug_slot) { struct slot *slot = hotplug_slot->private; - dbg("%s - physical_slot = %s\n", __func__, slot_name(slot)); + pr_debug("%s - physical_slot = %s\n", __func__, slot_name(slot)); /* enable the specified slot */ return acpiphp_enable_slot(slot->acpi_slot); @@ -156,7 +153,7 @@ static int disable_slot(struct hotplug_slot *hotplug_slot) { struct slot *slot = hotplug_slot->private; - dbg("%s - physical_slot = %s\n", __func__, slot_name(slot)); + pr_debug("%s - physical_slot = %s\n", __func__, slot_name(slot)); /* disable the specified slot */ return acpiphp_disable_and_eject_slot(slot->acpi_slot); @@ -176,7 +173,8 @@ static int disable_slot(struct hotplug_slot *hotplug_slot) { int retval = -ENODEV; - dbg("%s - physical_slot = %s\n", __func__, hotplug_slot_name(hotplug_slot)); + pr_debug("%s - physical_slot = %s\n", __func__, + hotplug_slot_name(hotplug_slot)); if (attention_info && try_module_get(attention_info->owner)) { retval = attention_info->set_attn(hotplug_slot, status); @@ -199,7 +197,7 @@ static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value) { struct slot *slot = hotplug_slot->private; - dbg("%s - physical_slot = %s\n", __func__, slot_name(slot)); + pr_debug("%s - physical_slot = %s\n", __func__, slot_name(slot)); *value = acpiphp_get_power_status(slot->acpi_slot); @@ -221,7 +219,8 @@ static int get_attention_status(struct hotplug_slot *hotplug_slot, u8 *value) { int retval = -EINVAL; - dbg("%s - physical_slot = %s\n", __func__, hotplug_slot_name(hotplug_slot)); + pr_debug("%s - physical_slot = %s\n", __func__, + hotplug_slot_name(hotplug_slot)); if (attention_info && try_module_get(attention_info->owner)) { retval = attention_info->get_attn(hotplug_slot, value); @@ -244,7 +243,7 @@ static int get_latch_status(struct hotplug_slot *hotplug_slot, u8 *value) { struct slot *slot = hotplug_slot->private; - dbg("%s - physical_slot = %s\n", __func__, slot_name(slot)); + pr_debug("%s - physical_slot = 
%s\n", __func__, slot_name(slot)); *value = acpiphp_get_latch_status(slot->acpi_slot); @@ -264,7 +263,7 @@ static int get_adapter_status(struct hotplug_slot *hotplug_slot, u8 *value) { struct slot *slot = hotplug_slot->private; - dbg("%s - physical_slot = %s\n", __func__, slot_name(slot)); + pr_debug("%s - physical_slot = %s\n", __func__,
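For readers unfamiliar with the pr_fmt() convention the patch above relies on, here is a minimal userspace sketch of the macro mechanics. It is only an illustration, not kernel code: pr_to_buf() and log_enable_slot() are hypothetical names, and snprintf() stands in for printk() so that the formatted text can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's pr_fmt() hook: every message that
 * goes through the wrapper below gets this prefix pasted onto its format. */
#define pr_fmt(fmt) "acpiphp: " fmt

/* Hypothetical helper (not a kernel API): format into a caller-supplied
 * buffer instead of the kernel log. Needs at least one variadic argument. */
#define pr_to_buf(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), __VA_ARGS__)

/* Mimics the shape of the converted call site in enable_slot(). */
static int log_enable_slot(char *buf, size_t len, const char *slot)
{
	assert(buf != NULL && slot != NULL);
	return pr_to_buf(buf, len, "%s - physical_slot = %s\n",
			 __func__, slot);
}
```

Because the prefix is attached by string-literal concatenation at compile time, the call sites no longer need a MY_NAME argument, which is why the patch can drop that define along with the private dbg()/err() macros.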
[PATCH 1/2] PCI/ACPI: Convert ACPI PCI Hot Plug IBM Extension dbg/err() to pr_debug/pr_err()
From: Lan Tianyu This patch is to convert internal debug macros to dynamic debug function and remove module's debug param. Signed-off-by: Lan Tianyu --- drivers/pci/hotplug/acpiphp_ibm.c | 56 --- 1 file changed, 23 insertions(+), 33 deletions(-) diff --git a/drivers/pci/hotplug/acpiphp_ibm.c b/drivers/pci/hotplug/acpiphp_ibm.c index 2f5786c..32d1db6 100644 --- a/drivers/pci/hotplug/acpiphp_ibm.c +++ b/drivers/pci/hotplug/acpiphp_ibm.c @@ -25,6 +25,8 @@ * */ +#define pr_fmt(fmt) "acpiphp_ibm: " fmt + #include #include #include @@ -43,23 +45,11 @@ #define DRIVER_AUTHOR "Irene Zubarev , Vernon Mauery " #define DRIVER_DESC"ACPI Hot Plug PCI Controller Driver IBM extension" -static bool debug; MODULE_AUTHOR(DRIVER_AUTHOR); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_LICENSE("GPL"); MODULE_VERSION(DRIVER_VERSION); -module_param(debug, bool, 0644); -MODULE_PARM_DESC(debug, " Debugging mode enabled or not"); -#define MY_NAME "acpiphp_ibm" - -#undef dbg -#define dbg(format, arg...)\ -do { \ - if (debug) \ - printk(KERN_DEBUG "%s: " format,\ - MY_NAME , ## arg); \ -} while (0) #define FOUND_APCI 0x61504349 /* these are the names for the IBM ACPI pseudo-device */ @@ -189,7 +179,7 @@ static int ibm_set_attention_status(struct hotplug_slot *slot, u8 status) ibm_slot = ibm_slot_from_id(hpslot_to_sun(slot)); - dbg("%s: set slot %d (%d) attention status to %d\n", __func__, + pr_debug("%s: set slot %d (%d) attention status to %d\n", __func__, ibm_slot->slot.slot_num, ibm_slot->slot.slot_id, (status ? 
1 : 0)); @@ -202,10 +192,10 @@ static int ibm_set_attention_status(struct hotplug_slot *slot, u8 status) stat = acpi_evaluate_integer(ibm_acpi_handle, "APLS", , ); if (ACPI_FAILURE(stat)) { - err("APLS evaluation failed: 0x%08x\n", stat); + pr_err("APLS evaluation failed: 0x%08x\n", stat); return -ENODEV; } else if (!rc) { - err("APLS method failed: 0x%08llx\n", rc); + pr_err("APLS method failed: 0x%08llx\n", rc); return -ERANGE; } return 0; @@ -234,7 +224,7 @@ static int ibm_get_attention_status(struct hotplug_slot *slot, u8 *status) else *status = 0; - dbg("%s: get slot %d (%d) attention status is %d\n", __func__, + pr_debug("%s: get slot %d (%d) attention status is %d\n", __func__, ibm_slot->slot.slot_num, ibm_slot->slot.slot_id, *status); @@ -266,10 +256,10 @@ static void ibm_handle_events(acpi_handle handle, u32 event, void *context) u8 subevent = event & 0xf0; struct notification *note = context; - dbg("%s: Received notification %02x\n", __func__, event); + pr_debug("%s: Received notification %02x\n", __func__, event); if (subevent == 0x80) { - dbg("%s: generationg bus event\n", __func__); + pr_debug("%s: generationg bus event\n", __func__); acpi_bus_generate_netlink_event(note->device->pnp.device_class, dev_name(>device->dev), note->event, detail); @@ -301,7 +291,7 @@ static int ibm_get_table_from_acpi(char **bufp) status = acpi_evaluate_object(ibm_acpi_handle, "APCI", NULL, ); if (ACPI_FAILURE(status)) { - err("%s: APCI evaluation failed\n", __func__); + pr_err("%s: APCI evaluation failed\n", __func__); return -ENODEV; } @@ -309,13 +299,13 @@ static int ibm_get_table_from_acpi(char **bufp) if (!(package) || (package->type != ACPI_TYPE_PACKAGE) || !(package->package.elements)) { - err("%s: Invalid APCI object\n", __func__); + pr_err("%s: Invalid APCI object\n", __func__); goto read_table_done; } for(size = 0, i = 0; i < package->package.count; i++) { if (package->package.elements[i].type != ACPI_TYPE_BUFFER) { - err("%s: Invalid APCI element %d\n", __func__, 
i); + pr_err("%s: Invalid APCI element %d\n", __func__, i); goto read_table_done; } size += package->package.elements[i].buffer.length; @@ -325,7 +315,7 @@ static int ibm_get_table_from_acpi(char **bufp) goto read_table_done; lbuf = kzalloc(size, GFP_KERNEL); - dbg("%s: element count: %i, ASL table size: %i, = 0x%p\n", + pr_debug("%s: element count: %i, ASL table size: %i, = 0x%p\n",
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
Linus Torvalds wrote: > It doesn't need to. The RCU lookup looks at individual dentry sequence > numbers and doesn't care about the bigger rename sequence number at > all. Right; it's sequential. > The fallback (if you hit one of the very very rare races, or if you > hit a symlink) ends up doing per-path-component lookups under the > rename sequence lock, but for it, read-locking it until it succeeds is > the right thing to do. No, it's write-locking. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/12] One more attempt at useful kernel lockdown
On Tue, 10 Sep 2013, Matthew Garrett wrote:

> On Mon, 2013-09-09 at 19:44 -0700, David Lang wrote:
>> On Tue, 10 Sep 2013, Matthew Garrett wrote:
>>> No. Say someone adds an additional lockdown bit to forbid raw access to
>>> mounted block devices. The "Turn everything off" approach now means that
>>> I won't be able to perform raw access to mounted block devices, even if
>>> that's something that my use case relies on.
>>
>> I was meaning that if you only turn off features that you know about, the
>> addition of a new thing that can be disabled doesn't make you any worse
>> off than you were.
>
> Someone adds a new "install_evil()" syscall and adds a disable bit. If I
> don't disable it, I'm now vulnerable. Please pay attention to earlier
> discussion.

so instead they add install_evil() and don't have it be disabled by your big switch.

or do you think the existence of this switch will give you veto power over any new system calls unless they include the ability for them to be disabled if they don't match your security model.

>>>> so if you only have a single bit, how do you deal with the case where
>>>> that bit locks down something that's required? (your reason for not
>>>> just setting all bits in the first approach)
>>>
>>> Because that bit is well-defined, and if anything is added to it that
>>> doesn't match that definition then it's a bug.
>>
>> it may be well defined, but that doesn't mean that it actually matches
>> what the system owner wants to do.
>
> If it doesn't match what the system owner wants to do, the system owner
> doesn't set it. The system owner uses a more appropriate security
> mechanism instead.

SELinux is a wonderful example of how making a system that users cannot change doesn't improve security, it just gets disabled instead.

>> The idea that the programmer can possibly anticipate all possible needs
>> and provide a switch for exactly that need is just wrong. Users will have
>> needs that you never thought of. The best systems are the ones where the
>> creators look at what users are doing and react with "I never imagined
>> doing that"
>
> Describe the security case for disabling PCI BAR access but permitting
> i/o port access.

Simple, I have some device that lives on those I/O ports that I want to be able to use.

>>> Anything more granular means that you trust your userspace, and if you
>>> trust your userspace then you can already set up a granular policy using
>>> the existing tools for that job. So just use the existing tools.
>>
>> If you can't trust your userspace, how do you know that the userspace has
>> set the big hammer flag in the first place? if you can trust it to throw
>> that switch, you can trust it to throw multiple smaller switches.
>
> Hence the final patch in the series, and hence also the suggestion for
> exposing it as a command line option that can be set by the bootloader
> during an attested boot.

>>>> And if SELinux can do the job, what is the reason for creating this new
>>>> option?
>>>
>>> Because you can't embed an unmodifiable selinux policy in the kernel.
>>
>> Why not, have the policy set in an initramfs that's part of the kernel
>> and have part of that policy be to block all access to the selinux
>> controls.
>
> Because then someone disables selinux on the kernel command line.

If they can modify the command line, they can remove your command line switch to turn this on.

If you really care about this, why are you using a bootloader that lets you modify the kernel command line in the first place?

In any case, even if you make it impossible to change from the command line, you won't prevent people from changing your system (unless you take control completely away from them with TPM or similar)

remember that the system integrity checking of the original Tivo was defeated by someone doing a binary patch to bypass a routine that they didn't understand that took a lot of time, so they did a binary patch to the bios to speed up boot and discovered that what they did was disable the system integrity checking.

users who own the systems are going to modify them and bypass any restrictions you want to impose.

users who don't own the systems can be defeated by simply not offering them the option to change the settings in the first place.

David Lang
Re: [PATCH 00/12] One more attempt at useful kernel lockdown
On Mon, 2013-09-09 at 19:44 -0700, David Lang wrote: > On Tue, 10 Sep 2013, Matthew Garrett wrote: > > No. Say someone adds an additional lockdown bit to forbid raw access to > > mounted block devices. The "Turn everything off" approach now means that > > I won't be able to perform raw access to mounted block devices, even if > > that's something that my use case relies on. > > I was meaning that if you only turn off features that you know about, the > addition of a new thing that can be disabled doesn't make you any worse off > than > you were. Someone adds a new "install_evil()" syscall and adds a disable bit. If I don't disable it, I'm now vulnerable. Please pay attention to earlier discussion. > >> so if you only have a single bit, how do you deal with the case where that > >> bit > >> locks down something that's required? (your reason for not just setting > >> all bits > >> in the first approach) > > > > Because that bit is well-defined, and if anything is added to it that > > doesn't match that definition then it's a bug. > > it may be well defined, but that doesn't mean that it actually matches what > the > system owner wants to do. If it doesn't match what the system owner wants to do, the system owner doesn't set it. The system owner uses a more appropriate security mechanism instead. > The idea that the programmer can possibly anticipate all possible needs and > provide a switch for exactly that need is just wrong. Users will have needs > that > you never thought of. The best systems are the ones where the creators look > at > what users are doing and react with "I never imagined doing that" Describe the security case for disabling PCI BAR access but permitting i/o port access. > > Anything more granular means that you trust your userspace, and if you > > trust your userspace then you can already set up a granular policy using > > the existing tools for that job. So just use the existing tools. 
> > If you can't trust your userspace, how do you know that the userspace has set > the big hammer flag in the first place? if you can trust it to throw that > switch, you can trust it to throw multiple smaller switches. Hence the final patch in the series, and hence also the suggestion for exposing it as a command line option that can be set by the bootloader during an attested boot. > >> And if SELinux can do the job, what is the reason for creating this new > >> option? > > > > Because you can't embed an unmodifiable selinux policy in the kernel. > > Why not, have the policy set in an initramfs that's part of the kernel and > have > part of that policy be to block all access to the selinux controls. Because then someone disables selinux on the kernel command line. -- Matthew Garrett
Re: Question regarding list_for_each_entry_safe usage in move_one_task
On Mon, Sep 9, 2013 at 7:15 PM, Peter Zijlstra wrote:
> On Mon, Sep 02, 2013 at 02:26:45PM +0800, Lei Wen wrote:
>> Hi Peter,
>>
>> I find one list API usage may not be correct in current fair.c code.
>> In move_one_task function, it may iterate through whole cfs_tasks
>> list to get one task to move.
>>
>> But in dequeue_task(), it would delete one task node from list
>> without the lock protection. So that we could see from
>> list_for_each_entry_safe API definition:
>
> Both sites hold the required rq->lock.

I see, sorry for the noise...

Thanks,
Lei
Re: [PATCH 00/12] One more attempt at useful kernel lockdown
On Tue, 10 Sep 2013, Matthew Garrett wrote: On Mon, 2013-09-09 at 16:19 -0700, David Lang wrote: On Mon, 9 Sep 2013, Matthew Garrett wrote: Having thought about this, the answer is no. It presents exactly the same problem as capabilities do - the set can never be meaningfully extended. If an application sets only the bits it knows about, and if a new security-sensitive feature is added to the kernel, the feature will be left enabled and the system will be insecure. Alternatively, if an application sets all the bits regardless of whether it knows them or not, it may enable a lockdown feature that it otherwise required. In this case you are no less secure than you were before the feature was added, you just can't take advantage of the new feature without updating userspace. No. Say someone adds an additional lockdown bit to forbid raw access to mounted block devices. The "Turn everything off" approach now means that I won't be able to perform raw access to mounted block devices, even if that's something that my use case relies on. I was meaning that if you only turn off features that you know about, the addition of a new thing that can be disabled doesn't make you any worse off than you were. The only way this is useful is if all the bits are semantically equivalent, and in that case there's no point in having anything other than a single bit. Users who want a more fine-grained interface should use one of the existing mechanisms for doing so - leave the kernel open and impose the security policy from userspace using either capabilities or selinux. so if you only have a single bit, how do you deal with the case where that bit locks down something that's required? (your reason for not just setting all bits in the first approach) Because that bit is well-defined, and if anything is added to it that doesn't match that definition then it's a bug. it may be well defined, but that doesn't mean that it actually matches what the system owner wants to do. 
The idea that the programmer can possibly anticipate all possible needs and provide a switch for exactly that need is just wrong. Users will have needs that you never thought of. The best systems are the ones where the creators look at what users are doing and react with "I never imagined doing that" defining the "one true way" of operating is just wrong. your arguments don't seem self consistent. You don't seem to have been paying attention to the past 12 months of discussion. If I'm building a kiosk PC (or voting machine), I want to disable a lot of things that I could not get away with disabling on a generic laptop. Are we going to have Securelevel, ReallySecurelevel, ReallyReallySecurelevel, etc? or can we accept that security is not binary and allow users to disable features in a more granular way? Anything more granular means that you trust your userspace, and if you trust your userspace then you can already set up a granular policy using the existing tools for that job. So just use the existing tools. If you can't trust your userspace, how do you know that the userspace has set the big hammer flag in the first place? if you can trust it to throw that switch, you can trust it to throw multiple smaller switches. And if SELinux can do the job, what is the reason for creating this new option? Because you can't embed an unmodifiable selinux policy in the kernel. Why not, have the policy set in an initramfs that's part of the kernel and have part of that policy be to block all access to the selinux controls. David Lang
Re: [PATCH v4 12/19] cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
On Mon, Sep 09, 2013 at 04:24:18PM +0100, Sudeep KarkadaNagesha wrote:
> Hi Shawn,
>
> Ok. But I am bit suspicious about devm_clk_get(cpu_dev, NULL).
> I don't understand completely as how the clock are registered(whether
> with dev_id or with connection_id).

As the connection_id of devm_clk_get() call here is NULL, the clock lookup should be registered with a proper dev_id in clk_register_clkdev() call. And that's what you have seen with imx and shmobile code.

> A quick grep revealed that i.mx and shmobile is using connection id while
> registering.

They are using dev_id.

> If the clock is registered with connection id and retrieved
> with cpu_dev(now dev_id is cpu0 and not cpufreq-cpu0), IIUC that would
> break. If we pass pdev->dev for clk_get, it should be fine but again
> IIUC it breaks highbank which gets all the information from DT.

If the clock lookup is from DT, we should be just fine, since it will work as long as the DT node with 'clocks' property (/cpus/cpu@0 in this case) is attached to the struct device pointer of devm_clk_get() call.

> So only solution I can think of is to continue to have the code
> assigning (>dev)->of_node with cpu device node which is not clean
> and arguable as incorrect since there is no DT node for cpufreq-cpu0.
> I don't have a strong opinion though.
>
> Let me know how would you like to fix this.

So we only need to change all clkdev registration to use "cpu0" as dev_id instead of "cpufreq-cpu0.0", something like below. And for imx, it should work even without the changes, because we have device tree lookup ready there, and those clk_register_clkdev() calls can just be removed now. But I prefer to include the change and leave the cleanup to another patch for keeping the change log clear.
Shawn ---8<-- diff --git a/arch/arm/mach-imx/clk-imx27.c b/arch/arm/mach-imx/clk-imx27.c index c3cfa41..c6b40f3 100644 --- a/arch/arm/mach-imx/clk-imx27.c +++ b/arch/arm/mach-imx/clk-imx27.c @@ -285,7 +285,7 @@ int __init mx27_clocks_init(unsigned long fref) clk_register_clkdev(clk[ata_ahb_gate], "ata", NULL); clk_register_clkdev(clk[rtc_ipg_gate], NULL, "imx21-rtc"); clk_register_clkdev(clk[scc_ipg_gate], "scc", NULL); - clk_register_clkdev(clk[cpu_div], NULL, "cpufreq-cpu0.0"); + clk_register_clkdev(clk[cpu_div], NULL, "cpu0"); clk_register_clkdev(clk[emi_ahb_gate], "emi_ahb" , NULL); mxc_timer_init(MX27_IO_ADDRESS(MX27_GPT1_BASE_ADDR), MX27_INT_GPT1); diff --git a/arch/arm/mach-imx/clk-imx51-imx53.c b/arch/arm/mach-imx/clk-imx51-imx53.c index 1a56a33..de1964c 100644 --- a/arch/arm/mach-imx/clk-imx51-imx53.c +++ b/arch/arm/mach-imx/clk-imx51-imx53.c @@ -328,7 +328,7 @@ static void __init mx5_clocks_common_init(unsigned long rate_ckil, clk_register_clkdev(clk[ssi2_ipg_gate], NULL, "imx-ssi.1"); clk_register_clkdev(clk[ssi3_ipg_gate], NULL, "imx-ssi.2"); clk_register_clkdev(clk[sdma_gate], NULL, "imx35-sdma"); - clk_register_clkdev(clk[cpu_podf], NULL, "cpufreq-cpu0.0"); + clk_register_clkdev(clk[cpu_podf], NULL, "cpu0"); clk_register_clkdev(clk[iim_gate], "iim", NULL); clk_register_clkdev(clk[dummy], NULL, "imx2-wdt.0"); clk_register_clkdev(clk[dummy], NULL, "imx2-wdt.1"); diff --git a/arch/arm/mach-shmobile/clock-r8a73a4.c b/arch/arm/mach-shmobile/clock-r8a73a4.c index 8ea5ef6..5bd2e85 100644 --- a/arch/arm/mach-shmobile/clock-r8a73a4.c +++ b/arch/arm/mach-shmobile/clock-r8a73a4.c @@ -555,7 +555,7 @@ static struct clk_lookup lookups[] = { CLKDEV_CON_ID("pll2h", _clk), /* CPU clock */ - CLKDEV_DEV_ID("cpufreq-cpu0", _clk), + CLKDEV_DEV_ID("cpu0", _clk), /* DIV6 */ CLKDEV_CON_ID("zb", _clks[DIV6_ZB]), diff --git a/arch/arm/mach-shmobile/clock-sh73a0.c b/arch/arm/mach-shmobile/clock-sh73a0.c index 1942eae..c92c023 100644 --- a/arch/arm/mach-shmobile/clock-sh73a0.c 
+++ b/arch/arm/mach-shmobile/clock-sh73a0.c @@ -616,7 +616,7 @@ static struct clk_lookup lookups[] = { CLKDEV_DEV_ID("smp_twd", _clk), /* smp_twd */ /* DIV4 clocks */ - CLKDEV_DEV_ID("cpufreq-cpu0", _clks[DIV4_Z]), + CLKDEV_DEV_ID("cpu0", _clks[DIV4_Z]), /* DIV6 clocks */ CLKDEV_CON_ID("vck1_clk", _clks[DIV6_VCK1]),
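To make the dev_id/con_id discussion in this mail concrete, here is a small userspace sketch of table-based clock lookup matching. It is an assumption-laden simplification modelled on how clkdev matching is described in the thread, not the kernel implementation: struct clk_lookup_entry, find_clk() and the integer clk_id are hypothetical names introduced for illustration. A NULL field in a table entry acts as a wildcard, a dev_id match counts more than a con_id match, and an entry whose non-NULL field disagrees with the request is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified lookup entry; NULL means "matches anything". */
struct clk_lookup_entry {
	const char *dev_id;
	const char *con_id;
	int clk_id;		/* stands in for a struct clk pointer */
};

/* Return the best-matching entry, or NULL if nothing matches. */
static const struct clk_lookup_entry *
find_clk(const struct clk_lookup_entry *tbl, size_t n,
	 const char *dev_id, const char *con_id)
{
	const struct clk_lookup_entry *best = NULL;
	int best_score = 0;

	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct clk_lookup_entry *p = &tbl[i];
		int score = 0;

		if (p->dev_id) {
			if (!dev_id || strcmp(p->dev_id, dev_id) != 0)
				continue;	/* registered for another device */
			score += 2;
		}
		if (p->con_id) {
			if (!con_id || strcmp(p->con_id, con_id) != 0)
				continue;	/* registered for another connection */
			score += 1;
		}
		if (score > best_score) {
			best = p;
			best_score = score;
		}
	}
	return best;
}
```

With a table that still registers the CPU clock under "cpufreq-cpu0.0", a request with dev_id "cpu0" and a NULL con_id finds nothing, which is the breakage Sudeep is worried about; after renaming the entry to "cpu0" the same request succeeds.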
RE: [PATCHv3 1/4] pwm: Add Freescale FTM PWM driver support
> Subject: Re: [PATCHv3 1/4] pwm: Add Freescale FTM PWM driver support
>
> On Mon, Sep 09, 2013 at 02:20:09PM +0200, Thierry Reding wrote:
> > On Fri, Sep 06, 2013 at 04:08:24PM +0800, Xiubo Li wrote:
> > > The FTM PWM device can be found on Vybrid VF610 Tower and Layerscape
> > > LS-1 SoCs.
> > >
> > > Signed-off-by: Xiubo Li
> > > Signed-off-by: Jingchang Lu
> > > ---
> > >  drivers/pwm/Kconfig       |  10 +
> > >  drivers/pwm/Makefile      |   1 +
> > >  drivers/pwm/pwm-fsl-ftm.c | 505 ++
> > >  3 files changed, 516 insertions(+)
> > >  create mode 100644 drivers/pwm/pwm-fsl-ftm.c
> >
> > This looks pretty good to me. I noticed that you didn't Cc Sascha who
> > commented on this a lot. Can you please resend with him Cc'ed to make
> > sure he sees this version of the series?
>
> I'm already aware of this, no need to resend. I'll have a look over it
> tomorrow.
>

I'm very sorry, next time I will. Thanks very much.

--
Best Regards,
Xiubo
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
On Mon, Sep 9, 2013 at 7:25 PM, Al Viro wrote:
>
> One name: Mark V. Shaney...

Heh, yes. I had ignored the earlier emails, and that last one looked more reasonable than the earlier ones ;)

Linus
Re: [PATCH v1 1/3] SMP: kill redundant call_function_data->cpumask_ipi field
On Sun, Sep 08, 2013 at 11:22:23PM +0800, Jiang Liu wrote:
> From: Jiang Liu
>
> Commit f44310b98ddb7 "smp: Fix SMP function call empty cpu mask race"
> introduced field call_function_data->cpumask_ipi to resolve a race
> condition in smp_call_function_many().
>
> Later commit 9a46ad6d6df3 "smp: make smp_call_function_many() use logic
> similar to smp_call_function_single()" fixed the same issue in another
> way when optimizing smp_call_function_many(), which then obsoletes
> changes introduced by commit f44310b98ddb7. So revert it.
>

Yes, you are right, after commit 9a46ad6d6df3, we can revert f44310b98ddb7. Maybe "Revert smp: Fix SMP function call empty cpu mask race" is a better subject.

> We may also keep call_function_data->cpumask_ipi field and use it to
> optimize smp_call_function_many() as below:
> diff --git a/kernel/smp.c b/kernel/smp.c
> index fe9f773..dd852fb 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -428,6 +428,8 @@ void smp_call_function_many(const struct cpumask *mask,
>  	csd->info = info;
>
>  	raw_spin_lock_irqsave(>lock, flags);
> +	if (list_empty(>list))
> +		cpumask_clear_cpu(cpu, cfd->cpumask_ipi);
>  	list_add_tail(>list, >list);
>  	raw_spin_unlock_irqrestore(>lock, flags);
>  }
>

Your optimization doesn't need to keep cpumask_ipi: just clear cfd->cpumask, and test whether cfd->cpumask is empty after "for_each_cpu(cpu, cfd->cpumask)".

Acked-by: Wang YanQing

Thanks.
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
On Mon, Sep 09, 2013 at 06:34:16PM -0700, Linus Torvalds wrote:
> On Mon, Sep 9, 2013 at 6:15 PM, Ramkumar Ramachandra wrote:
> >
> > Maybe it should then?
>
> It doesn't need to. The RCU lookup looks at individual dentry sequence
> numbers and doesn't care about the bigger rename sequence number at
> all.

One name: Mark V. Shaney...
Re: [PATCH V2 0/6] perf: New conditional branch filter
On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This patchset
> also enables SW based branch filtering support for PPC64 platforms which have
> branch stack sampling support. With this new enablement, the branch filter
> support for PPC64 platforms have been extended to include all these
> combinations discussed below with a sample test application program.

...

> Mixed filters
> -------------
> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> Error:
> The perf.data file has no samples!
>
> NOTE: As expected. The HW filters all the branches which are calls and SW
> tries to find return branches in that given set. Both the filters are
> mutually exclusive, so obviously no samples found in the end profile.

The semantics of multiple filters is not clear to me. It could be an OR, or an AND. You have implemented AND, does that match existing behaviour on x86 for example?

cheers
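The AND-versus-OR question raised here can be illustrated with a short sketch. This is not perf code: the enum values and the count_and()/count_or() helpers are hypothetical, and each sampled branch is assumed to carry exactly one type bit. Under AND semantics a branch must pass both the hardware and the software filter, so any_call combined with any_ret can never match; under OR semantics either filter is enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical branch classes; each sampled branch has exactly one. */
enum branch_type {
	BR_CALL   = 1u << 0,
	BR_RETURN = 1u << 1,
	BR_COND   = 1u << 2,
};

/* AND semantics: keep a branch only if it passes both filters. */
static size_t count_and(const unsigned int *types, size_t n,
			unsigned int hw_mask, unsigned int sw_mask)
{
	size_t kept = 0;

	assert(types != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if ((types[i] & hw_mask) && (types[i] & sw_mask))
			kept++;
	return kept;
}

/* OR semantics: keep a branch if it passes either filter. */
static size_t count_or(const unsigned int *types, size_t n,
		       unsigned int hw_mask, unsigned int sw_mask)
{
	size_t kept = 0;

	assert(types != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (types[i] & (hw_mask | sw_mask))
			kept++;
	return kept;
}
```

For a sample stream of two calls, one return and one conditional branch, count_and() with the call and return masks keeps nothing, which corresponds to the empty perf.data in case (6), while count_or() keeps the two calls and the return.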
Re: [PATCH] doc: fix some typos in documentation
On 09/05/2013 07:26:22 AM, Xishi Qiu wrote:
> Fix some typos in
> Documentation/IRQ-domain.txt/email-clients.txt/io-mapping.txt
>
> Signed-off-by: Xishi Qiu
> ---
>  Documentation/IRQ-domain.txt    | 4 ++--
>  Documentation/email-clients.txt | 2 +-
>  Documentation/io-mapping.txt    | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
> index 9bc9594..5a8b8ed 100644
> --- a/Documentation/IRQ-domain.txt
> +++ b/Documentation/IRQ-domain.txt
> @@ -97,7 +97,7 @@ hwirq number. The disadvantage is that hwirq to IRQ number lookup is dependent on how many entries are in the table.
>
>  Very few drivers should need this mapping. At the moment, powerpc
> -iseries is the only user.
> +series is the only user.

That one's not a typo, there actually was a powerpc iseries:

http://lxr.free-electrons.com/source/arch/powerpc/platforms/iseries/setup.c?v=2.6.35

It got removed in the past year or so (git bisect could find the commit), which implies that this documentation needs to be fixed if this removed architecture was the only user of whatever it's documenting. (Ping the powerpc guys?)

Rob
Re: [PATCH] workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
On Tue, Sep 10, 2013 at 09:52:35AM +0800, Libin wrote:
> From: Li Bin
>
> When one work starts execution, the high bits of work's data contain
> pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID
> is assigned WORK_OFFQ_POOL_NONE when the work is being initialized,
> indicating that no pool is associated, and get_work_pool() uses it to
> check the associated pool. So if worker_pool_assign_id() assigns an
> ID greater than or equal to WORK_OFFQ_POOL_NONE to a pool, it triggers
> leakage, and it may break the non-reentrance guarantee.
>
> This patch fixes this issue by modifying the worker_pool_assign_id()
> function so that it calls idr_alloc() with the @end param set to
> WORK_OFFQ_POOL_NONE.
>
> Furthermore, in the current implementation, the BUILD_BUG_ON() in
> init_workqueues makes no sense. The number of worker pools needed
> cannot be determined at compile time, because the number of backing
> pools for UNBOUND workqueues is dynamic based on the assigned custom
> attributes. So remove it.
>
> Signed-off-by: Li Bin

Applied to wq/for-3.12-fixes w/ minor updates.

Thanks.

--
tejun
linux-next: manual merge of the dmaengine tree with the slave-dma tree
Hi Dan,

Today's linux-next merge of the dmaengine tree got a conflict in include/linux/dmaengine.h between commit 7bb587f4eef8 ("dmaengine: add interface of dma_get_slave_channel") from the slave-dma tree and commit 4a43f394a082 ("dmaengine: dma_sync_wait and dma_find_channel undefined") from the dmaengine tree.

I fixed it up (see below) and can carry the fix as necessary (no action is required).

--
Cheers,
Stephen Rothwell s...@canb.auug.org.au

diff --cc include/linux/dmaengine.h
index 2601186,0c72b89..000
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@@ -1030,8 -1006,6 +1042,7 @@@ static inline void dma_release_channel(
  int dma_async_device_register(struct dma_device *device);
  void dma_async_device_unregister(struct dma_device *device);
  void dma_run_dependencies(struct dma_async_tx_descriptor *tx);
- struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type);
 +struct dma_chan *dma_get_slave_channel(struct dma_chan *chan);
  struct dma_chan *net_dma_find_channel(void);
  #define dma_request_channel(mask, x, y) __dma_request_channel(&(mask), x, y)
  #define dma_request_slave_channel_compat(mask, x, y, dev, name) \
[PATCH] workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
From: Li Bin

When one work starts execution, the high bits of work's data contain
pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID
is assigned WORK_OFFQ_POOL_NONE when the work is being initialized,
indicating that no pool is associated, and get_work_pool() uses it to
check the associated pool. So if worker_pool_assign_id() assigns an
ID greater than or equal to WORK_OFFQ_POOL_NONE to a pool, it triggers
leakage, and it may break the non-reentrance guarantee.

This patch fixes this issue by modifying the worker_pool_assign_id()
function to call idr_alloc() with the @end param set to
WORK_OFFQ_POOL_NONE.

Furthermore, in the current implementation, the BUILD_BUG_ON() in
init_workqueues makes no sense. The number of worker pools needed
cannot be determined at compile time, because the number of backing
pools for UNBOUND workqueues is dynamic based on the assigned custom
attributes. So remove it.

Signed-off-by: Li Bin
---
 kernel/workqueue.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 987293d..5b4c1bd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -518,14 +518,21 @@ static inline void debug_work_activate(struct work_struct *work) { }
 static inline void debug_work_deactivate(struct work_struct *work) { }
 #endif
 
-/* allocate ID and assign it to @pool */
+/**
+ * worker_pool_assign_id - allocate ID and assign it to @pool
+ * @pool: the pool pointer of interest
+ *
+ * Return 0 if ID assigned successful.
+ * Return non-zero if the allocation fails or the ID number out of
+ * [0, %WORK_OFFQ_POOL_NONE) range.
+ */
 static int worker_pool_assign_id(struct worker_pool *pool)
 {
 	int ret;
 
 	lockdep_assert_held(&wq_pool_mutex);
 
-	ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
+	ret = idr_alloc(&worker_pool_idr, pool, 0, WORK_OFFQ_POOL_NONE, GFP_KERNEL);
 	if (ret >= 0) {
 		pool->id = ret;
 		return 0;
@@ -5009,10 +5016,6 @@ static int __init init_workqueues(void)
 	int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
 	int i, cpu;
 
-	/* make sure we have enough bits for OFFQ pool ID */
-	BUILD_BUG_ON((1LU << (BITS_PER_LONG - WORK_OFFQ_POOL_SHIFT)) <
-		     WORK_CPU_END * NR_STD_WORKER_POOLS);
-
 	WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
 
 	pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
-- 
1.8.2.1
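The effect of passing an exclusive upper bound to the allocator can be sketched in userspace. This is only a model under stated assumptions: toy_id_alloc() is a hypothetical stand-in, not the kernel's idr_alloc(), and TOY_POOL_NONE is a small illustrative value rather than the real WORK_OFFQ_POOL_NONE. It mimics the convention that the end argument is exclusive and that end <= 0 means "no caller-imposed limit".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative only: the real WORK_OFFQ_POOL_NONE is derived from the
 * number of bits available in work->data, not a small constant. */
#define TOY_POOL_NONE 4
#define TOY_MAX_IDS   16

static bool toy_used[TOY_MAX_IDS];

/*
 * Hypothetical stand-in for idr_alloc(): hand out the lowest free ID in
 * [start, end). As with idr_alloc(), end <= 0 means the caller imposes
 * no upper limit (here the table size is the only limit).
 */
static int toy_id_alloc(int start, int end)
{
	int limit = (end <= 0 || end > TOY_MAX_IDS) ? TOY_MAX_IDS : end;

	assert(start >= 0);
	for (int id = start; id < limit; id++) {
		if (!toy_used[id]) {
			toy_used[id] = true;
			return id;
		}
	}
	return -ENOSPC;
}

/* Forget all allocations so an experiment starts from a clean table. */
static void toy_id_reset(void)
{
	for (int id = 0; id < TOY_MAX_IDS; id++)
		toy_used[id] = false;
}
```

With the bound in place, the allocator fails once IDs 0..TOY_POOL_NONE-1 are taken. Without it (end == 0), the next ID handed out equals TOY_POOL_NONE itself, which is the collision with the "no pool" marker that the patch description warns about.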
Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device
On 09/09/13 21:41, Christoph Hellwig wrote:
>> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U)
>
> Please reproduce without this weird crap loaded.

These modules are filesystem modules and will not impact enclosure.

Thanks,
Joe
Build failures in drivers/of/of_reserved_mem.c due to missing asm/dma-contiguous.h
drivers/of/of_reserved_mem.c:14:32: fatal error: asm/dma-contiguous.h: No such file or directory
 #include <asm/dma-contiguous.h>

Seen with
	arm64:defconfig
	mips:nlm_xlp_defconfig
	mips:cavium_octeon_defconfig

Guenter
Re: [RFC][PATCH v4 3/3] sched: Periodically decay max cost of idle balance
On Mon, 2013-09-09 at 14:07 -0700, Jason Low wrote: > On Mon, 2013-09-09 at 13:49 +0200, Peter Zijlstra wrote: > > On Wed, Sep 04, 2013 at 12:10:01AM -0700, Jason Low wrote: > > > On Fri, 2013-08-30 at 12:18 +0200, Peter Zijlstra wrote: > > > > On Thu, Aug 29, 2013 at 01:05:36PM -0700, Jason Low wrote: > > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > > > > index 58b0514..bba5a07 100644 > > > > > --- a/kernel/sched/core.c > > > > > +++ b/kernel/sched/core.c > > > > > @@ -1345,7 +1345,7 @@ ttwu_do_wakeup(struct rq *rq, struct > > > > > task_struct *p, int wake_flags) > > > > > > > > > > if (rq->idle_stamp) { > > > > > u64 delta = rq_clock(rq) - rq->idle_stamp; > > > > > - u64 max = 2*rq->max_idle_balance_cost; > > > > > + u64 max = 2*(sysctl_sched_migration_cost + > > > > > rq->max_idle_balance_cost); > > > > > > > > You re-introduce sched_migration_cost here because max_idle_balance_cost > > > > can now drop down to 0 again? > > > > > > Yes it was so that max_idle_balance_cost would be at least > > > sched_migration_cost > > > and that we would still skip idle_balance if avg_idle < > > > sched_migration_cost. > > > > > > I also initially thought that adding sched_migration_cost would also > > > account for > > > the extra "costs" of idle balancing that are not accounted for in the > > > time spent > > > on each newidle load balance. Come to think of it though, > > > sched_migration_cost > > > might be too large when used in that context considering we're already > > > using the > > > max cost. > > > > Right, so shall we do as Srikar suggests and drop that initial check? > > I agree that we can delete the check between avg_idle and > max_idle_balance_cost > so that large costs in higher domains don't cause balancing to be skipped in > lower domains as Srikar suggested. Should we keep the old > "if (this_rq->avg_idle < sysctl_sched_migration_cost)" check? 
It was put there to allow cross core scheduling to recover as much overlap as possible, so rapidly switching communicating tasks with only small recoverable overlap in the first place don't get pounded to pulp by overhead instead. If a different way does a better job, whack it. -Mike
[PATCH] powerpc: Export cpu_to_chip_id() to fix build error
powerpc allmodconfig build fails with:

ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined!

The problem was introduced with commit 15863ff3b (powerpc: Make chip-id
information available to userspace). Export the missing symbol.

Cc: Vasant Hegde
Cc: Shivaprasad G Bhat
Signed-off-by: Guenter Roeck
---
 arch/powerpc/kernel/smp.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 442d8e2..8e59abc 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -611,6 +611,7 @@ int cpu_to_chip_id(int cpu)
 	of_node_put(np);
 	return of_get_ibm_chip_id(np);
 }
+EXPORT_SYMBOL(cpu_to_chip_id);
 
 /* Helper routines for cpu to core mapping */
 int cpu_core_index_of_thread(int cpu)
-- 
1.7.9.7
Re: "cpufreq: fix serialization issues with freq change notifiers" breaks cpufreq too
On Tuesday, September 10, 2013 01:12:49 AM Rafael J. Wysocki wrote: > On Monday, September 09, 2013 11:42:41 PM Guennadi Liakhovetski wrote: > > Hi Rafael > > > > On Mon, 9 Sep 2013, Rafael J. Wysocki wrote: > > > > > Hi, > > > > > > On Monday, September 09, 2013 05:11:10 PM Guennadi Liakhovetski wrote: > > > > Sorry guys, I'm trying my best to stop this patch from propagating to > > > > stable and to get it fixed asap, so, the CC list might be a bit > > > > excessive. > > > > Also trying to fix the originally spare cc list, which makes it > > > > impossible > > > > for me to reply to the original thread, instead have to start a new one. > > > > > > I'm not sure what you're talking about. What exactly was wrong with the > > > original CC list in particular? > > > > I think you advised once to cc cpufreq related mails to linux-pm too at > > least. > > Yes, I did. > > > I haven't found this patch in my pm archive, have I missed it there? > > Quite frankly, I don't remember if it was there. ISTR having it it patchwork, > which would mean that it was there, but well. > > > > > Commit > > > > > > > > commit dceff5ce18801dddc220d6238628619c93bc3cb6 > > > > Author: Viresh Kumar > > > > Date: Sun Sep 1 22:19:37 2013 +0530 > > > > > > > > cpufreq: fix serialization issues with freq change notifiers > > > > > > > > breaks .transition_ongoing counting. > > > > > > Do you know how exactly it breaks that? If so, care to share that > > > knowledge? > > > > No, I don't. I only know that in __cpufreq_driver_target() the check for > > > > if (policy->transition_ongoing) { > > write_unlock_irqrestore(_driver_lock, flags); > > return -EBUSY; > > } > > > > is failing with this patch and cpufreq-cpu0. > > OK, we need to figure out that, then. But given the timing I think I'll just start to revert things and we can add them back later after we've sorted out all problems. 
So I'm going to drop commit dceff5c from the linux-next branch and I'm going to revert commit 7c30ed5 along with commit 266c13d that tried to fix it and we'll revisit the transition serialization issue when we really know how to fix it. Thanks, Rafael
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
On Mon, Sep 9, 2013 at 6:15 PM, Ramkumar Ramachandra wrote:
>
> Maybe it should then?

It doesn't need to. The RCU lookup looks at individual dentry sequence
numbers and doesn't care about the bigger rename sequence number at
all.

The fallback (if you hit one of the very very rare races, or if you hit
a symlink) ends up doing per-path-component lookups under the rename
sequence lock, but for it, read-locking it until it succeeds is the
right thing to do.

              Linus
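The sequence-number check referred to in this thread can be sketched in userspace. This is a simplified model, not the kernel's seqlock or seqcount code: the toy_* names are hypothetical, it uses C11 atomics with their default ordering, and it only shows the control flow (sample the counter, do the lookup, retry if the counter was odd or has changed since).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Even value: no writer active. Odd value: a write is in progress. */
struct toy_seqcount {
	atomic_uint seq;
};

/* Reader side: remember the counter value seen before the lookup. */
static unsigned toy_read_begin(struct toy_seqcount *s)
{
	return atomic_load(&s->seq);
}

/*
 * Reader side: the lookup must be repeated if a writer was active when
 * it started (odd start value) or if any write began since then.
 */
static bool toy_read_retry(struct toy_seqcount *s, unsigned start)
{
	return (start & 1) || atomic_load(&s->seq) != start;
}

/* Writer side: make the counter odd before modifying the data. */
static void toy_write_begin(struct toy_seqcount *s)
{
	unsigned prev = atomic_fetch_add(&s->seq, 1);

	assert(!(prev & 1)); /* writers are assumed to be serialized */
}

/* Writer side: make the counter even again once the data is consistent. */
static void toy_write_end(struct toy_seqcount *s)
{
	unsigned prev = atomic_fetch_add(&s->seq, 1);

	assert(prev & 1);
}
```

A reader would wrap its lookup as `do { start = toy_read_begin(&s); /* lookup */ } while (toy_read_retry(&s, start));`. Readers never modify the counter, so one reader retrying does not force other readers to retry.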
[PATCH] PCI/PM: Removing the function pci_pm_complete()
Commit 88d2613 moved the pm_runtime_put_sync() call from
pci_pm_complete() to the PM core function device_complete(). What is
left of pci_pm_complete() only does work that device_complete() can
already do, so we can remove it directly.

Signed-off-by: liu chuansheng
---
 drivers/pci/pci-driver.c |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 98f7b9b..736ef3f 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -599,18 +599,10 @@ static int pci_pm_prepare(struct device *dev)
 	return error;
 }
 
-static void pci_pm_complete(struct device *dev)
-{
-	struct device_driver *drv = dev->driver;
-
-	if (drv && drv->pm && drv->pm->complete)
-		drv->pm->complete(dev);
-}
 
 #else /* !CONFIG_PM_SLEEP */
 
 #define pci_pm_prepare	NULL
-#define pci_pm_complete	NULL
 
 #endif /* !CONFIG_PM_SLEEP */
 
@@ -1123,7 +1115,6 @@ static int pci_pm_runtime_idle(struct device *dev)
 
 const struct dev_pm_ops pci_dev_pm_ops = {
 	.prepare = pci_pm_prepare,
-	.complete = pci_pm_complete,
 	.suspend = pci_pm_suspend,
 	.resume = pci_pm_resume,
 	.freeze = pci_pm_freeze,
-- 
1.7.0.4
Re: powerpc allmodconfig build broken due to commit 15863ff3b (powerpc: Make chip-id information available to userspace)
On 09/09/2013 04:55 PM, Asai Thambi S P wrote: On 09/08/2013 5:28 PM, Guenter Roeck wrote: Hi all, powerpc allmodconfig build on the latest upstream kernel results in: ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined! This is due to commit 15863ff3b (powerpc: Make chip-id information available to userspace). Not surprising, as cpu_to_chip_id() is not exported. Apart from the above error, I have a concern on the patch, purely based on the commit message. (to be honest, I am not familiar with the ppc architecture) Commit message of 15863ff3b has the following text. ** So far "/sys/devices/system/cpu/cpuX/topology/physical_package_id" was always default (-1) on ppc64 architecture. Now, some systems have an ibm,chip-id property in the cpu nodes in the device tree. On these systems, we now use this information to display physical_package_id ** Shouldn't the new definition of "topology_physical_package_id" apply only to those systems supporting ibm,chip-id property? Looking into the code, I think that is what it does. For other platforms (ie if there is no ibm,chip-id property) it still returns -1. Question for the fix is what path to take to fix the problem. Exporting cpu_to_chip_id() might be the easiest solution. Other platforms export the respective data, so it should not be a problem. I might submit a patch and see where it goes. Guenter Reverting this commit fixes the problem. Any good idea how to fix it for real ? Guenter
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
Al Viro wrote:
> _What_ "pathname translations"? Pathname resolution doesn't fall back to
> seq_writelock() at all.

Maybe it should then?
Re: [PATCH v2] mfd: rtsx: Modify rts5249_optimize_phy
On 2013-09-09 21:02, Lee Jones wrote:
>>  #define PHY_FLD4      0x1E
>> +#define FLDEN_SEL     0x4000
>> +#define REQ_REF       0x2000
>> +#define RXAMP_OFF     0x1000
>> +#define REQ_ADDA      0x0800
>> +#define BER_COUNT     0x00E0
>> +#define BER_TIMER     0x000A
>> +#define BER_CHK_EN    0x0001
>>  #define PHY_DUM_REG   0x1F
>>
>>  #define LCTLR         0x80
>
> This doesn't look right. We had a nicely structured, ordered list and
> now you've seemingly randomly shoved a truck load of un-prefixed
> defines between them. Am I missing something? Is there method to the
> madness?

Hi Lee:

Are you suggesting that I should define the macros using the same prefix
like below?

#define PHY_FLD4          0x1E
#define FLD4_FLDEN_SEL    0x4000
#define FLD4_REQ_REF      0x2000
#define FLD4_RXAMP_OFF    0x1000

BR,
Wei
Re: [PATCH 4/6] IB/qib: Use pcie_set_mps() and pcie_get_mps() to simplify code
On 2013/9/9 22:55, Marciniszyn, Mike wrote:
>> Subject: [PATCH 4/6] IB/qib: Use pcie_set_mps() and pcie_get_mps() to simplify code
>>
>> Refactor qib_tune_pcie_caps() function, use pcie_set_mps() and
>> pcie_get_mps() to simplify code. Because pci core caches the "PCI-E Max
>> Payload Size Supported" in pci_dev->pcie_mpss, so use that instead of
>> pcie_capability_read_word(). Remove the unused val2fld() and fld2val().
>>
>
> I tested this patch as is and saw no issues.
>
> The only thing I would suggest is that the new use of
> pcie_get_readrq/pcie_set_readrq() be reflected in the comments with an
> appropriate adjustment in the subject.

Hi Mike,
Thanks very much for testing! OK, I will update the comments.

Thanks!
Yijing.

> Mike

-- 
Thanks!
Yijing
Re: Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance
Hi, 2013-09-07 (토), 08:00 +, Chao Yu: > Hi Knize, > > Thanks for your reply, I think it's actually meaningless that it's > being named after "spin_lock", > it's better to rename this spinlock to "round_robin_lock". > > This patch can only resolve the issue of unbalanced fs_lock usage, > it can not fix the deadlock issue. > can we fix deadlock issue through this method: > > - vfs_create() > - f2fs_create() - takes an fs_lock and save current thread info into > thread_info[NR_GLOBAL_LOCKS] > - f2fs_add_link() >- __f2fs_add_link() > - init_inode_metadata() > - f2fs_init_security() > - security_inode_init_security() >- f2fs_initxattrs() > - f2fs_setxattr() - get fs_lock only if there is no current > thread info in thread_info > > So it keeps one thread can only hold one fs_lock to avoid deadlock. > Can we use this solution? It could be. But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs() level, since this case only happens when f2fs_initxattrs() is called. Let's think about ut in more detail. Thanks, > > > > thanks again! > > > > --- Original Message --- > > Sender : Russ Knize > > Date : 九月 07, 2013 04:25 (GMT+09:00) > > Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better > performance > > > > I encountered this same issue recently and solved it in much the same > way. Can we rename "spin_lock" to something more meaningful? > > > This race actually exposed a potential deadlock between f2fs_create() > and f2fs_initxattrs(): > > > - vfs_create() > - f2fs_create() - takes an fs_lock > - f2fs_add_link() >- __f2fs_add_link() > - init_inode_metadata() > - f2fs_init_security() > - security_inode_init_security() >- f2fs_initxattrs() > - f2fs_setxattr() - also takes an fs_lock > > > If another CPU happens to have the same lock that f2fs_setxattr() was > trying to take because of the race around next_lock_num, we can get > into a deadlock situation if the two threads are also contending over > another resource (like bdi). 
> > > Another scenario is if the above happens while another thread is in > the middle of grabbing all of the locks via mutex_lock_all(). > f2fs_create() is holding a lock that mutex_lock_all() is waiting for > and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting > for. > > > Russ > > > On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu wrote: > Hi Kim: > > I think there is a performance problem: when all > sbi->fs_lock is holded, > > then all other threads may get the same next_lock value from > sbi->next_lock_num in function mutex_lock_op, > > and wait to get the same lock at position fs_lock[next_lock], > it unbalance the fs_lock usage. > > It may lost performance when we do the multithread test. > > > > Here is the patch to fix this problem: > > > > Signed-off-by: Yu Chao > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > > old mode 100644 > > new mode 100755 > > index 467d42d..983bb45 > > --- a/fs/f2fs/f2fs.h > > +++ b/fs/f2fs/f2fs.h > > @@ -371,6 +371,7 @@ struct f2fs_sb_info { > > struct mutex fs_lock[NR_GLOBAL_LOCKS]; /* blocking FS > operations */ > > struct mutex node_write;/* locking > node writes */ > > struct mutex writepages;/* mutex for > writepages() */ > > + spinlock_t spin_lock; /* lock for > next_lock_num */ > > unsigned char next_lock_num;/* round-robin > global locks */ > > int por_doing; /* recovery is > doing or not */ > > int on_build_free_nids; /* > build_free_nids is doing */ > > @@ -533,15 +534,19 @@ static inline void > mutex_unlock_all(struct f2fs_sb_info *sbi) > > > > static inline int mutex_lock_op(struct f2fs_sb_info *sbi) > > { > > - unsigned char next_lock = sbi->next_lock_num % > NR_GLOBAL_LOCKS; > > + unsigned char next_lock; > > int i = 0; > > > > for (; i < NR_GLOBAL_LOCKS; i++) > > if (mutex_trylock(>fs_lock[i])) > > return i; > > > > - mutex_lock(>fs_lock[next_lock]); > > + spin_lock(>spin_lock); > > +
[PATCH] perf, x86: Avoid checkpointed counters causing excessive TSX aborts v5
From: Andi Kleen

With checkpointed counters there can be a situation where the counter
overflows, aborts the transaction, is set back to a non-overflowing
checkpoint, and causes an interrupt. The interrupt doesn't see the
overflow because it has been checkpointed. This is then a spurious PMI,
typically with an ugly NMI message. It can also lead to excessive
aborts.

Avoid this problem by:
- Using the full counter width for counting counters (earlier patch)
- Forbid sampling for checkpointed counters. It's not too useful anyways,
  checkpointing is mainly for counting. The check is approximate
  (to still handle KVM), but should catch the majority of cases.
- On a PMI always set back checkpointed counters to zero.

v2: Add unlikely. Add comment
v3: Allow large sampling periods with CP for KVM
v4: Use event_is_checkpointed. Use EOPNOTSUPP. (Stephane Eranian)
v5: Remove comment.
Signed-off-by: Andi Kleen
---
 arch/x86/kernel/cpu/perf_event_intel.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index a45d8d4..91e3f8c 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1134,6 +1134,11 @@ static void intel_pmu_enable_event(struct perf_event *event)
 	__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
 }
 
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+	return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
 /*
  * Save and restart an expired event. Called by NMI contexts,
  * so it has to be careful about preempting normal event ops:
@@ -1141,6 +1146,17 @@ static void intel_pmu_enable_event(struct perf_event *event)
 int intel_pmu_save_and_restart(struct perf_event *event)
 {
 	x86_perf_event_update(event);
+	/*
+	 * For a checkpointed counter always reset back to 0.  This
+	 * avoids a situation where the counter overflows, aborts the
+	 * transaction and is then set back to shortly before the
+	 * overflow, and overflows and aborts again.
+	 */
+	if (unlikely(event_is_checkpointed(event))) {
+		/* No race with NMIs because the counter should not be armed */
+		wrmsrl(event->hw.event_base, 0);
+		local64_set(&event->hw.prev_count, 0);
+	}
 	return x86_perf_event_set_period(event);
 }
 
@@ -1224,6 +1240,13 @@ again:
 		x86_pmu.drain_pebs(regs);
 	}
 
+	/*
+	 * To avoid spurious interrupts with perf stat always reset checkpointed
+	 * counters.
+	 */
+	if (cpuc->events[2] && event_is_checkpointed(cpuc->events[2]))
+		status |= (1ULL << 2);
+
 	for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
 		struct perf_event *event = cpuc->events[bit];
 
@@ -1689,6 +1712,20 @@ static int hsw_hw_config(struct perf_event *event)
 	      event->attr.precise_ip > 0))
 		return -EOPNOTSUPP;
 
+	if (event_is_checkpointed(event)) {
+		/*
+		 * Sampling of checkpointed events can cause situations where
+		 * the CPU constantly aborts because of a overflow, which is
+		 * then checkpointed back and ignored. Forbid checkpointing
+		 * for sampling.
+		 *
+		 * But still allow a long sampling period, so that perf stat
+		 * from KVM works.
+		 */
+		if (event->attr.sample_period > 0 &&
+		    event->attr.sample_period < 0x7fff)
+			return -EOPNOTSUPP;
+	}
 	return 0;
 }
 
-- 
1.8.3.1
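The sampling-period check in the last hunk can be exercised on its own with a small userspace model. This is a sketch under assumptions, not the kernel code: the toy_* names are hypothetical, the bit position used for the checkpointed flag is assumed for illustration only, and the threshold is copied from the hunk as it appears in this posting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration; consult the kernel headers. */
#define TOY_IN_TX_CHECKPOINTED (1ULL << 33)
/* Threshold as quoted in the hunk above. */
#define TOY_MIN_CP_PERIOD      0x7fffULL

/* Model of event_is_checkpointed(): test the flag in the config word. */
static bool toy_is_checkpointed(uint64_t config)
{
	return (config & TOY_IN_TX_CHECKPOINTED) != 0;
}

/*
 * Model of the new check: a checkpointed event may count
 * (sample_period == 0) or use a very long period, but a short sampling
 * period is rejected.
 */
static int toy_hw_config(uint64_t config, uint64_t sample_period)
{
	if (toy_is_checkpointed(config) &&
	    sample_period > 0 && sample_period < TOY_MIN_CP_PERIOD)
		return -EOPNOTSUPP;
	return 0;
}

/* Walk through the four interesting cases. */
static void toy_selfcheck(void)
{
	/* Short sampling period on a checkpointed event: rejected. */
	assert(toy_hw_config(TOY_IN_TX_CHECKPOINTED, 1000) == -EOPNOTSUPP);
	/* Pure counting on a checkpointed event: allowed. */
	assert(toy_hw_config(TOY_IN_TX_CHECKPOINTED, 0) == 0);
	/* Period at the threshold: allowed (the comparison is strict). */
	assert(toy_hw_config(TOY_IN_TX_CHECKPOINTED, TOY_MIN_CP_PERIOD) == 0);
	/* Event without the checkpointed flag: unaffected. */
	assert(toy_hw_config(0, 1000) == 0);
}
```

The check is deliberately approximate, as the commit message says: it keys off the period alone, so a guest that programs a long period for counting purposes still passes.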
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
On Mon, Sep 09, 2013 at 08:40:20PM -0400, George Spelvin wrote:
> I'm really wondering about only trying once before taking the write lock.
> Yes, using the lsbit is a cute hack, but are we using it for its cuteness
> rather than its effectiveness?
>
> Renames happen occasionally. If that causes all the current pathname
> translations to fall back to the write lock, that is fairly heavy.
> Worse, all of those translations will (unnecessarily) bump the write
> seqcount, triggering *other* translations to fail back to the write-lock
> path.

_What_ "pathname translations"? Pathname resolution doesn't fall back to
seq_writelock() at all.
[GIT PULL] dmaengine update for 3.12
Hi Linus, The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f: Linux 3.11-rc4 (2013-08-04 13:46:46 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine tags/dmaengine-3.12 for you to fetch changes up to 4a43f394a08214eaf92cdd8ce3eae75e555323d8: dmaengine: dma_sync_wait and dma_find_channel undefined (2013-09-09 17:02:38 -0700) dmaengine update for 3.12 Collection of random updates to the core and some end-driver fixups for ioatdma and mv_xor: * NUMA aware channel allocation * Cleanup dmatest debugfs interface * ioat: make raid-support Atom only * mv_xor: big endian Aside from the top three commits: 4a43f39 dmaengine: dma_sync_wait and dma_find_channel undefined ab5f8c6 MAINTAINERS: update email for Dan Williams a577659 dma: mv_xor: Fix incorrect error path ...these have all had some soak time in -next. The top commit fixes a recent build breakage. It has been a long while since my last pull request, hopefully it does not show. Thanks to Vinod for keeping an eye on drivers/dma/ this past year. 
-- Dan Andy Shevchenko (3): dmatest: make module parameters writable dmatest: remove IS_ERR_OR_NULL checks of debugfs calls dmatest: print message on debug level in case of no error Brice Goglin (2): ioatdma: disable RAID on non-Atom platforms and reenable unaligned copies dmaengine: make dma_channel_rebalance() NUMA aware Dan Carpenter (1): dmaengine: make dma_submit_error() return an error code Dan Williams (1): MAINTAINERS: update email for Dan Williams Jon Mason (1): dmaengine: dma_sync_wait and dma_find_channel undefined Paul Bolle (1): ioatdma: silence GCC warnings Sachin Kamat (1): dma: mv_xor: Fix incorrect error path Thomas Petazzoni (2): mv_xor: use {readl, writel}_relaxed instead of __raw_{readl, writel} mv_xor: support big endian systems using descriptor swap feature Documentation/dmatest.txt | 15 ++-- MAINTAINERS | 18 ++--- drivers/dma/dmaengine.c | 55 +++--- drivers/dma/dmatest.c | 182 -- drivers/dma/ioat/dma_v3.c | 26 +-- drivers/dma/mv_xor.c | 53 -- drivers/dma/mv_xor.h | 28 ++- include/linux/dmaengine.h | 17 - 8 files changed, 146 insertions(+), 248 deletions(-)
Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance
Hi, Nice catch. This is definitely a bug where one thread grabbed two fs_locks across the same flow. Any idea? Thanks, 2013-09-06 (금), 14:25 -0500, Russ Knize: > I encountered this same issue recently and solved it in much the same > way. Can we rename "spin_lock" to something more meaningful? > > > This race actually exposed a potential deadlock between f2fs_create() > and f2fs_initxattrs(): > > > - vfs_create() > - f2fs_create() - takes an fs_lock > - f2fs_add_link() >- __f2fs_add_link() > - init_inode_metadata() > - f2fs_init_security() > - security_inode_init_security() >- f2fs_initxattrs() > - f2fs_setxattr() - also takes an fs_lock > > > If another CPU happens to have the same lock that f2fs_setxattr() was > trying to take because of the race around next_lock_num, we can get > into a deadlock situation if the two threads are also contending over > another resource (like bdi). > > > Another scenario is if the above happens while another thread is in > the middle of grabbing all of the locks via mutex_lock_all(). > f2fs_create() is holding a lock that mutex_lock_all() is waiting for > and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting > for. > > > Russ > > > On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu wrote: > Hi Kim: > > I think there is a performance problem: when all > sbi->fs_lock is holded, > > then all other threads may get the same next_lock value from > sbi->next_lock_num in function mutex_lock_op, > > and wait to get the same lock at position fs_lock[next_lock], > it unbalance the fs_lock usage. > > It may lost performance when we do the multithread test. 
> > > > Here is the patch to fix this problem: > > > > Signed-off-by: Yu Chao > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > > old mode 100644 > > new mode 100755 > > index 467d42d..983bb45 > > --- a/fs/f2fs/f2fs.h > > +++ b/fs/f2fs/f2fs.h > > @@ -371,6 +371,7 @@ struct f2fs_sb_info { > > struct mutex fs_lock[NR_GLOBAL_LOCKS]; /* blocking FS > operations */ > > struct mutex node_write;/* locking > node writes */ > > struct mutex writepages;/* mutex for > writepages() */ > > + spinlock_t spin_lock; /* lock for > next_lock_num */ > > unsigned char next_lock_num;/* round-robin > global locks */ > > int por_doing; /* recovery is > doing or not */ > > int on_build_free_nids; /* > build_free_nids is doing */ > > @@ -533,15 +534,19 @@ static inline void > mutex_unlock_all(struct f2fs_sb_info *sbi) > > > > static inline int mutex_lock_op(struct f2fs_sb_info *sbi) > > { > > - unsigned char next_lock = sbi->next_lock_num % > NR_GLOBAL_LOCKS; > > + unsigned char next_lock; > > int i = 0; > > > > for (; i < NR_GLOBAL_LOCKS; i++) > > if (mutex_trylock(>fs_lock[i])) > > return i; > > > > - mutex_lock(>fs_lock[next_lock]); > > + spin_lock(>spin_lock); > > + next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS; > > sbi->next_lock_num++; > > + spin_unlock(>spin_lock); > > + > > + mutex_lock(>fs_lock[next_lock]); > > return next_lock; > > } > > > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c > > old mode 100644 > > new mode 100755 > > index 75c7dc3..4f27596 > > --- a/fs/f2fs/super.c > > +++ b/fs/f2fs/super.c > > @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct > super_block *sb, void *data, int silent) > > mutex_init(>cp_mutex); > > for (i = 0; i < NR_GLOBAL_LOCKS; i++) > > mutex_init(>fs_lock[i]); > > + spin_lock_init(>spin_lock); > > mutex_init(>node_write); > > sbi->por_doing = 0; > > spin_lock_init(>stat_lock); > > (END) > >
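The round-robin slot selection in the quoted mutex_lock_op() hunk can also be modelled without a separate lock around the counter. The following is a userspace sketch under assumptions, not f2fs code: toy_pick_lock() and TOY_NR_LOCKS are hypothetical names, the array size of 8 is illustrative, and a C11 atomic counter stands in for next_lock_num.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative size of the lock array. */
#define TOY_NR_LOCKS 8

/* Shared ticket counter; static storage starts at zero. */
static atomic_uint toy_next_lock_num;

/*
 * Every caller takes a distinct ticket, so threads that all fail the
 * trylock pass at the same moment are spread over the lock array
 * instead of queueing on one slot.
 */
static unsigned toy_pick_lock(void)
{
	unsigned ticket = atomic_fetch_add(&toy_next_lock_num, 1);
	unsigned slot = ticket % TOY_NR_LOCKS;

	assert(slot < TOY_NR_LOCKS);
	return slot;
}
```

Because the counter range (a power of two) is a multiple of TOY_NR_LOCKS, the sequence of slots stays evenly distributed when the unsigned counter wraps around. Note that this only addresses the unbalanced lock usage; it does nothing about the nested fs_lock acquisition described in this thread.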
Re: [PATCH 00/12] One more attempt at useful kernel lockdown
On Mon, 2013-09-09 at 16:19 -0700, David Lang wrote: > On Mon, 9 Sep 2013, Matthew Garrett wrote: > > Having thought about this, the answer is no. It presents exactly the > > same problem as capabilities do - the set can never be meaningfully > > extended. If an application sets only the bits it knows about, and if a > > new security-sensitive feature is added to the kernel, the feature will > > be left enabled and the system will be insecure. Alternatively, if an > > application sets all the bits regardless of whether it knows them or > > not, it may enable a lockdown feature that it otherwise required. > > In this case you are no less secure than you were before the feature was > added, > you just can't take advantage of the new feature without updating userspace. No. Say someone adds an additional lockdown bit to forbid raw access to mounted block devices. The "Turn everything off" approach now means that I won't be able to perform raw access to mounted block devices, even if that's something that my use case relies on. > > The only way this is useful is if all the bits are semantically > > equivalent, and in that case there's no point in having anything other > > than a single bit. Users who want a more fine-grained interface should > > use one of the existing mechanisms for doing so - leave the kernel open > > and impose the security policy from userspace using either capabilities > > or selinux. > > so if you only have a single bit, how do you deal with the case where that > bit > locks down something that's required? (your reason for not just setting all > bits > in the first approach) Because that bit is well-defined, and if anything is added to it that doesn't match that definition then it's a bug. > your arguments don't seem self consistent. You don't seem to have been paying attention to the past 12 months of discussion. 
> If I'm building a kiosk PC (or voting machine), I want to disable a lot of
> things that I could not get away with disabling on a generic laptop. Are we
> going to have Securelevel, ReallySecurelevel, ReallyReallySecurelevel, etc? or
> can we accept that security is not binary and allow users to disable features
> in a more granular way?

Anything more granular means that you trust your userspace, and if you trust your userspace then you can already set up a granular policy using the existing tools for that job. So just use the existing tools.

> And if SELinux can do the job, what is the reason for creating this new
> option?

Because you can't embed an unmodifiable selinux policy in the kernel.

--
Matthew Garrett
Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
Hi,

First of all, thank you for the report, and please follow the email writing rules. :)

Anyway, I agree with the issue below. One thing that I can think of is that we don't need to use the spin_lock, since we don't care about the exact lock number, but just need to get any non-colliding number. So, how about removing the spin_lock? And how about using a random number?

Thanks,

2013-09-06 (Fri), 09:48 +, Chao Yu:
> Hi Kim:
> >
> > I think there is a performance problem: when all sbi->fs_lock is held,
> > then all other threads may get the same next_lock value from sbi->next_lock_num in function mutex_lock_op,
> > and wait to get the same lock at position fs_lock[next_lock], which unbalances the fs_lock usage.
> > It may lose performance when we do the multithread test.
> >
> > Here is the patch to fix this problem:
> >
> > Signed-off-by: Yu Chao
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > old mode 100644
> > new mode 100755
> > index 467d42d..983bb45
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> >  	struct mutex fs_lock[NR_GLOBAL_LOCKS];	/* blocking FS operations */
> >  	struct mutex node_write;	/* locking node writes */
> >  	struct mutex writepages;	/* mutex for writepages() */
> > +	spinlock_t spin_lock;	/* lock for next_lock_num */
> >  	unsigned char next_lock_num;	/* round-robin global locks */
> >  	int por_doing;	/* recovery is doing or not */
> >  	int on_build_free_nids;	/* build_free_nids is doing */
> > @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct f2fs_sb_info *sbi)
> >
> >  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> >  {
> > -	unsigned char next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> > +	unsigned char next_lock;
> >  	int i = 0;
> >
> >  	for (; i < NR_GLOBAL_LOCKS; i++)
> >  		if (mutex_trylock(&sbi->fs_lock[i]))
> >  			return i;
> >
> > -	mutex_lock(&sbi->fs_lock[next_lock]);
> > +	spin_lock(&sbi->spin_lock);
> > +	next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> >  	sbi->next_lock_num++;
> > +	spin_unlock(&sbi->spin_lock);
> > +
> > +	mutex_lock(&sbi->fs_lock[next_lock]);
> >  	return next_lock;
> >  }
> >
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > old mode 100644
> > new mode 100755
> > index 75c7dc3..4f27596
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >  	mutex_init(&sbi->cp_mutex);
> >  	for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> >  		mutex_init(&sbi->fs_lock[i]);
> > +	spin_lock_init(&sbi->spin_lock);
> >  	mutex_init(&sbi->node_write);
> >  	sbi->por_doing = 0;
> >  	spin_lock_init(&sbi->stat_lock);
> > (END)
> >

--
Jaegeuk Kim
Samsung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
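For readers following the f2fs thread above: the suggestion to drop the spinlock can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the f2fs code. `struct sb_model`, `model_lock_op()` and `pick_fallback_slot()` are hypothetical names, C11 atomic flags stand in for the kernel mutexes, the value of `NR_GLOBAL_LOCKS` is assumed, and an atomic fetch-and-add is used as one possible way to hand out a slot number without a lock (the thread also mentions a random number as an alternative).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_GLOBAL_LOCKS 8	/* assumed value, for illustration only */

/* Hypothetical stand-in for f2fs_sb_info: flags model the fs_lock mutexes. */
struct sb_model {
	atomic_flag fs_lock[NR_GLOBAL_LOCKS];
	atomic_uint next_lock_num;	/* round-robin hint; exact order is irrelevant */
};

static void sb_model_init(struct sb_model *sbi)
{
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		atomic_flag_clear(&sbi->fs_lock[i]);
	atomic_init(&sbi->next_lock_num, 0u);
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void model_unlock_op(struct sb_model *sbi, int slot)
{
	assert(slot >= 0 && slot < NR_GLOBAL_LOCKS);
	atomic_flag_clear_explicit(&sbi->fs_lock[slot], memory_order_release);
}

/*
 * Pick the slot to wait on when every trylock failed. Each caller gets a
 * different ticket from the atomic counter, so waiters spread across the
 * slots without any extra lock around the counter.
 */
static unsigned int pick_fallback_slot(struct sb_model *sbi)
{
	return atomic_fetch_add_explicit(&sbi->next_lock_num, 1u,
					 memory_order_relaxed) % NR_GLOBAL_LOCKS;
}

/* Returns the index of the slot that is now held by the caller. */
static int model_lock_op(struct sb_model *sbi)
{
	/* Fast path: take any slot that is currently free. */
	for (int i = 0; i < NR_GLOBAL_LOCKS; i++)
		if (model_trylock(&sbi->fs_lock[i]))
			return i;

	/* Slow path: wait on a slot chosen without holding a lock. */
	unsigned int slot = pick_fallback_slot(sbi);

	while (!model_trylock(&sbi->fs_lock[slot]))
		;	/* a real mutex would sleep here instead of spinning */
	return (int)slot;
}
```

The point of the sketch is that the counter only has to avoid sending every waiter to the same slot; it does not have to be exact, which is why a relaxed atomic increment is enough in this model.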
Re: [PATCH aio-next] aio: fix race in ring buffer page lookup introduced by page migration support
Hi Ben, Al,

On 09/10/2013 12:02 AM, Benjamin LaHaise wrote:
> Hi Al, Gu,
>
> I've added this patch to my tree at git://git.kvack.org/~bcrl/aio-next.git
> to fix the get_user_pages() issue introduced by Gu's changes in the page
> migration patch. Thanks Al for spotting this.

Thanks very much for spotting and fixing this issue.

Best regards,
Gu

>
> -ben
>
> commit d6c355c7dabcd753a75bc77d150d36328a355267
> Author: Benjamin LaHaise
> Date: Mon Sep 9 11:57:59 2013 -0400
>
>     aio: fix race in ring buffer page lookup introduced by page migration support
>
>     Prior to the introduction of page migration support in "fs/aio: Add support
>     to aio ring pages migration" / 36bc08cc01709b4a9bb563b35aa530241ddc63e3,
>     mapping of the ring buffer pages was done via get_user_pages() while
>     retaining mmap_sem held for write. This avoided possible races with userland
>     racing an munmap() or mremap(). The page migration patch, however, switched
>     to using mm_populate() to prime the page mapping. mm_populate() cannot be
>     called with mmap_sem held.
>
>     Instead of dropping the mmap_sem, revert to the old behaviour and simply
>     drop the use of mm_populate() since get_user_pages() will cause the pages to
>     get mapped anyways. Thanks to Al Viro for spotting this issue.
>
>     Signed-off-by: Benjamin LaHaise
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 6e26755..f4a27af 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -307,16 +307,25 @@ static int aio_setup_ring(struct kioctx *ctx)
>  		aio_free_ring(ctx);
>  		return -EAGAIN;
>  	}
> -	up_write(&mm->mmap_sem);
> -
> -	mm_populate(ctx->mmap_base, populate);
>
>  	pr_debug("mmap address: 0x%08lx\n", ctx->mmap_base);
> +
> +	/* We must do this while still holding mmap_sem for write, as we
> +	 * need to be protected against userspace attempting to mremap()
> +	 * or munmap() the ring buffer.
> +	 */
>  	ctx->nr_pages = get_user_pages(current, mm, ctx->mmap_base, nr_pages,
>  				       1, 0, ctx->ring_pages, NULL);
> +
> +	/* Dropping the reference here is safe as the page cache will hold
> +	 * onto the pages for us. It is also required so that page migration
> +	 * can unmap the pages and get the right reference count.
> +	 */
>  	for (i = 0; i < ctx->nr_pages; i++)
>  		put_page(ctx->ring_pages[i]);
>
> +	up_write(&mm->mmap_sem);
> +
>  	if (unlikely(ctx->nr_pages != nr_pages)) {
>  		aio_free_ring(ctx);
>  		return -EAGAIN;
Re: [PATCH v4 1/1] dcache: Translating dentry into pathname without taking rename_lock
I'm really wondering about only trying once before taking the write lock. Yes, using the lsbit is a cute hack, but are we using it for its cuteness rather than its effectiveness?

Renames happen occasionally. If that causes all the current pathname translations to fall back to the write lock, that is fairly heavy. Worse, all of those translations will (unnecessarily) bump the write seqcount, triggering *other* translations to fall back to the write-lock path.

One patch to fix this would be to have the fallback read algorithm take sl->lock but *not* touch sl->seqcount, so it wouldn't break concurrent readers.

But another is to simply retry at least once (two attempts) on the non-exclusive path before falling back to the exclusive one. This means that the count lsbit is no longer enough space for a retry counter, but oh, well.

(If you really want to use one word, perhaps a better heuristic as to how to retry would be to examine the *number* of writes to the seqlock during the read. If there was only one, there's a fair chance that another read will succeed. If there was more than one (i.e. the seqlock has incremented by 3 or more), then forcing the writers to stop is probably necessary.)
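The retry policy described in the message above can be modelled in plain C11 as a rough sketch. Everything here is hypothetical and single-file: `struct seq_model`, `classify_read()`, `read_with_retries()` and the budget `LOCKLESS_ATTEMPTS` are illustration names, not the kernel's seqlock API, and the sketch covers only the decision of when to retry locklessly versus when to take the exclusive path. It does not model the alternative where the fallback reader takes sl->lock without touching the sequence count.

```c
#include <assert.h>
#include <stdatomic.h>

#define LOCKLESS_ATTEMPTS 2	/* assumed budget: "two attempts" as suggested above */

/* Hypothetical model of a sequence counter: odd while a writer is active. */
struct seq_model {
	atomic_uint sequence;
};

enum next_step { READ_OK, RETRY_LOCKLESS, TAKE_LOCK };

/* One complete write bumps the counter twice (odd on entry, even on exit). */
static void write_once(struct seq_model *sl)
{
	atomic_fetch_add(&sl->sequence, 1u);
	atomic_fetch_add(&sl->sequence, 1u);
}

/*
 * Decide what to do after one read attempt, from the counter values seen
 * before and after it. A change of 1 or 2 means at most one writer got in
 * the way; a change of 3 or more means several did.
 */
static enum next_step classify_read(unsigned int start, unsigned int end)
{
	unsigned int delta = end - start;	/* unsigned math survives wraparound */

	assert(LOCKLESS_ATTEMPTS >= 1);
	if (delta == 0 && !(start & 1u))
		return READ_OK;
	if (delta >= 3)
		return TAKE_LOCK;
	return RETRY_LOCKLESS;
}

typedef void (*read_fn)(struct seq_model *sl, void *arg);

/*
 * Run fn locklessly up to LOCKLESS_ATTEMPTS times. Returns the number of
 * attempts used on success, or 0 if the caller should take the exclusive
 * path. Default (sequentially consistent) loads keep the model simple.
 */
static int read_with_retries(struct seq_model *sl, read_fn fn, void *arg)
{
	for (int attempt = 1; attempt <= LOCKLESS_ATTEMPTS; attempt++) {
		unsigned int start = atomic_load(&sl->sequence);

		fn(sl, arg);

		switch (classify_read(start, atomic_load(&sl->sequence))) {
		case READ_OK:
			return attempt;
		case TAKE_LOCK:
			return 0;
		case RETRY_LOCKLESS:
			break;
		}
	}
	return 0;
}

/* Demo callback: simulates *arg writers interfering with the first read. */
static void read_with_interference(struct seq_model *sl, void *arg)
{
	int *pending = arg;

	while (*pending > 0) {
		write_once(sl);
		(*pending)--;
	}
}
```

With no interference the read succeeds on the first attempt, a single interfering write costs one retry, and two or more writes during one attempt send the caller straight to the exclusive path.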
Re: [PATCH] workqueue: fix potential reentrancy issue
Hello Tejun,

On 2013/9/9 22:14, Tejun Heo wrote:
> On Mon, Sep 09, 2013 at 01:12:14PM +0800, Libin wrote:
>> From: Li Bin
>>
>> When one work starts execution, the high bits of work's data contain
>> pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID
>> is assigned WORK_OFFQ_POOL_NONE when the work being initialized
>> indicating that no pool is associated and get_work_pool() uses it to
>> check the associated pool. So if worker_pool_assign_id() assigns a
>> ID greater than or equal to WORK_OFFQ_POOL_NONE to a pool, it may break
>> the non-reentrance guarantee.
>>
>> This patch fixes this issue by modifying the worker_pool_assign_id()
>> function to add the WORK_OFFQ_POOL_NONE check condition.
>>
>> Signed-off-by: Li Bin
>> ---
>> kernel/workqueue.c | 11 ++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 41019b1..97d9ff7 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -518,7 +518,14 @@ static inline void debug_work_activate(struct work_struct *work) { }
>>  static inline void debug_work_deactivate(struct work_struct *work) { }
>>  #endif
>>
>> -/* allocate ID and assign it to @pool */
>> +/**
>> + * worker_pool_assign_id - allocate ID and assing it to @pool
>> + * @pool: the pool pointer of interest
>> + *
>> + * Return 0 if ID assigned successful.
>> + * Return non-zero if the allocation fails or the ID number out of
>> + * %WORK_OFFQ_POOL_NONE range.
>> + */
>>  static int worker_pool_assign_id(struct worker_pool *pool)
>>  {
>>  	int ret;
>> @@ -526,6 +533,8 @@ static int worker_pool_assign_id(struct worker_pool *pool)
>>  	lockdep_assert_held(&wq_pool_mutex);
>>
>>  	ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
>> +	if (ret >= WORK_OFFQ_POOL_NONE)
>> +		return ret;
>
> Hmmm this would leak the allocated ID. Just setting @end param to
> idr_alloc to WORK_OFFQ_POOL_NONE would work, right?
>

Yes, I will update it with the last patch.
Thanks!
Libin

> Thanks.
>
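To make the leak pointed out in the review above concrete, here is a small userspace model. It is a sketch under assumptions, not the workqueue code: `struct id_map` and `id_alloc()` are hypothetical toys that only imitate the `[start, end)` range convention of `idr_alloc()` (with `end <= 0` meaning "no explicit upper bound"), and `POOL_ID_NONE` is a deliberately tiny stand-in for `WORK_OFFQ_POOL_NONE`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define POOL_ID_NONE 4	/* hypothetical small stand-in for WORK_OFFQ_POOL_NONE */
#define ID_SPACE 16	/* size of the toy ID space */

/* Toy ID allocator, not the kernel IDR. */
struct id_map {
	bool used[ID_SPACE];
};

/* Allocate the lowest free id in [start, end); end <= 0 means "up to ID_SPACE". */
static int id_alloc(struct id_map *map, int start, int end)
{
	int limit = (end > 0 && end < ID_SPACE) ? end : ID_SPACE;

	assert(start >= 0);
	for (int id = start; id < limit; id++) {
		if (!map->used[id]) {
			map->used[id] = true;
			return id;
		}
	}
	return -ENOSPC;
}

static int count_used(const struct id_map *map)
{
	int n = 0;

	for (int id = 0; id < ID_SPACE; id++)
		n += map->used[id];
	return n;
}

/* Shape of the posted patch: allocate first, range-check afterwards. */
static int assign_id_check_after(struct id_map *map, int *out)
{
	int ret = id_alloc(map, 0, 0);

	if (ret >= POOL_ID_NONE)
		return ret;	/* failure is reported, but the id stays allocated */
	if (ret >= 0) {
		*out = ret;
		return 0;
	}
	return ret;
}

/* Shape of the review suggestion: bound the allocation itself. */
static int assign_id_bounded(struct id_map *map, int *out)
{
	int ret = id_alloc(map, 0, POOL_ID_NONE);

	if (ret >= 0) {
		*out = ret;
		return 0;
	}
	return ret;
}
```

In this model, once ids 0 to 3 are taken, the bounded variant fails cleanly and leaves the map untouched, while the check-after variant reports failure yet leaves id 4 marked as used.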
[PATCH] cpu/mem hotplug: Add try_online_node() for cpu_up()
cpu_up() has #ifdef CONFIG_MEMORY_HOTPLUG code blocks, which call mem_online_node() to put its node online if offlined and then call build_all_zonelists() to initialize the zone list. These steps are specific to memory hotplug, and should be managed in mm/memory_hotplug.c. lock_memory_hotplug() should also be held for the whole steps.

For this reason, this patch replaces mem_online_node() with try_online_node(), which performs the whole steps with lock_memory_hotplug() held. try_online_node() is named after try_offline_node() as they have similar purpose.

There is no functional change in this patch.

Signed-off-by: Toshi Kani
---
 include/linux/memory_hotplug.h |8 +++-
 kernel/cpu.c | 29 +++--
 mm/memory_hotplug.c| 15 +--
 3 files changed, 23 insertions(+), 29 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index dd38e62..22203c2 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -94,6 +94,8 @@ extern void __online_page_set_limits(struct page *page);
 extern void __online_page_increment_counters(struct page *page);
 extern void __online_page_free(struct page *page);

+extern int try_online_node(int nid);
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 extern bool is_pageblock_removable_nolock(struct page *page);
 extern int arch_remove_memory(u64 start, u64 size);
@@ -225,6 +227,11 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
 }

+static inline int try_online_node(int nid)
+{
+	return 0;
+}
+
 static inline void lock_memory_hotplug(void) {}
 static inline void unlock_memory_hotplug(void) {}

@@ -256,7 +263,6 @@ static inline void remove_memory(int nid, u64 start, u64 size) {}

 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
-extern int mem_online_node(int nid);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int arch_add_memory(int nid, u64 start, u64 size);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d7f07a2..c10b285 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -420,11 +420,6 @@ int cpu_up(unsigned int cpu)
 {
 	int err = 0;

-#ifdef CONFIG_MEMORY_HOTPLUG
-	int nid;
-	pg_data_t *pgdat;
-#endif
-
 	if (!cpu_possible(cpu)) {
 		printk(KERN_ERR "can't online cpu %d because it is not "
 			"configured as may-hotadd at boot time\n", cpu);
@@ -435,27 +430,9 @@ int cpu_up(unsigned int cpu)
 		return -EINVAL;
 	}

-#ifdef CONFIG_MEMORY_HOTPLUG
-	nid = cpu_to_node(cpu);
-	if (!node_online(nid)) {
-		err = mem_online_node(nid);
-		if (err)
-			return err;
-	}
-
-	pgdat = NODE_DATA(nid);
-	if (!pgdat) {
-		printk(KERN_ERR
-			"Can't online cpu %d due to NULL pgdat\n", cpu);
-		return -ENOMEM;
-	}
-
-	if (pgdat->node_zonelists->_zonerefs->zone == NULL) {
-		mutex_lock(&zonelists_mutex);
-		build_all_zonelists(NULL, NULL);
-		mutex_unlock(&zonelists_mutex);
-	}
-#endif
+	err = try_online_node(cpu_to_node(cpu));
+	if (err)
+		return err;

 	cpu_maps_update_begin();

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ed85fe3..c326bdf 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1044,14 +1044,19 @@ static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
 }


-/*
+/**
+ * try_online_node - online a node if offlined
+ *
  * called by cpu_up() to online a node without onlined memory.
  */
-int mem_online_node(int nid)
+int try_online_node(int nid)
 {
 	pg_data_t *pgdat;
 	int ret;

+	if (node_online(nid))
+		return 0;
+
 	lock_memory_hotplug();
 	pgdat = hotadd_new_pgdat(nid, 0);
 	if (!pgdat) {
@@ -1062,6 +1067,12 @@ int mem_online_node(int nid)
 	ret = register_one_node(nid);
 	BUG_ON(ret);

+	if (pgdat->node_zonelists->_zonerefs->zone == NULL) {
+		mutex_lock(&zonelists_mutex);
+		build_all_zonelists(NULL, NULL);
+		mutex_unlock(&zonelists_mutex);
+	}
+
 out:
 	unlock_memory_hotplug();
 	return ret;
Re: [PATCH] workqueue: remove meaningless BUILD_BUG_ON() in init_workqueues()
On 2013/9/9 22:18, Tejun Heo wrote:
> On Mon, Sep 09, 2013 at 10:13:02AM -0400, Tejun Heo wrote:
>> Indeed, but can you please update worker_pool_assign_id() so that it
>> doesn't allocate ids >= WORK_OFFQ_POOL_NONE?
>
> So, you did this separately on the next patch. Can you please roll
> that patch into this one with the suggested update?

Hello Tejun,
I will update it according to your suggestion.
Thanks!
Libin

>
> Thanks.
>
Re: [PATCH] mm/hotplug: Remove stop_machine() from try_offline_node()
Sorry, please ignore this email. I accidentally sent a wrong patch... -Toshi On Mon, 2013-09-09 at 18:21 -0600, Toshi Kani wrote: > lock_device_hotplug() serializes hotplug & online/offline operations. > The lock is held in common sysfs online/offline interfaces and ACPI > hotplug code paths. > > try_offline_node() off-lines a node if all memory sections and cpus > are removed on the node. It is called from acpi_processor_remove() > and acpi_memory_remove_memory()->remove_memory() paths, both of which > are in the ACPI hotplug code. > > try_offline_node() calls stop_machine() to stop all cpus while checking > all cpu status with the assumption that the caller is not protected from > CPU hotplug or CPU online/offline operations. However, the caller is > always serialized with lock_device_hotplug(). Also, the code needs to > be properly serialized with a lock, not by stopping all cpus at a random > place with stop_machine(). > > This patch removes the use of stop_machine() in try_offline_node() and > adds comments to try_offline_node() and remove_memory() that > lock_device_hotplug() is required. 
> > Signed-off-by: Toshi Kani > --- > mm/memory_hotplug.c | 31 ++- > 1 file changed, 22 insertions(+), 9 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index ca1dd3a..0b4b0f7 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1674,9 +1674,8 @@ static int is_memblock_offlined_cb(struct memory_block > *mem, void *arg) > return ret; > } > > -static int check_cpu_on_node(void *data) > +static int check_cpu_on_node(pg_data_t *pgdat) > { > - struct pglist_data *pgdat = data; > int cpu; > > for_each_present_cpu(cpu) { > @@ -1691,10 +1690,9 @@ static int check_cpu_on_node(void *data) > return 0; > } > > -static void unmap_cpu_on_node(void *data) > +static void unmap_cpu_on_node(pg_data_t *pgdat) > { > #ifdef CONFIG_ACPI_NUMA > - struct pglist_data *pgdat = data; > int cpu; > > for_each_possible_cpu(cpu) > @@ -1703,10 +1701,11 @@ static void unmap_cpu_on_node(void *data) > #endif > } > > -static int check_and_unmap_cpu_on_node(void *data) > +static int check_and_unmap_cpu_on_node(pg_data_t *pgdat) > { > - int ret = check_cpu_on_node(data); > + int ret; > > + ret = check_cpu_on_node(pgdat); > if (ret) > return ret; > > @@ -1715,11 +1714,18 @@ static int check_and_unmap_cpu_on_node(void *data) >* the cpu_to_node() now. >*/ > > - unmap_cpu_on_node(data); > + unmap_cpu_on_node(pgdat); > return 0; > } > > -/* offline the node if all memory sections of this node are removed */ > +/** > + * try_offline_node > + * > + * Offline a node if all memory sections and cpus of the node are removed. > + * > + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug > + * and online/offline operations before this call. 
> + */ > void try_offline_node(int nid) > { > pg_data_t *pgdat = NODE_DATA(nid); > @@ -1745,7 +1751,7 @@ void try_offline_node(int nid) > return; > } > > - if (stop_machine(check_and_unmap_cpu_on_node, pgdat, NULL)) > + if (check_and_unmap_cpu_on_node(pgdat)) > return; > > /* > @@ -1782,6 +1788,13 @@ void try_offline_node(int nid) > } > EXPORT_SYMBOL(try_offline_node); > > +/** > + * remove_memory > + * > + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug > + * and online/offline operations before this call, as required by > + * try_offline_node(). > + */ > void __ref remove_memory(int nid, u64 start, u64 size) > { > int ret; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm/hotplug: Remove stop_machine() from try_offline_node()
lock_device_hotplug() serializes hotplug & online/offline operations. The lock is held in common sysfs online/offline interfaces and ACPI hotplug code paths. try_offline_node() off-lines a node if all memory sections and cpus are removed on the node. It is called from acpi_processor_remove() and acpi_memory_remove_memory()->remove_memory() paths, both of which are in the ACPI hotplug code. try_offline_node() calls stop_machine() to stop all cpus while checking all cpu status with the assumption that the caller is not protected from CPU hotplug or CPU online/offline operations. However, the caller is always serialized with lock_device_hotplug(). Also, the code needs to be properly serialized with a lock, not by stopping all cpus at a random place with stop_machine(). This patch removes the use of stop_machine() in try_offline_node() and adds comments to try_offline_node() and remove_memory() that lock_device_hotplug() is required. Signed-off-by: Toshi Kani --- mm/memory_hotplug.c | 31 ++- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index ca1dd3a..0b4b0f7 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1674,9 +1674,8 @@ static int is_memblock_offlined_cb(struct memory_block *mem, void *arg) return ret; } -static int check_cpu_on_node(void *data) +static int check_cpu_on_node(pg_data_t *pgdat) { - struct pglist_data *pgdat = data; int cpu; for_each_present_cpu(cpu) { @@ -1691,10 +1690,9 @@ static int check_cpu_on_node(void *data) return 0; } -static void unmap_cpu_on_node(void *data) +static void unmap_cpu_on_node(pg_data_t *pgdat) { #ifdef CONFIG_ACPI_NUMA - struct pglist_data *pgdat = data; int cpu; for_each_possible_cpu(cpu) @@ -1703,10 +1701,11 @@ static void unmap_cpu_on_node(void *data) #endif } -static int check_and_unmap_cpu_on_node(void *data) +static int check_and_unmap_cpu_on_node(pg_data_t *pgdat) { - int ret = check_cpu_on_node(data); + int ret; + ret = check_cpu_on_node(pgdat); if 
(ret) return ret; @@ -1715,11 +1714,18 @@ static int check_and_unmap_cpu_on_node(void *data) * the cpu_to_node() now. */ - unmap_cpu_on_node(data); + unmap_cpu_on_node(pgdat); return 0; } -/* offline the node if all memory sections of this node are removed */ +/** + * try_offline_node + * + * Offline a node if all memory sections and cpus of the node are removed. + * + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug + * and online/offline operations before this call. + */ void try_offline_node(int nid) { pg_data_t *pgdat = NODE_DATA(nid); @@ -1745,7 +1751,7 @@ void try_offline_node(int nid) return; } - if (stop_machine(check_and_unmap_cpu_on_node, pgdat, NULL)) + if (check_and_unmap_cpu_on_node(pgdat)) return; /* @@ -1782,6 +1788,13 @@ void try_offline_node(int nid) } EXPORT_SYMBOL(try_offline_node); +/** + * remove_memory + * + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug + * and online/offline operations before this call, as required by + * try_offline_node(). + */ void __ref remove_memory(int nid, u64 start, u64 size) { int ret; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 4/4] kernel: add support for init_array constructors
On Mon, Sep 09, 2013 at 06:28:14PM +0200, Frantisek Hrbata wrote:
> I'm not sure if coexistence of .ctors and .init_array sections should result in
> denial of module, but I for sure know nothing about this :). Could you maybe
> provide one example of the "weird thing"?
>

They shouldn't exist unless placed there intentionally... I suspect a call_if_changed Makefile target to regenerate a header would solve this problem sufficiently for a given toolchain version.

A little exposition: .init_array and .ctors are laid out on top of each other, with an ordering that's a bit complicated... the sort of the ctor functions ends up being .ctor.x upwards towards 65535, and .init_array.x downwards from 65535 towards 0, with priority 65535-x, so that .init_array.32768 would be called before .ctor.32768. It's all a complete mess.

Perhaps if CONFIG_GCOV is on, we should enforce MODVERSIONS and make sure the GCC version doesn't change for the running kernel?

Maybe it would be sufficient to just detect what the toolchain supports and do that? I have a patch based on the configure.ac in gcc that does something like that, which would be trivial to use to generate a header based on gcc version.

> Anyway many thanks for taking time to look at this. Below is my attempt to
> implement the check you proposed.
>
> untested/uncompiled

regards, Kyle
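As a small userspace aside to the exposition above: whichever section the toolchain emits (.ctors or .init_array), what a C programmer controls is the constructor priority. The sketch below assumes GCC or a compatible compiler and uses its `constructor(priority)` function attribute, where a smaller number runs earlier and values up to 100 are reserved for the implementation. The function and variable names are made up for illustration, and the sketch says nothing about how the kernel module loader walks these sections.

```c
#include <assert.h>

/* Records the order in which the constructors below actually ran. */
static int ctor_order[3];
static int ctor_count;

static void record(int priority)
{
	assert(ctor_count < 3);
	ctor_order[ctor_count++] = priority;
}

/*
 * Deliberately listed out of order: the run order follows the priority
 * numbers, not the source order.
 */
__attribute__((constructor(1000))) static void init_last(void)
{
	record(1000);
}

__attribute__((constructor(200))) static void init_first(void)
{
	record(200);
}

__attribute__((constructor(300))) static void init_second(void)
{
	record(300);
}
```

By the time main() starts, ctor_order should hold the priorities in ascending order. Which section each entry lands in, and under what name, is the toolchain-dependent part discussed in the message above.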
[PATCH 2/2] ARM: msm: trout: fix uninit var warning
Fix the following warning when !CONFIG_MMC: arch/arm/mach-msm/board-trout.c: In function 'trout_init': arch/arm/mach-msm/board-trout.c:67:6: warning: unused variable 'rc' [-Wunused-variable] int rc; ^ Also, while we're here, rework explicit printk(KERN_CRIT..) to use pr_crit. Signed-off-by: Josh Cartwright --- arch/arm/mach-msm/board-trout.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm/mach-msm/board-trout.c b/arch/arm/mach-msm/board-trout.c index ccf6621..015d544 100644 --- a/arch/arm/mach-msm/board-trout.c +++ b/arch/arm/mach-msm/board-trout.c @@ -13,6 +13,7 @@ * GNU General Public License for more details. * */ +#define pr_fmt(fmt) "%s: " fmt, __func__ #include #include @@ -68,12 +69,11 @@ static void __init trout_init(void) platform_add_devices(devices, ARRAY_SIZE(devices)); -#ifdef CONFIG_MMC -rc = trout_init_mmc(system_rev); -if (rc) -printk(KERN_CRIT "%s: MMC init failure (%d)\n", __func__, rc); -#endif - + if (IS_ENABLED(CONFIG_MMC)) { + rc = trout_init_mmc(system_rev); + if (rc) + pr_crit("MMC init failure (%d)\n", rc); + } } static struct map_desc trout_io_desc[] __initdata = { -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
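The reason an `if (IS_ENABLED(...))` block avoids the unused-variable warning that the `#ifdef` version produced can be shown with a userspace sketch. The names here are hypothetical: `FEATURE_MMC` and `FEATURE_ENABLED()` are simplified stand-ins (the kernel's real `IS_ENABLED()` also copes with options that are not defined at all, which this macro does not), and `board_init_mmc()` is an invented callee.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a Kconfig option that is switched off. */
#define FEATURE_MMC 0
#define FEATURE_ENABLED(opt) (opt)

static int init_calls;

/* Invented callee; it must be declared in every configuration. */
static int board_init_mmc(void)
{
	init_calls++;
	return -5;
}

static int board_init(void)
{
	int rc = 0;

	/*
	 * The compiler parses this block in every configuration, so rc and
	 * board_init_mmc() always count as used. With the option off, the
	 * condition is a constant 0 and the branch is dropped as dead code.
	 */
	if (FEATURE_ENABLED(FEATURE_MMC)) {
		rc = board_init_mmc();
		if (rc)
			fprintf(stderr, "%s: MMC init failure (%d)\n", __func__, rc);
	}

	assert(rc <= 0);
	return rc;
}
```

The trade-off, in this sketch as in the patch, is that the callee has to be visible to the compiler even when the feature is off, which an `#ifdef` block does not require.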
[PATCH 1/2] ARM: msm: trout: fix 'pointer from integer' warnings
Fix several errors, all in the following form: arch/arm/mach-msm/board-trout-gpio.c: In function 'trout_gpio_irq_ack': arch/arm/mach-msm/board-trout-gpio.c:120:2: warning: passing argument 2 of '__raw_writeb' makes pointer from integer without a cast [enabled by default] writeb(mask, TROUT_CPLD_BASE + reg); ^ In file included from include/linux/io.h:22:0, from arch/arm/mach-msm/board-trout-gpio.c:16: arch/arm/include/asm/io.h:81:20: note: expected 'volatile void *' but argument is of type 'unsigned int' static inline void __raw_writeb(u8 val, volatile void __iomem *addr) ^ arch/arm/mach-msm/board-trout-gpio.c: In function 'trout_gpio_irq_mask': arch/arm/mach-msm/board-trout-gpio.c:135:2: warning: passing argument 2 of '__raw_writeb' makes pointer from integer without a cast [enabled by default] writeb(reg_val, TROUT_CPLD_BASE + reg); ^ Signed-off-by: Josh Cartwright --- arch/arm/mach-msm/board-trout-gpio.c | 8 arch/arm/mach-msm/board-trout.h | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/arm/mach-msm/board-trout-gpio.c b/arch/arm/mach-msm/board-trout-gpio.c index 87e1d01..af47465 100644 --- a/arch/arm/mach-msm/board-trout-gpio.c +++ b/arch/arm/mach-msm/board-trout-gpio.c @@ -115,7 +115,7 @@ static void trout_gpio_irq_ack(struct irq_data *d) { int bank = TROUT_INT_TO_BANK(d->irq); uint8_t mask = TROUT_INT_TO_MASK(d->irq); - int reg = TROUT_BANK_TO_STAT_REG(bank); + void __iomem *reg = TROUT_BANK_TO_STAT_REG(bank); /*printk(KERN_INFO "trout_gpio_irq_ack irq %d\n", d->irq);*/ writeb(mask, TROUT_CPLD_BASE + reg); } @@ -126,7 +126,7 @@ static void trout_gpio_irq_mask(struct irq_data *d) uint8_t reg_val; int bank = TROUT_INT_TO_BANK(d->irq); uint8_t mask = TROUT_INT_TO_MASK(d->irq); - int reg = TROUT_BANK_TO_MASK_REG(bank); + void __iomem *reg = TROUT_BANK_TO_MASK_REG(bank); local_irq_save(flags); reg_val = trout_int_mask[bank] |= mask; @@ -142,7 +142,7 @@ static void trout_gpio_irq_unmask(struct irq_data *d) uint8_t reg_val; int bank = 
TROUT_INT_TO_BANK(d->irq); uint8_t mask = TROUT_INT_TO_MASK(d->irq); - int reg = TROUT_BANK_TO_MASK_REG(bank); + void __iomem *reg = TROUT_BANK_TO_MASK_REG(bank); local_irq_save(flags); reg_val = trout_int_mask[bank] &= ~mask; @@ -172,7 +172,7 @@ static void trout_gpio_irq_handler(unsigned int irq, struct irq_desc *desc) int j, m; unsigned v; int bank; - int stat_reg; + void __iomem *stat_reg; int int_base = TROUT_INT_START; uint8_t int_mask; diff --git a/arch/arm/mach-msm/board-trout.h b/arch/arm/mach-msm/board-trout.h index b2379ed..82326ca 100644 --- a/arch/arm/mach-msm/board-trout.h +++ b/arch/arm/mach-msm/board-trout.h @@ -67,10 +67,10 @@ #define TROUT_GPIO_START (128) -#define TROUT_GPIO_INT_MASK0_REG(0x0c) -#define TROUT_GPIO_INT_STAT0_REG(0x0e) -#define TROUT_GPIO_INT_MASK1_REG(0x14) -#define TROUT_GPIO_INT_STAT1_REG(0x10) +#define TROUT_GPIO_INT_MASK0_REGIOMEM(0x0c) +#define TROUT_GPIO_INT_STAT0_REGIOMEM(0x0e) +#define TROUT_GPIO_INT_MASK1_REGIOMEM(0x14) +#define TROUT_GPIO_INT_STAT1_REGIOMEM(0x10) #define TROUT_GPIO_HAPTIC_PWM (28) #define TROUT_GPIO_PS_HOLD (25) -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] dmadevices: dma_sync_wait and dma_find_channel undefined
On Mon, Sep 9, 2013 at 4:51 PM, Jon Mason wrote:
> dma_sync_wait and dma_find_channel are declared regardless of whether
> CONFIG_DMA_ENGINE is enabled, but calling the function without
> CONFIG_DMA_ENGINE enabled results in "undefined reference" errors.
>
> To get around this, declare dma_sync_wait and dma_find_channel as inline
> functions if CONFIG_DMA_ENGINE is undefined.
>
> Signed-off-by: Jon Mason

Applied with a s/dmadevices/dmaengine/ on the subject.

--
Dan
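The general pattern behind that fix, a real declaration when the subsystem is built and a `static inline` stub otherwise, looks roughly like this. The sketch uses invented names (`HAVE_XFER_ENGINE`, `xfer_find_channel()`, `xfer_sync_wait()`) rather than the dmaengine API, and the stub return values are assumptions chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Invented types and names; this is not the dmaengine header. */
enum xfer_status { XFER_COMPLETE, XFER_ERROR };
struct xfer_chan;

#ifdef HAVE_XFER_ENGINE
/* Real implementations live in the subsystem when it is built. */
struct xfer_chan *xfer_find_channel(int type);
enum xfer_status xfer_sync_wait(struct xfer_chan *chan, int cookie);
#else
/* Stubs: callers still compile and link when the subsystem is absent. */
static inline struct xfer_chan *xfer_find_channel(int type)
{
	(void)type;
	return NULL;	/* no engine, so no channel */
}

static inline enum xfer_status xfer_sync_wait(struct xfer_chan *chan, int cookie)
{
	(void)chan;
	(void)cookie;
	return XFER_COMPLETE;	/* nothing was queued, nothing to wait for */
}
#endif

/*
 * Caller code stays free of #ifdef. Returns 1 if the copy was offloaded,
 * 0 if the caller should copy with the CPU, -1 on a transfer error.
 */
static int copy_with_offload(void)
{
	struct xfer_chan *chan = xfer_find_channel(0);

	if (!chan)
		return 0;
	assert(chan != NULL);
	return xfer_sync_wait(chan, 1) == XFER_COMPLETE ? 1 : -1;
}
```

Without the stubs, the declarations alone let the caller compile but leave the linker with nothing to resolve, which is the "undefined reference" failure the quoted patch description mentions.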
Re: [GIT PULL 0/3] ARM: SoC: Second round of changes for v3.12
Linus Torvalds writes: > On Mon, Sep 9, 2013 at 3:42 PM, Kevin Hilman wrote: >> >> The main thing of note (or of potential annoyance factor) here is the >> handful of conflicts in PULL 2/3 coming from platform changes >> conflicting with driver changes going in to the V4L tree. I've listed >> them in detail in that pull request, and we will work with the >> platform maintainer on the workflow to avoid this in the future. > > Ok. I still really despise the absolute incredible sh*t that is > non-discoverable buses, and I hope that ARM SoC hardware designers all > die in some incredibly painful accident. DT only does so much. In case it helps you feel slightly better... in what some might call a painful accident (though probably not the kind you'd like to see), most of the designers I used to work with (at TI) were laid off in the last year. > So if you see any, send them my love, and possibly puncture the > brake-lines on their car and put a little surprise in their coffee, > ok? Got it. I'll be sure to send your love. >> For future reference, when it comes to these conflicts, do you want to >> see a summary of the suggested resolutions, a published branch with >> the resolutions, both or neither? Just curious. > > I'll basically always end up re-doing the conflict resolution by hand > anyway unless it's just *incredibly* messy (and I think that has > happened all of once or twice), so anything you send me ends up being > just confirmation. > > In this case, for example, I didn't end up looking at your pre-merged > stuff, because the summaries were enough for me to just say "ok, that > confirms my resolution". In other cases, people don't write detailed > summaries, and I end up confirming my resolution by just doing a > separate test-merge against their pre-merged branch and comparing. > > And in most cases, the resolution is trivial enough that I don't > bother with either. > > And in *all* cases I appreciate it when people do the preparation. 
It > hopefully also makes submaintainers themselves more aware of > development flow conflicts and more aware of possible problem issues > (same reason I prefer doing all the resolutions by hand myself), so I > suspect all of this is healthy even if I don't end up using it. OK, thanks for the feedback. > Final note: putting the conflict resolution explanation in the tag > message is unnecessary, since it's not really worth it after-the-fact > - so I'll just edit it away. It's not a problem, but in general I'd > suggest the tag message just contain the "here's the highlights", and > you do the conflict resolution notes just in the email. But I suspect > you may find the use of the tags a convenient way to jot down the > resolution for then sending the email later, and it's not like it > hurts me to edit it away afterwards, so not a big deal. Whatever works > for you. Noted, thanks. Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL 0/3] ARM: SoC: Second round of changes for v3.12
On Mon, 9 Sep 2013 16:49:23 -0700 Linus Torvalds wrote: > On Mon, Sep 9, 2013 at 3:42 PM, Kevin Hilman wrote: > > > > The main thing of note (or of potential annoyance factor) here is the > > handful of conflicts in PULL 2/3 coming from platform changes > > conflicting with driver changes going in to the V4L tree. I've listed > > them in detail in that pull request, and we will work with the > > platform maintainer on the workflow to avoid this in the future. > > Ok. I still really despise the absolute incredible sh*t that is > non-discoverable buses, and I hope that ARM SoC hardware designers all > die in some incredibly painful accident. DT only does so much. > > So if you see any, send them my love, and possibly puncture the > brake-lines on their car and put a little surprise in their coffee, > ok? As you wish.