date:20180419

Re: [PATCH] net: phy: TLK10X initial driver submission

2018-04-19 Thread Måns Andersson

On Thu, Apr 19, 2018 at 02:09:02PM +0200, Andrew Lunn wrote:
> On Thu, Apr 19, 2018 at 10:28:16AM +0200, Måns Andersson wrote:
> > From: Mans Andersson 
> > 
> > Add support for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
> > 
> > In addition the TLK10X needs to be removed from DP83848 driver as the
> > power back off support is added here for this device.
> > 
> > Datasheet:
> > http://www.ti.com/lit/gpn/tlk106
> > ---
> >  .../devicetree/bindings/net/ti,tlk10x.txt  |  27 +++
> >  drivers/net/phy/Kconfig|   5 +
> >  drivers/net/phy/Makefile   |   1 +
> >  drivers/net/phy/dp83848.c  |   3 -
> >  drivers/net/phy/tlk10x.c   | 209 +
> >  5 files changed, 242 insertions(+), 3 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/net/ti,tlk10x.txt
> >  create mode 100644 drivers/net/phy/tlk10x.c
> > 
> > diff --git a/Documentation/devicetree/bindings/net/ti,tlk10x.txt b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> > new file mode 100644
> > index 000..371d0d7
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> > @@ -0,0 +1,27 @@
> > +* Texas Instruments - TLK105 / TLK106 ethernet PHYs
> > +
> > +Required properties:
> > +   - reg - The ID number for the phy, usually a small integer
> > +
> > +Optional properties:
> > +   - ti,power-back-off - Power Back Off Level
> > +   Please refer to data sheet chapter 8.6 and TI Application
> > +   Note SLLA3228
> > +   0 - Normal Operation
> > +   1 - Level 1 (up to 140m cable between TLK link partners)
> > +   2 - Level 2 (up to 100m cable between TLK link partners)
> > +   3 - Level 3 (up to 80m cable between TLK link partners)
> 
> Hi Måns
> 
> Device tree is all about board properties. In most cases, power back
> off is not a board property, since it depends on the cable length
> and the peer board. If however, your board has two PHYs back to back,
> say to connect to an Ethernet switch, that would be a valid board
> property.
> 
> How are you using this?
> 
> I know of others who would like such a configuration. Marvell PHYs can
> do something similar. I've always suggested adding a PHY tunable. Pass
> the cable length in meters and let the PHY driver pick the nearest it
> can do, rounding up. The Marvell PHYs also support measuring the cable
> length as part of the cable diagnostics. So it would be good to
> reserve a configuration value to mean 'auto' - measure the cable and
> then pick the best power back off. Quickly scanning the data sheet, I
> see that this PHY also has the ability to measure the cable length.
>

Hi Andrew,

Thanks for your comments, highly appreciated!

I'm using this to lock down the PHY to the IEEE 802.3 100m standard cable
length, as opposed to the extended 150m which the PHY supports in its
default operation (see pg. 2 of the data sheet). The reason why I need this
is that the board has too high EMC emissions when running with the default
operation. For me the setting is therefore used as a board property, i.e.
it's not something that will be changed during operation.
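
Andrew's suggestion quoted above, passing a cable length in meters and
letting the driver round up to the nearest level it supports, could look
roughly like the sketch below. It is only an illustration in plain
userspace C, not driver code: the function name, the CABLE_LEN_AUTO
sentinel and the PWRBO_NEEDS_MEASUREMENT return value are hypothetical,
and the length limits are simply the ones listed in the proposed binding
text.

```c
#include <assert.h>

/* Hypothetical sentinel meaning "measure the cable, then choose". */
#define CABLE_LEN_AUTO 0u
/* Hypothetical return value telling the caller to run a measurement first. */
#define PWRBO_NEEDS_MEASUREMENT (-1)

/*
 * Map a cable length in metres to a power back-off level, rounding the
 * length up to the next limit from the proposed binding:
 *   level 3: up to 80 m, level 2: up to 100 m, level 1: up to 140 m,
 *   level 0: normal operation (anything longer).
 */
static int pwrbo_level_for_length(unsigned int metres)
{
	if (metres == CABLE_LEN_AUTO)
		return PWRBO_NEEDS_MEASUREMENT;
	if (metres <= 80)
		return 3;
	if (metres <= 100)
		return 2;
	if (metres <= 140)
		return 1;
	return 0;
}

/* A few spot checks of the rounding behaviour. */
static void pwrbo_examples(void)
{
	assert(pwrbo_level_for_length(60) == 3);
	assert(pwrbo_level_for_length(90) == 2);	/* rounds up to the 100 m level */
	assert(pwrbo_level_for_length(150) == 0);
}
```

With a mapping like this, the 100 m lock-down described above would be
requested as a length of 100 and would resolve to level 2.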
 
> > +static int tlk10x_read(struct phy_device *phydev, int reg)
> > +{
> > +   if (reg & ~0x1f) {
> > +   /* Extended register */
> > +   phy_write(phydev, TLK10X_REGCR, 0x001F);
> > +   phy_write(phydev, TLK10X_ADDAR, reg);
> > +   phy_write(phydev, TLK10X_REGCR, 0x401F);
> > +   reg = TLK10X_ADDAR;
> > +   }
> > +
> > +   return phy_read(phydev, reg);
> > +}
> > +
> > +static int tlk10x_write(struct phy_device *phydev, int reg, int val)
> > +{
> > +   if (reg & ~0x1f) {
> > +   /* Extended register */
> > +   phy_write(phydev, TLK10X_REGCR, 0x001F);
> > +   phy_write(phydev, TLK10X_ADDAR, reg);
> > +   phy_write(phydev, TLK10X_REGCR, 0x401F);
> > +   reg = TLK10X_ADDAR;
> > +   }
> > +
> > +   return phy_write(phydev, reg, val);
> > +}
> 
> This looks to be phy_read_mmd() and phy_write_mmd(). If so, please use
> them, they get the locking correct.
>

Yes, that's correct, will fix that.
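
For readers unfamiliar with the indirect access sequence in the quoted
helpers, here is a small userspace model of it. Everything in it is an
assumption made for illustration: the fake_phy structure, the register
numbers chosen for REGCR and ADDAR, and the size of the extended space.
It only mimics the three-write sequence from the patch (select address
function, write address, select data function) followed by the data
access through ADDAR.

```c
#include <assert.h>
#include <stdint.h>

#define REGCR       0x0d	/* assumed register number */
#define ADDAR       0x0e	/* assumed register number */
#define REGCR_ADDR  0x001f	/* value used in the patch before writing the address */
#define REGCR_DATA  0x401f	/* value used in the patch before touching the data */
#define EXT_REGS    512		/* arbitrary size for this model */

struct fake_phy {
	uint16_t direct[32];		/* registers 0..31 */
	uint16_t extended[EXT_REGS];	/* extended register space */
	uint16_t addr;			/* address latched through ADDAR */
	int data_mode;			/* set once REGCR selects the data function */
};

static uint16_t bus_read(struct fake_phy *p, unsigned int reg)
{
	assert(p);
	if (reg == ADDAR && p->data_mode)
		return p->extended[p->addr % EXT_REGS];
	return p->direct[reg & 0x1f];
}

static void bus_write(struct fake_phy *p, unsigned int reg, uint16_t val)
{
	assert(p);
	if (reg == REGCR) {
		p->data_mode = (val == REGCR_DATA);
	} else if (reg == ADDAR) {
		if (p->data_mode)
			p->extended[p->addr % EXT_REGS] = val;
		else
			p->addr = val;
	} else {
		p->direct[reg & 0x1f] = val;
	}
}

/* The three writes from the patch that point ADDAR at an extended register. */
static void select_extended(struct fake_phy *p, unsigned int reg)
{
	bus_write(p, REGCR, REGCR_ADDR);
	bus_write(p, ADDAR, (uint16_t)reg);
	bus_write(p, REGCR, REGCR_DATA);
}

static uint16_t ext_read(struct fake_phy *p, unsigned int reg)
{
	if (reg & ~0x1fu) {
		select_extended(p, reg);
		reg = ADDAR;
	}
	return bus_read(p, reg);
}

static void ext_write(struct fake_phy *p, unsigned int reg, uint16_t val)
{
	if (reg & ~0x1fu) {
		select_extended(p, reg);
		reg = ADDAR;
	}
	bus_write(p, reg, val);
}
```

The model also shows why the locking remark matters: one extended access
is four bus transactions, so two callers interleaving them would latch
each other's address. Nothing in this sketch protects against that.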

> 
> > +#ifdef CONFIG_OF_MDIO
> > +static int tlk10x_of_init(struct phy_device *phydev)
> > +{
> > +   struct tlk10x_private *tlk10x = phydev->priv;
> > +   struct device *dev = &phydev->mdio.dev;
> > +   struct device_node *of_node = dev->of_node;
> > +   int ret;
> > +
> > +   if (!of_node)
> > +   return 0;
> > +
> > +   ret = of_property_read_u32(of_node, "ti,power-back-off",
> > +  &tlk10x->pwrbo_level);
> > +   if (ret) {
> > +   dev_err(dev, "missing ti,power-back-off property");
> > +   tlk10x->pwrbo_level = 0;
> > +   }
> 
> If we decide to accept this, you should do range checking, and return
> -EINVAL if the value is out of range.

Ok, will fix that.
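
A minimal sketch of the requested range check, written as standalone C
rather than as the actual driver change. The helper name, its
"present" flag and PWRBO_LEVEL_MAX are hypothetical; the only facts taken
from the thread are that valid levels run from 0 to 3, that the property
is listed as optional, and that out-of-range values should yield -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PWRBO_LEVEL_MAX 3u	/* highest level listed in the proposed binding */

/*
 * Turn an optional property value into a power back-off level.
 * present == 0 models "property not in the device tree": fall back to
 * level 0 (normal operation) without treating it as an error.
 */
static int pwrbo_from_property(int present, uint32_t value, uint32_t *level)
{
	assert(level);

	if (!present) {
		*level = 0;
		return 0;
	}
	if (value > PWRBO_LEVEL_MAX)
		return -EINVAL;

	*level = value;
	return 0;
}
```

Treating an absent property as level 0 without logging an error is a
design choice of this sketch, made because the binding text lists the
property as optional.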

> 
> > +static int tlk10x_config_init(struct phy_d

RE: [PATCH V2 1/2] clk: imx6sx: add missing lvds2 clock to the clock tree

2018-04-19 Thread Anson Huang



Anson Huang
Best Regards!


> -Original Message-
> From: Shawn Guo [mailto:shawn...@kernel.org]
> Sent: Thursday, April 19, 2018 10:57 PM
> To: Anson Huang 
> Cc: ker...@pengutronix.de; Fabio Estevam ;
> robh...@kernel.org; mark.rutl...@arm.com; li...@armlinux.org.uk;
> mturque...@baylibre.com; sb...@kernel.org; S.j. Wang
> ; dl-linux-imx ;
> linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
> linux-kernel@vger.kernel.org; linux-...@vger.kernel.org
> Subject: Re: [PATCH V2 1/2] clk: imx6sx: add missing lvds2 clock to the clock 
> tree
> 
> On Mon, Mar 19, 2018 at 10:30:44AM +0800, Anson Huang wrote:
> > i.MX6SX has lvds2 (analog clock2), an I/O clock like lvds1.
> > And this lvds2, along with lvds1, can be used to provide external
> > clock source to the internal pll, such as pll4_audio and pll5_video.
> >
> > This patch mainly adds the lvds2 to the clock tree and fixes its
> > relationship with pll accordingly.
> >
> > Signed-off-by: Anson Huang 
> > Signed-off-by: Shengjiu Wang 
> > ---
> >  drivers/clk/imx/clk-imx6sx.c | 8 ++--
> >  include/dt-bindings/clock/imx6sx-clock.h | 6 +-
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/clk/imx/clk-imx6sx.c
> > b/drivers/clk/imx/clk-imx6sx.c index e6d389e..478ad0d 100644
> > --- a/drivers/clk/imx/clk-imx6sx.c
> > +++ b/drivers/clk/imx/clk-imx6sx.c
> > @@ -80,7 +80,7 @@ static const char *lvds_sels[]= {
> >     "arm", "pll1_sys", "dummy", "dummy", "dummy", "dummy", "dummy", "pll5_video_div",
> >     "dummy", "dummy", "pcie_ref_125m", "dummy", "usbphy1", "usbphy2",
> >  };
> > -static const char *pll_bypass_src_sels[] = { "osc", "lvds1_in", };
> > +static const char *pll_bypass_src_sels[] = { "osc", "lvds1_in", "lvds2_in", "dummy", };
> >  static const char *pll1_bypass_sels[] = { "pll1", "pll1_bypass_src", };
> >  static const char *pll2_bypass_sels[] = { "pll2", "pll2_bypass_src", };
> >  static const char *pll3_bypass_sels[] = { "pll3", "pll3_bypass_src", };
> > @@ -158,8 +158,9 @@ static void __init imx6sx_clocks_init(struct device_node *ccm_node)
> > clks[IMX6SX_CLK_IPP_DI0] = of_clk_get_by_name(ccm_node, "ipp_di0");
> > clks[IMX6SX_CLK_IPP_DI1] = of_clk_get_by_name(ccm_node, "ipp_di1");
> >
> > -   /* Clock source from external clock via CLK1 PAD */
> > +   /* Clock source from external clock via CLK1/2 PAD */
> > clks[IMX6SX_CLK_ANACLK1] = imx_obtain_fixed_clock("anaclk1", 0);
> > +   clks[IMX6SX_CLK_ANACLK2] = imx_obtain_fixed_clock("anaclk2", 0);
> 
> It seems to me that anaclk clocks are similar to ipp_di, and could be handled
> in the same way as ipp_di clocks.  If that's the case, I would suggest we do
> the following.
> 
> 1. Kill clocks container node by dropping 'reg' property and naming
>clock nodes uniquely.  This is not strictly related to what we try
>to do here, but just to address DT maintainers' concern on 'clocks'
>container node.
> 
>   clk_ckil: clock-ckil {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <32768>;
>   clock-output-names = "ckil";
>   };
> 
>   clk_osc: clock-osc {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <2400>;
>   clock-output-names = "osc";
>   };
> 
>   clk_ipp_di0: clock-ipp-di0 {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <0>;
>   clock-output-names = "ipp_di0";
>   };
> 
>   clk_ipp_di1: clock-ipp-di1 {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <0>;
>   clock-output-names = "ipp_di1";
>   };
> 
>   clks: ccm@20c4000 {
>   compatible = "fsl,imx6sx-ccm";
>   reg = <0x020c4000 0x4000>;
>   interrupts = ,
>;
>   #clock-cells = <1>;
>   clocks = <&clk_ckil>, <&clk_osc>, <&clk_ipp_di0>, 
> <&clk_ipp_di1>;
>   clock-names = "ckil", "osc", "ipp_di0", "ipp_di1";
>   };
> 
> 2. Patch clock driver to have anaclk1 and anaclk2 handled in the same
>way as ipp_di clocks.
> 
>   clks[IMX6SX_CLK_ANACLK1] = of_clk_get_by_name(ccm_node, "anaclk1");
>   clks[IMX6SX_CLK_ANACLK2] = of_clk_get_by_name(ccm_node, "anaclk2");
> 
> 3. Add anaclk1 and anaclk2 with clock-frequency being 0 by default, just
>like ipp_di clocks.
> 
>   clk_anaclk1: clock-anaclk1 {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <0>;
>   clock-output-names = "anaclk1";
>   };
> 
>   clk_anaclk2: clock-anaclk2 {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
>   clock-frequency = <0>;
>   clock-output-names = "anaclk2";
>   }

Re: [PATCH v2] ARM64: dts: meson-axg: enable the eMMC controller

2018-04-19 Thread Ulf Hansson

On 19 April 2018 at 19:58, Kevin Hilman  wrote:
> Yixun Lan  writes:
>
>> From: Nan Li 
>>
>> The IP of eMMC controller in AXG is similar to Meson-GX series.
>> Here we add the initial support of the HS200 mode with
>> clock running at 166MHz (to be safe), since we found some eMMC chips
>> fail to run at 200MHz due to tuning phase error.
>>
>> Signed-off-by: Nan Li 
>> Signed-off-by: Yixun Lan 
>
> Applied to v4.18/dt64
>
>> ---
>> Hi Kevin
>>   Please note this patch actually depend on the eMMC driver here [0].
>>   Still a few problems to solve, to improve the tuning phase driver to make
>> the clock running at 200MHz, and to further support the HS400 mode.
>> Anyway, this patch itself is quite independent.
>
> The driver changes are queued for v4.18 also.  Good!

Right, may I consider that as an ack? :-)

Kind regards
Uffe

>
> Kevin
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

[GIT PULL] MMC fixes for v.4.17-rc2

2018-04-19 Thread Ulf Hansson

Hi Linus,

Here's a PR with a couple of MMC fixes intended for v4.17-rc2. Details about
the highlights are as usual found in the signed tag.

Please pull this in!

Kind regards
Ulf Hansson


The following changes since commit fc167daff581c01ebce8695e9618231cae3561a1:

  mmc: tmio: Fix error handling when issuing CMD23 (2018-04-04 12:21:27 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git tags/mmc-v4.17-3

for you to fetch changes up to 0cbc94daa55441c21999e96a07061952d873dcb7:

  mmc: renesas_sdhi_internal_dmac: limit DMA RX for old SoCs (2018-04-19 14:57:17 +0200)


MMC host:
 - sdhci-pci: Fixup tuning for AMD for eMMC HS200 mode
 - renesas_sdhi_internal_dmac: Avoid data corruption by limiting DMA RX


Daniel Kurtz (1):
  mmc: sdhci-pci: Only do AMD tuning for HS200

Wolfram Sang (1):
  mmc: renesas_sdhi_internal_dmac: limit DMA RX for old SoCs

 drivers/mmc/host/renesas_sdhi_internal_dmac.c | 39 ++-
 drivers/mmc/host/sdhci-pci-core.c | 25 +++--
 2 files changed, 56 insertions(+), 8 deletions(-)

Re: [RFC/RFT patch 0/7] timekeeping: Unify clock MONOTONIC and clock BOOTTIME

2018-04-19 Thread David Herrmann

Hi

On Fri, Apr 20, 2018 at 7:44 AM, Sergey Senozhatsky
 wrote:
> On (04/20/18 06:37), David Herrmann wrote:
>>
>> I get lots of timer-errors on Arch-Linux booting current master, after
>> a suspend/resume cycle. Just a selection of errors I see on resume:
>
> Hello David,
> Any chance you can revert the patches in question and test? I'm running
> ARCH (4.17.0-rc1-dbg-00042-gaa03ddd9c434) and suspend/resume cycle does
> not trigger any errors. Except for this one
>
> kernel: do_IRQ: 0.55 No irq handler for vector

I can easily reproduce it by sleeping for >5min, so the systemd
watchdog timers are triggered. The patches don't revert cleanly, so I
didn't look into booting without them, yet. I will try just linking
the monotonic clock to the monotonic_active clock later.

Also, doesn't this hunk in 72199320d49d need a 'break;':

diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
index b258bee13b02..6259dbc0191a 100644
--- a/kernel/time/posix-stubs.c
+++ b/kernel/time/posix-stubs.c
@@ -73,6 +73,8 @@ int do_clock_gettime(clockid_t which_clock, struct timespec64 *tp)
case CLOCK_BOOTTIME:
get_monotonic_boottime64(tp);
break;
+   case CLOCK_MONOTONIC_ACTIVE:
+   ktime_get_active_ts64(tp);
default:
return -EINVAL;
}
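
The effect of the missing 'break;' can be reproduced outside the kernel.
The sketch below is a hypothetical stand-in (the enum, the function names
and the dummy values are invented for illustration); it only mirrors the
shape of the quoted switch. Without the break, the new case fills in the
result and then falls into 'default', so the caller gets -EINVAL even
though the time was read.

```c
#include <assert.h>
#include <errno.h>

enum demo_clock { DEMO_BOOTTIME, DEMO_MONOTONIC_ACTIVE, DEMO_UNKNOWN };

/* Same shape as the quoted hunk: the new case lacks a break. */
static int gettime_missing_break(enum demo_clock which, long *out)
{
	assert(out);
	switch (which) {
	case DEMO_BOOTTIME:
		*out = 1;
		break;
	case DEMO_MONOTONIC_ACTIVE:
		*out = 2;
		/* fall through */
	default:
		return -EINVAL;
	}
	return 0;
}

/* With the break added, the new case returns success. */
static int gettime_with_break(enum demo_clock which, long *out)
{
	assert(out);
	switch (which) {
	case DEMO_BOOTTIME:
		*out = 1;
		break;
	case DEMO_MONOTONIC_ACTIVE:
		*out = 2;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}
```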

Linux 4.9.95

2018-04-19 Thread Greg KH

I'm announcing the release of the 4.9.95 kernel.

All users of the 4.9 kernel series must upgrade.

The updated 4.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.9.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|2 
 arch/arm/include/asm/kvm_host.h |6 
 arch/arm/include/asm/kvm_mmu.h  |   10 
 arch/arm/include/asm/kvm_psci.h |   27 
 arch/arm/kvm/arm.c  |   11 
 arch/arm/kvm/handle_exit.c  |4 
 arch/arm/kvm/psci.c |  143 +
 arch/arm64/Kconfig  |   17 
 arch/arm64/crypto/sha256-core.S | 2061 
 arch/arm64/crypto/sha512-core.S | 1085 ++
 arch/arm64/include/asm/assembler.h  |   19 
 arch/arm64/include/asm/barrier.h|   23 
 arch/arm64/include/asm/cpucaps.h|3 
 arch/arm64/include/asm/cputype.h|6 
 arch/arm64/include/asm/futex.h  |9 
 arch/arm64/include/asm/kvm_host.h   |5 
 arch/arm64/include/asm/kvm_mmu.h|   38 
 arch/arm64/include/asm/kvm_psci.h   |   27 
 arch/arm64/include/asm/memory.h |   15 
 arch/arm64/include/asm/mmu.h|   39 
 arch/arm64/include/asm/processor.h  |   24 
 arch/arm64/include/asm/sysreg.h |2 
 arch/arm64/include/asm/uaccess.h|  153 -
 arch/arm64/kernel/Makefile  |4 
 arch/arm64/kernel/arm64ksyms.c  |4 
 arch/arm64/kernel/bpi.S |   75 
 arch/arm64/kernel/cpu_errata.c  |  189 +
 arch/arm64/kernel/cpufeature.c  |   10 
 arch/arm64/kernel/entry.S   |   25 
 arch/arm64/kvm/handle_exit.c|   16 
 arch/arm64/kvm/hyp/hyp-entry.S  |   20 
 arch/arm64/kvm/hyp/switch.c |5 
 arch/arm64/lib/clear_user.S |6 
 arch/arm64/lib/copy_in_user.S   |4 
 arch/arm64/mm/context.c |   12 
 arch/arm64/mm/fault.c   |   34 
 arch/arm64/mm/proc.S|7 
 arch/parisc/kernel/drivers.c|4 
 arch/s390/kernel/ipl.c  |1 
 drivers/acpi/nfit/core.c|   22 
 drivers/block/loop.c|   12 
 drivers/firmware/psci.c |   57 
 drivers/gpu/drm/radeon/radeon_object.c  |3 
 drivers/hv/channel_mgmt.c   |2 
 drivers/hwmon/ina2xx.c  |3 
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c   |4 
 drivers/net/phy/micrel.c|   42 
 drivers/net/slip/slhc.c |5 
 drivers/net/usb/cdc_ether.c |6 
 drivers/net/usb/lan78xx.c   |3 
 drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c  |2 
 drivers/s390/cio/qdio_main.c|   42 
 drivers/vhost/vhost.c   |8 
 fs/namei.c  |3 
 include/kvm/arm_psci.h  |   51 
 include/linux/arm-smccc.h   |  165 +
 include/linux/mm.h  |4 
 include/linux/psci.h|   14 
 include/net/bluetooth/hci_core.h|2 
 include/net/slhc_vj.h   |1 
 include/uapi/linux/psci.h   |3 
 kernel/events/core.c|6 
 net/bluetooth/hci_conn.c|   29 
 net/bluetooth/hci_event.c   |   15 
 net/bluetooth/l2cap_core.c  |2 
 net/rds/send.c  |   15 
 net/sunrpc/auth_gss/gss_krb5_crypto.c   |3 
 tools/perf/tests/code-reading.c |   20 
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c |   64 
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h |2 
 tools/perf/util/intel-pt.c  |   37 
 71 files changed, 4442 insertions(+), 350 deletions(-)

Adrian Hunter (4):
  perf intel-pt: Fix overlap detection to identify consecutive buffers 
correctly
  perf intel-pt: Fix sync_switch
  perf intel-pt:

Re: [PATCH 4.9 00/66] 4.9.95-stable review

2018-04-19 Thread Greg Kroah-Hartman

On Thu, Apr 19, 2018 at 03:04:05PM -0500, Dan Rue wrote:
> On Thu, Apr 19, 2018 at 04:03:05PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Apr 19, 2018 at 04:42:56PM +0530, Naresh Kamboju wrote:
> > > >
> > > > Can you try 'git bisect'?  I'll hold off on releasing 4.9.y until this
> > > > gets figured out.
> > > 
> > > > After reverting this patch, network started working on arm32 x15 device.
> > > d7ba3c00047d ("net: phy: micrel: Restore led_mode and clk_sel on resume")
> > 
> > Thanks for letting me know, I've now reverted that commit.
> 
> Alright here we go.
> 
> Results from Linaro’s test farm.
> No regressions on arm64, arm and x86_64.

Great, thanks for testing and letting me know.

greg k-h

Re: [PATCH] time: tick-sched: use bool for tick_stopped

2018-04-19 Thread yuankuiz

On 2018-04-20 09:47 AM, yuank...@codeaurora.org wrote:

On 2018-04-11 07:20 AM, yuank...@codeaurora.org wrote:

++
On 2018-04-11 07:09 AM, yuank...@codeaurora.org wrote:

++

On 2018-04-10 10:49 PM, yuank...@codeaurora.org wrote:

Typo...

On 2018-04-10 10:08 PM, yuank...@codeaurora.org wrote:

On 2018-04-10 07:06 PM, Thomas Gleixner wrote:

On Tue, 10 Apr 2018, yuank...@codeaurora.org wrote:

On 2018-04-10 05:10 PM, Thomas Gleixner wrote:
> On Tue, 10 Apr 2018, yuank...@codeaurora.org wrote:
> > On 2018-04-10 04:00 PM, Rafael J. Wysocki wrote:
> > > On Tue, Apr 10, 2018 at 9:33 AM,   wrote:
> > > > From: John Zhao 
> > > >
> > > > Variable tick_stopped returned by tick_nohz_tick_stopped
> > > > can have only true / false values. Since the return type
> > > > of the tick_nohz_tick_stopped is also bool, variable
> > > > tick_stopped nice to have data type as bool in place of unsigned int.
> > > > Moreover, the executed instructions cost could be minimal
> > > > > without potential data type conversion.
> > > >
> > > > Signed-off-by: John Zhao 
> > > > ---
> > > >  kernel/time/tick-sched.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
> > > > index 6de959a..4d34309 100644
> > > > --- a/kernel/time/tick-sched.h
> > > > +++ b/kernel/time/tick-sched.h
> > > > @@ -48,8 +48,8 @@ struct tick_sched {
> > > > unsigned long   check_clocks;
> > > > enum tick_nohz_mode nohz_mode;
> > > >
> > > > > +   bool            tick_stopped    : 1;
> > > > >     unsigned int    inidle          : 1;
> > > > > -   unsigned int    tick_stopped    : 1;
> > > > >     unsigned int    idle_active     : 1;
> > > > >     unsigned int    do_timer_last   : 1;
> > > > >     unsigned int    got_idle_tick   : 1;
> > >
> > > I don't think this is a good idea at all.
> > >
> > > Please see https://lkml.org/lkml/2017/11/21/384 for example.
> > [ZJ] Thanks for this sharing. Looks like, this patch fall into the case of
> > "Maybe".
>
> This patch falls into the case 'pointless' because it adds extra storage
[ZJ] 1 bit vs 1 bit. no more.

Groan. No. Care to look at the data structure? You create a new storage,

[ZJ] Say, {unsigned int, unsigned int, unsigned int, unsigned int, unsigned int}
becomes {bool, unsigned int, unsigned int, unsigned int, unsigned int}.
As specified by rule No. 10 in section 6.7.2.1 of C99 TC2:
"If enough space remains, a bit-field that immediately follows another
bit-field in a structure shall be packed into adjacent bits of the same unit."
What is the new storage so far?

[ZJ] Further prototyping was done with gcc for both x86_64 and armv8-a:
unsigned int and bool share the same 1 byte without any additional
storage. Reports of any different behaviour are welcome.

[ZJ] Typo.. change 4 bytes above to 1 byte actually.

which is incidentally merged into the other bitfield by the compiler at a
different bit position, but there is no guarantee that a compiler does
that. It's free to use distinct storage for that bool based bit.

[ZJ] Per rule No. 10 in section 6.7.2.1 of C99 TC2:
"If insufficient space remains, whether a bit-field that does not fit is
put into the next unit or overlaps adjacent units is implementation-defined."

So the implementation does not mind which type will be stored, if any.

>> > for no benefit at all.
[ZJ] tick_stopped is returned by tick_nohz_tick_stopped(), which is bool.
The benefit is that no potential type conversion has to be considered.

A bit stays a bit. 'bool foo : 1;' or 'unsigned int foo : 1' has to be
evaluated as a bit. So there is a type conversion from BIT to bool required
because BIT != bool.

[ZJ] Per rule No. 9 in section 6.7.2.1 of C99 TC2:
"If the value 0 or 1 is stored into a nonzero-width bit-field of type
_Bool, the value of the bit-field shall compare equal to the value stored."

Obviously, it is nothing related to type conversion actually.

By chance the evaluation can be done by evaluating the byte in which the
bit is placed just because the compiler knows that the remaining bits are
not used. There is no guarantee that this is done, it happens to be true
for a particular compiler.
[ZJ] Actually, compilers such as GCC owe that kind of guarantee to the ABI.
[ZJ] According to its manual, "-mone-byte-bool" could be used by
alpha-linux-gcc to override the default bool size to 1 byte, even for
Darwin / PowerPC.

But that does not make it any more interesting. It just makes the code
harder to read and eventually leads to bigger storage.

[ZJ] To profile the benefit, it is given as:
number of instructions of function tick_nohz_tick_stopped():
[ZJ] Here, I used is not the
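
The layout question argued in this thread can be checked on any given
compiler with a few lines of C. The sketch below is an assumption-laden
illustration, not the kernel structure: the two structs only imitate the
five one-bit fields from the patch, and the helper names are invented.
Whether the two sizes match is implementation-defined, so no particular
result is claimed here. The only behaviour the C standard guarantees is
the one from the rule quoted above: a 0 or 1 stored in a bool bit-field
compares equal when read back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All flags declared as unsigned int bit-fields (layout before the patch). */
struct flags_uint {
	unsigned int inidle        : 1;
	unsigned int tick_stopped  : 1;
	unsigned int idle_active   : 1;
	unsigned int do_timer_last : 1;
	unsigned int got_idle_tick : 1;
};

/* tick_stopped switched to a bool bit-field (layout proposed by the patch). */
struct flags_mixed {
	bool         tick_stopped  : 1;
	unsigned int inidle        : 1;
	unsigned int idle_active   : 1;
	unsigned int do_timer_last : 1;
	unsigned int got_idle_tick : 1;
};

/* Reads the flag the way a bool-returning accessor would. */
static bool read_tick_stopped(const struct flags_mixed *f)
{
	assert(f);
	return f->tick_stopped;
}

/* Nonzero if this compiler spends more storage on the mixed layout. */
static int mixed_layout_is_larger(void)
{
	return sizeof(struct flags_mixed) > sizeof(struct flags_uint);
}
```

Calling mixed_layout_is_larger() on each toolchain of interest answers
the storage question for that toolchain only, which is the point made in
the objection above.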

Re: [PATCH v5 4/4] zram: introduce zram memory tracking

2018-04-19 Thread Minchan Kim

On Fri, Apr 20, 2018 at 11:09:21AM +0900, Minchan Kim wrote:
> On Wed, Apr 18, 2018 at 02:07:15PM -0700, Andrew Morton wrote:
> > On Wed, 18 Apr 2018 10:26:36 +0900 Minchan Kim  wrote:
> > 
> > > Hi Andrew,
> > > 
> > > On Tue, Apr 17, 2018 at 02:59:21PM -0700, Andrew Morton wrote:
> > > > On Mon, 16 Apr 2018 18:09:46 +0900 Minchan Kim  
> > > > wrote:
> > > > 
> > > > > zRam as swap is useful for small memory device. However, swap means
> > > > > those pages on zram are mostly cold pages due to VM's LRU algorithm.
> > > > > Especially, once init data for application are touched for launching,
> > > > > they tend to be not accessed any more and finally swapped out.
> > > > > zRAM can store such cold pages as compressed form but it's pointless
> > > > > to keep in memory. Better idea is app developers free them directly
> > > > > rather than remaining them on heap.
> > > > > 
> > > > > > This patch tells us last access time of each block of zram via
> > > > > "cat /sys/kernel/debug/zram/zram0/block_state".
> > > > > 
> > > > > The output is as follows,
> > > > >   30075.033841 .wh
> > > > >   30163.806904 s..
> > > > >   30263.806919 ..h
> > > > > 
> > > > > First column is zram's block index and 3rh one represents symbol
> > > > > (s: same page w: written page to backing store h: huge page) of the
> > > > > block state. Second column represents usec time unit of the block
> > > > > was last accessed. So above example means the 300th block is accessed
> > > > > at 75.033851 second and it was huge so it was written to the backing
> > > > > store.
> > > > > 
> > > > > Admin can leverage this information to catch cold|incompressible pages
> > > > > of process with *pagemap* once part of heaps are swapped out.
> > > > 
> > > > A few things..
> > > > 
> > > > - Terms like "Admin can" and "Admin could" are worrisome.  How do we
> > > >   know that admins *will* use this?  How do we know that we aren't
> > > >   adding a bunch of stuff which nobody will find to be (sufficiently)
> > > >   useful?  For example, is there some userspace tool to which you are
> > > >   contributing which will be updated to use this feature?
> > > 
> > > Actually, I used this feature two years ago to find memory hogger
> > > although the feature was very fast prototyping. It was very useful
> > > to reduce memory cost in embedded space.
> > > 
> > > The reason I am trying to upstream the feature is I need the feature
> > > again. :)
> > > 
> > > Yub, I have a userspace tool to use the feature although it was
> > > not compatible with this new version. It should be updated with
> > > new format. I will find a time to submit the tool.
> > 
> > hm, OK, can we get this info into the changelog?  
> 
> No problem. I will add as follows,
> 
> "I used the feature a few years ago to find memory hoggers in userspace
> to notice them what memory they have wasted without touch for a long time.
> With it, they could reduce unnecessary memory space. However, at that time,
> I hacked up zram for the feature but now I need the feature again so
> I decided it would be better to upstream rather than keeping it alone.
> I hope I submit the userspace tool to use the feature soon"
> 
> > 
> > > > 
> > > > - block_state's second column is in microseconds since some
> > > >   undocumented time.  But how is userspace to know how much time has
> > > >   elapsed since the access?  ie, "current time".
> > > 
> > > It's a sched_clock so it should be elapsed time since the system boot.
> > > > I should have written it explicitly.
> > > I will fix it.
> > > 
> > > > 
> > > > - Is the sched_clock() return value suitable for exporting to
> > > >   userspace?  Is it monotonic?  Is it consistent across CPUs, across
> > > >   CPU hotadd/remove, across suspend/resume, etc?  Does it run all the
> > > >   way up to 2^64 on all CPU types, or will some processors wrap it at
> > > >   (say) 32 bits?  etcetera.  Documentation/timers/timekeeping.txt
> > > >   points out that suspend/resume can mess it up and that the counter
> > > >   can drift between cpus.
> > > 
> > > Good point!
> > > 
> > > > I just referenced it from ftrace because I thought the goal is similar
> > > "no need to be exact unless the drift is frequent but wanted to be fast"
> > > 
> > > AFAIK, ftrace/printk is active user of the function so if the problem
> > > happens frequently, it might be serious. :)
> > 
> > It could be that ktime_get() is a better fit here - especially if
> > sched_clock() goes nuts after resume.  Unfortunately ktime_get()
> > appears to be totally undocumented :(
> > 
> 
> I will use ktime_get_boottime(). With it, zram is not damaged by
> suspend/resume and code would be more simple/clear. For user, it
> would be more straightforward to parse the time.
> 
> Thanks for good suggestion, Andrew!
> 
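
Since the thread asks how userspace would consume block_state, here is a
rough sketch of parsing one line of it. It is not the userspace tool
mentioned above. It assumes the three whitespace-separated columns
described in the changelog (block index, seconds.microseconds, three
flag characters in the order s, w, h) and a six-digit fractional part;
the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct block_state {
	unsigned long index;	/* zram block index */
	unsigned long sec;	/* last access time, whole seconds */
	unsigned long usec;	/* last access time, microseconds part */
	int same;		/* 's': same page */
	int written;		/* 'w': written to backing store */
	int huge;		/* 'h': huge page */
};

/* Parse one line such as "300 75.033841 .wh"; returns 0 on success, -1 otherwise. */
static int parse_block_state(const char *line, struct block_state *out)
{
	char flags[4];

	assert(line && out);

	if (sscanf(line, "%lu %lu.%6lu %3s",
		   &out->index, &out->sec, &out->usec, flags) != 4)
		return -1;
	if (strlen(flags) != 3)
		return -1;

	out->same = (flags[0] == 's');
	out->written = (flags[1] == 'w');
	out->huge = (flags[2] == 'h');
	return 0;
}
```

A tool built on this would still need a reference clock to turn the
timestamp into "time since last access", which is the concern raised
about sched_clock() versus a boot-time based clock.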

Hey Andrew,

This is updated patch for 4/4.
If you want to replace full patchset, please tell me. I will send full
patchset.

>From 2ac685c32ffd3fba42d5eea6347f924c6e89bec

Re: [PATCH v3 1/5] clk: Extract OF clock helpers in

2018-04-19 Thread Geert Uytterhoeven

Hi Stephen, Rob,

On Fri, Apr 20, 2018 at 12:25 AM, Stephen Boyd  wrote:
> Quoting Geert Uytterhoeven (2018-04-18 07:50:01)
>> The use of of_clk_get_parent_{count,name}() and of_clk_init() is not
>> limited to clock providers.
>>
>> Hence move these helpers into their own header file, so callers that are
>> not clock providers no longer have to include .
>>
>> Suggested-by: Stephen Boyd 
>> Signed-off-by: Geert Uytterhoeven 
>> Reviewed-by: Heiko Stuebner 
>> ---
>> v3:
>>   - Add Reviewed-by,
>>   - Add SPDX-License-Identifier,
>>   - Add to clock section in MAINTAINERS (note that Rob is still listed
>> as a maintainer, too, due to the include/linux/of*.h catch-all
>> rule),
>
> Can you X: out this file so Rob is happy? Or that doesn't work?

I guess that should work.

My point here is that due to the catch-all rule, he's listed as maintainer
for of_{dma,gpio,irq,iommu,mdio,net,pci} too, which are all helpers for
other subsystems. Perhaps these should be X'd-out too?
Or is it OK without X-ing them out, as the clock maintainers are now shown, too?

Rob: What's your preference?

Thanks!

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH v9 2/2] drm: bridge: Add thc63lvd1024 LVDS decoder driver

2018-04-19 Thread Vladimir Zapolskiy

Hi Jacopo,

On 04/18/2018 05:40 PM, Jacopo Mondi wrote:
> Add DRM bridge driver for Thine THC63LVD1024 LVDS to digital parallel
> output converter.
> 
> Signed-off-by: Jacopo Mondi 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Niklas Söderlund 
> Reviewed-by: Laurent Pinchart 
> ---
>  drivers/gpu/drm/bridge/Kconfig|   6 +
>  drivers/gpu/drm/bridge/Makefile   |   1 +
>  drivers/gpu/drm/bridge/thc63lvd1024.c | 206 ++
>  3 files changed, 213 insertions(+)
>  create mode 100644 drivers/gpu/drm/bridge/thc63lvd1024.c

Reviewed-by: Vladimir Zapolskiy 

--
With best wishes,
Vladimir

Re: [PATCH v9 1/2] dt-bindings: display: bridge: Document THC63LVD1024 LVDS decoder

2018-04-19 Thread Vladimir Zapolskiy

Hi Jacopo,

On 04/18/2018 05:40 PM, Jacopo Mondi wrote:
> Document Thine THC63LVD1024 LVDS decoder device tree bindings.
> 
> Signed-off-by: Jacopo Mondi 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Niklas Söderlund 
> Reviewed-by: Laurent Pinchart 
> Reviewed-by: Rob Herring 
> ---
>  .../bindings/display/bridge/thine,thc63lvd1024.txt | 60 ++
>  1 file changed, 60 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/display/bridge/thine,thc63lvd1024.txt
> 

Reviewed-by: Vladimir Zapolskiy 

--
With best wishes,
Vladimir

Re: 4.15.17 regression: bisected: timeout during microcode update

2018-04-19 Thread Vitezslav Samel

On Thu, Apr 19, 2018 at 06:37:34PM +0200, Borislav Petkov wrote:
> On Thu, Apr 19, 2018 at 03:46:27PM +0200, Vitezslav Samel wrote:
> > 
> > microcode: __reload_late: CPU0
> > microcode: __reload_late: CPU3
> > microcode: __reload_late: CPU2
> > microcode: __reload_late: CPU1
> > microcode: __reload_late: CPU0 reloading
> > microcode: __reload_late: CPU2 reloading
> > microcode: __reload_late: CPU1 reloading
> > microcode: __reload_late: CPU3 reloading
> > microcode: find_patch: CPU2, NADA
> 
> Ok, I think I have it. Please run the patch below, it still has the
> debugging output so please paste it here once you've done the exact same
> exercise.
> 
> It should not explode this time! (Famous last words :-))

  ;-)  This time it works.

(Ashok: all tests were against stable 4.16.3)

--
microcode: __reload_late: CPU1
microcode: __reload_late: CPU0
microcode: __reload_late: CPU2
microcode: __reload_late: CPU3
microcode: __reload_late: CPU1 reloading
microcode: __reload_late: CPU0 reloading
microcode: __reload_late: CPU3 reloading
microcode: __reload_late: CPU2 reloading
microcode: find_patch: CPU0, phdr: 0x24, uci: 0x1c
microcode: find_patch: CPU0, find_matching_signature: sig: 0x306c3, pf: 0x2
microcode: find_patch: CPU0, found phdr: 0x24
microcode: updated to revision 0x24, date = 2018-01-21
microcode: __reload_late: CPU0 waiting to exit
microcode: find_patch: CPU3, phdr: 0x24, uci: 0x1c
microcode: find_patch: CPU3, find_matching_signature: sig: 0x306c3, pf: 0x2
microcode: find_patch: CPU3, found phdr: 0x24
microcode: __reload_late: CPU3 waiting to exit
microcode: find_patch: CPU2, phdr: 0x24, uci: 0x1c
microcode: find_patch: CPU2, find_matching_signature: sig: 0x306c3, pf: 0x2
microcode: find_patch: CPU2, found phdr: 0x24
microcode: __reload_late: CPU2 waiting to exit
microcode: find_patch: CPU1, phdr: 0x24, uci: 0x1c
microcode: find_patch: CPU1, find_matching_signature: sig: 0x306c3, pf: 0x2
microcode: find_patch: CPU1, found phdr: 0x24
microcode: __reload_late: CPU1 waiting to exit
x86/CPU: CPU features have changed after loading microcode, but might not take 
effect.
x86/CPU: Please consider either early loading through initrd/built-in or a 
potential BIOS update.
--

Thank you very much,

Vita


> Thx!
> 
> ---
>  arch/x86/kernel/cpu/microcode/core.c  | 11 +++
>  arch/x86/kernel/cpu/microcode/intel.c | 17 ++---
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/core.c 
> b/arch/x86/kernel/cpu/microcode/core.c
> index 10c4fc2c91f8..e84877b0f7d7 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -553,6 +553,8 @@ static int __reload_late(void *info)
>   enum ucode_state err;
>   int ret = 0;
>  
> + pr_info("%s: CPU%d\n", __func__, cpu);
> +
>   /*
>* Wait for all CPUs to arrive. A load will not be attempted unless all
>* CPUs show up.
> @@ -560,20 +562,21 @@ static int __reload_late(void *info)
>   if (__wait_for_cpus(&late_cpus_in, NSEC_PER_SEC))
>   return -1;
>  
> + pr_info("%s: CPU%d reloading\n", __func__, cpu);
> +
>   spin_lock(&update_lock);
>   apply_microcode_local(&err);
>   spin_unlock(&update_lock);
>  
> + /* siblings return UCODE_OK because their engine got updated already */
>   if (err > UCODE_NFOUND) {
>   pr_warn("Error reloading microcode on CPU %d\n", cpu);
> - return -1;
> - /* siblings return UCODE_OK because their engine got updated already */
>   } else if (err == UCODE_UPDATED || err == UCODE_OK) {
>   ret = 1;
> - } else {
> - return ret;
>   }
>  
> + pr_info("%s: CPU%d waiting to exit\n", __func__, cpu);
> +
>   /*
>* Increase the wait timeout to a safe value here since we're
>* serializing the microcode update and that could take a while on a
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c 
> b/arch/x86/kernel/cpu/microcode/intel.c
> index 32b8e5724f96..725e0bb6df03 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -485,7 +485,6 @@ static void show_saved_mc(void)
>   */
>  static void save_mc_for_early(u8 *mc, unsigned int size)
>  {
> -#ifdef CONFIG_HOTPLUG_CPU
>   /* Synchronization during CPU hotplug. */
>   static DEFINE_MUTEX(x86_cpu_microcode_mutex);
>  
> @@ -495,7 +494,6 @@ static void save_mc_for_early(u8 *mc, unsigned int size)
>   show_saved_mc();
>  
>   mutex_unlock(&x86_cpu_microcode_mutex);
> -#endif
>  }
>  
>  static bool load_builtin_intel_microcode(struct cpio_data *cp)
> @@ -727,21 +725,32 @@ static struct microcode_intel *find_patch(struct 
> ucode_cpu_info *uci)
>  {
>   struct microcode_header_intel *phdr;
>

Re: [PATCH] serial: imx: enable IMX21_UCR3_RXDMUXSEL for non-dte-mode

2018-04-19 Thread Uwe Kleine-König

Hello Chris,

On Fri, Apr 20, 2018 at 09:07:59AM +0800, Chris Ruehl wrote:
> Fix a problem introduced with
> commit e61c38d85b73 ("serial: imx: setup DCEDTE early and ensure DCD and RI 
> irqs to be off")
> which results in a non-dte-mode imx-uart failing to receive data.
> Adding back IMX21_UCR3_RXDMUXSEL makes the serial port work as expected.
> 
> Signed-off-by: Chris Ruehl 
> ---
>  drivers/tty/serial/imx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
> index 91f3a1a..3d09933 100644
> --- a/drivers/tty/serial/imx.c
> +++ b/drivers/tty/serial/imx.c
> @@ -1391,7 +1391,7 @@ static int imx_uart_startup(struct uart_port *port)
>  
>   ucr3 = imx_uart_readl(sport, UCR3);
>  
> - ucr3 |= UCR3_DTRDEN | UCR3_RI | UCR3_DCD;
> + ucr3 |= IMX21_UCR3_RXDMUXSEL | UCR3_DTRDEN | UCR3_RI | UCR3_DCD;
>  
>   if (sport->dte_mode)
>   /* disable broken interrupts */

Doesn't 6df765dca378bddf994cfd2044acafa501bd800f fix this for you?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

linux-next: Tree for Apr 20

2018-04-19 Thread Stephen Rothwell

Hi all,

Changes since 20180419:

I have added a patch to the arm-current tree to fix build problems
discovered overnight.

Non-merge commits (relative to Linus' tree): 1278
 1324 files changed, 47025 insertions(+), 20625 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (87ef12027b9b Merge tag 'ceph-for-4.17-rc2' of 
git://github.com/ceph/ceph-client)
Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kbuild-current/fixes (28913ee8191a netfilter: nf_nat_snmp_basic: add 
correct dependency to Makefile)
Merging arc-current/for-curr (661e50bc8532 Linux 4.16-rc4)
Merging arm-current/fixes (fe680ca02c1e ARM: replace unnecessary perl with sed 
and the shell $(( )) operator)
Applying: arm: check for A as well as B type sybols when calculating BSS size
Merging arm64-fixes/for-next/fixes (b2d71b3cda19 arm64: signal: don't force 
known signals to SIGKILL)
Merging m68k-current/for-linus (ecd685580c8f m68k/mac: Remove bogus "FIXME" 
comment)
Merging powerpc-fixes/fixes (56376c5864f8 powerpc/kvm: Fix lockups when running 
KVM guests on Power8)
Merging sparc/master (17dec0a94915 Merge branch 'userns-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (1255fcb2a655 net/smc: fix shutdown in state SMC_LISTEN)
Merging bpf/master (0a0a7e00a250 tools/bpf: fix test_sock and test_sock_addr.sh 
failure)
Merging ipsec/master (b48c05ab5d32 xfrm: Fix warning in xfrm6_tunnel_net_exit.)
Merging netfilter/master (d71efb599ad4 netfilter: nf_tables: fix out-of-bounds 
in nft_chain_commit_update)
Merging ipvs/master (765cca91b895 netfilter: conntrack: include kmemleak.h for 
kmemleak_not_leak())
Merging wireless-drivers/master (77e30e10ee28 iwlwifi: mvm: query regdb for wmm 
rule if needed)
Merging mac80211/master (83826469e36b cfg80211: fix possible memory leak in 
regdb_query_country())
Merging rdma-fixes/for-rc (60cc43fc8884 Linux 4.17-rc1)
Merging sound-current/for-linus (8a56ef4f3ffb ALSA: rawmidi: Fix missing input 
substream checks in compat ioctls)
Merging pci-current/for-linus (60cc43fc8884 Linux 4.17-rc1)
Merging driver-core.current/driver-core-linus (ed4564babeee drivers: change 
struct device_driver::coredump() return type to void)
Merging tty.current/tty-linus (60cc43fc8884 Linux 4.17-rc1)
Merging usb.current/usb-linus (60cc43fc8884 Linux 4.17-rc1)
Merging usb-gadget-fixes/fixes (c6ba5084ce0d usb: gadget: udc: renesas_usb3: 
add binging for r8a77965)
Merging usb-serial-fixes/usb-linus (470b5d6f0cf4 USB: serial: ftdi_sio: use 
jtag quirk for Arrow USB Blaster)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: 
fix ulpi-node lookup)
Merging phy/fixes (60cc43fc8884 Linux 4.17-rc1)
Merging staging.current/staging-linus (edf5c17d866e staging: irda: remove 
remaining remants of irda code removal)
Merging char-misc.current/char-misc-linus (60cc43fc8884 Linux 4.17-rc1)
Merging input-current/for-linus (664b0bae0b87 Merge branch 'next'

Re: [PATCH v2] mmc: sdhci-cadence: fix logically and structurally dead code

2018-04-19 Thread Adrian Hunter

On 19/04/18 18:59, Gustavo A. R. Silva wrote:
> Currently, the code block inside the for loop will never execute
> more than once, because the function returns immediately after
> the first iteration. Hence the code at the second iteration is
> structurally dead, and the code at line 281 (return 0;) is
> never reached.
> 
> Fix this by checking _ret_ before return.
> 
> Addresses-Coverity-ID: 1468009 ("Logically dead code")
> Addresses-Coverity-ID: 1468002 ("Structurally dead code")
> Suggested-by: Masahiro Yamada 
> Signed-off-by: Gustavo A. R. Silva 

Acked-by: Adrian Hunter 

> ---
> Changes in v2:
>  - Update changelog.
>  - Drop the 'Fixes' tag.
>  - Add check on ret instead of removing the "return ret;" line.
>  - Thanks to Masahiro Yamada for the feedback provided.
> 
>  drivers/mmc/host/sdhci-cadence.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci-cadence.c 
> b/drivers/mmc/host/sdhci-cadence.c
> index bc30d16..7a343b8 100644
> --- a/drivers/mmc/host/sdhci-cadence.c
> +++ b/drivers/mmc/host/sdhci-cadence.c
> @@ -274,8 +274,8 @@ static int sdhci_cdns_set_tune_val(struct sdhci_host 
> *host, unsigned int val)
>   ret = readl_poll_timeout(reg, tmp,
>!(tmp & SDHCI_CDNS_HRS06_TUNE_UP),
>0, 1);
> -
> - return ret;
> + if (ret)
> + return ret;
>   }
>  
>   return 0;
>
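The control-flow problem described in the changelog can be reproduced in
user space. The sketch below is only an illustration, not the driver code:
poll_step() is a hypothetical stand-in for readl_poll_timeout(), and the
step counts and the -110 error value (mirroring -ETIMEDOUT) are assumptions
made for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for readl_poll_timeout(): returns 0 on success,
 * or -110 (mirroring -ETIMEDOUT) when step `fail_at` is reached.
 * Every call is counted in *steps_run. */
static int poll_step(size_t step, size_t fail_at, size_t *steps_run)
{
	(*steps_run)++;
	return step == fail_at ? -110 : 0;
}

/* Before the fix: the unconditional return ends the loop after one pass,
 * so the remaining iterations and the final "return 0" are dead code. */
int tune_buggy(size_t nsteps, size_t fail_at, size_t *steps_run)
{
	size_t i;
	int ret;

	*steps_run = 0;
	for (i = 0; i < nsteps; i++) {
		ret = poll_step(i, fail_at, steps_run);
		return ret;
	}
	return 0;
}

/* After the fix: leave the loop early only when a step reports an error. */
int tune_fixed(size_t nsteps, size_t fail_at, size_t *steps_run)
{
	size_t i;
	int ret;

	*steps_run = 0;
	for (i = 0; i < nsteps; i++) {
		ret = poll_step(i, fail_at, steps_run);
		if (ret)
			return ret;
	}
	return 0;
}
```

With three steps and a failure injected at step 1, tune_buggy() runs a
single step and reports success, while tune_fixed() runs two steps and
returns the error.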

RE: [PATCH][next] ASoC: rt5668: fix incorrect 'and' operator

2018-04-19 Thread Bard Liao

> -Original Message-
> From: Colin King [mailto:colin.k...@canonical.com]
> Sent: Thursday, April 19, 2018 10:35 PM
> To: Bard Liao; Oder Chiou; Liam Girdwood; Mark Brown; Jaroslav Kysela;
> Takashi Iwai; alsa-de...@alsa-project.org
> Cc: kernel-janit...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH][next] ASoC: rt5668: fix incorrect 'and' operator
> 
> From: Colin Ian King 
> 
> Currently logical and is being used instead of bitwise and. Fix this.
> 
> Detected by CoverityScan, CID#1468008 ("Logical vs bitwise operator")
> 
> Fixes: d59fb2856223 ("ASoC: rt5668: add rt5668B codec driver")
> Signed-off-by: Colin Ian King 

Acked-By: Bard Liao 

> ---
>  sound/soc/codecs/rt5668.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sound/soc/codecs/rt5668.c b/sound/soc/codecs/rt5668.c
> index 52a343f96eb2..3c19d03f2446 100644
> --- a/sound/soc/codecs/rt5668.c
> +++ b/sound/soc/codecs/rt5668.c
> @@ -1194,7 +1194,7 @@ static int set_filter_clk(struct snd_soc_dapm_widget
> *w,
>   int ref, val, reg, idx = -EINVAL;
>   static const int div[] = {1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48};
> 
> - val = snd_soc_component_read32(component, RT5668_GPIO_CTRL_1)
> &&
> + val = snd_soc_component_read32(component, RT5668_GPIO_CTRL_1) &
>   RT5668_GP4_PIN_MASK;
>   if (w->shift == RT5668_PWR_ADC_S1F_BIT &&
>   val == RT5668_GP4_PIN_ADCDAT2)
> --
> 2.17.0
> 
> 
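The difference between the two operators in the patch above can be shown
with a small stand-alone sketch. The mask and field values here are
hypothetical and are not the real RT5668 register layout; only the operator
behaviour is the point.

```c
#include <stdint.h>

/* Hypothetical 2-bit field at bits 3:2; not the real RT5668 layout. */
#define PIN_MASK   (0x3u << 2)
#define PIN_ADCDAT (0x2u << 2)

/* Logical AND collapses the result to 0 or 1, so the field value is lost. */
unsigned int field_logical(uint32_t reg)
{
	return reg && PIN_MASK;
}

/* Bitwise AND keeps exactly the bits selected by the mask. */
unsigned int field_bitwise(uint32_t reg)
{
	return reg & PIN_MASK;
}
```

With the logical operator, a later comparison such as
"val == PIN_ADCDAT" can never be true, because val is only ever 0 or 1.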

Re: [PATCH v3] gpio: dwapb: Add support for 1 interrupt per port A GPIO

2018-04-19 Thread Hoan Tran

Hi Phil,

On Thu, Apr 19, 2018 at 3:03 AM, Phil Edworthy
 wrote:
> Hi Hoan
>
> On 18 April 2018 08:03 Hoan Tran wrote:
>> On Fri, Apr 13, 2018 at 9:47 AM, Phil Edworthy wrote:
>> > On 13 April 2018 17:37 Hoan Tran wrote:
>> >> On Fri, Apr 13, 2018 at 1:51 AM, Phil Edworthy wrote:
>> >> > The DesignWare GPIO IP can be configured for either 1 interrupt or
>> >> > 1 per GPIO in port A, but the driver currently only supports 1 
>> >> > interrupt.
>> >> > See the DesignWare DW_apb_gpio Databook description of the
>> >> > 'GPIO_INTR_IO' parameter.
>> >> >
>> >> > This change allows the driver to work with up to 32 interrupts, it
>> >> > will get as many interrupts as specified in the DT 'interrupts' 
>> >> > property.
>> >> > It doesn't do anything clever with the different interrupts, it
>> >> > just calls the same handler used for single interrupt hardware.
>> >> >
>> >> > Signed-off-by: Phil Edworthy 
>> >> > ---
>> >> > One point to mention is that I have made it possible for users to
>> >> > have unconnected interrupts by specifying holes in the list of 
>> >> > interrupts.
>> >> > This is done by supporting the interrupts-extended DT prop.
>> >> > However, I have no use for this and had to hack some test case for this.
>> >> > Perhaps the driver should support 1 interrupt or all GPIOa as 
>> >> > interrupts?
>> >> >
>> >> > v3:
>> >> >  - Rolled mfd: intel_quark_i2c_gpio fix into this patch to avoid
>> >> > bisect problems
>> >> > v2:
>> >> >  - Replaced interrupt-mask DT prop with support for the interrupts-
>> >> extended
>> >> >prop. This means replacing the call to irq_of_parse_and_map() with
>> calls
>> >> >to of_irq_parse_one() and irq_create_of_mapping().
>> >> >
>> >> > Note: There are a few *code* lines over 80 chars, but this is just
>> guidance,
>> >> >right? Especially as there are already some lines over 80 chars.
>> >> > ---
>> > [snip]
>> >
>> >> > -   if (has_acpi_companion(dev) && pp->idx == 0)
>> >> > -   pp->irq = 
>> >> > platform_get_irq(to_platform_device(dev), 0);
>> >> > +   if (has_acpi_companion(dev) && pp->idx == 0) {
>> >> > +   pp->irq[0] = 
>> >> > platform_get_irq(to_platform_device(dev),
>> 0);
>> >> > +   if (pp->irq[0])
>> >> > +   pp->has_irq = true;
>> >> > +   }
>> >>
>> >> It doesn't work for ACPI. Could you do the same logic for ACPI?
>> > I don’t have access to any device that was baked (i.e. fabbed) with
>> > multiple output interrupts from the Synopsys GPIO blocks and uses ACPI.
>> > I don't know if any such device exists.
>>
>> Below code is tested on X-Gene system which supports 1 interrupt per GPIO
>> on Port A. You can update it into your patch.
>>
>> -   if (has_acpi_companion(dev) && pp->idx == 0)
>> -   pp->irq = platform_get_irq(to_platform_device(dev), 
>> 0);
>> +   if (has_acpi_companion(dev) && pp->idx == 0) {
>> +   unsigned int j;
>> +   for (j = 0; j < pp->ngpio; j++) {
>> +   pp->irq[j] =
>> platform_get_irq(to_platform_device(dev), j);
>> +   if (pp->irq[j])
>> +   pp->has_irq = true;
>> +   }
>> +   }
> Since I've already got some reviewed-by and acks for v4, I'll leave it to 
> Linus
> to decide if he wants me to roll your changes into this patch or for you to
> submit a separate patch.
>

I would prefer that this patch work for both DTB and ACPI. Anyway, let Linus decide.

Thanks
Hoan

> Thanks
> Phil
>
>
>> >> > pp->irq_shared  = false;
>> >> > pp->gpio_base   = -1;
>> >> > diff --git a/drivers/mfd/intel_quark_i2c_gpio.c
>> >> > b/drivers/mfd/intel_quark_i2c_gpio.c
>> >> > index 90e35de..5bddb84 100644
>> >> > --- a/drivers/mfd/intel_quark_i2c_gpio.c
>> >> > +++ b/drivers/mfd/intel_quark_i2c_gpio.c
>> >> > @@ -233,7 +233,8 @@ static int intel_quark_gpio_setup(struct
>> >> > pci_dev
>> >> *pdev, struct mfd_cell *cell)
>> >> > pdata->properties->idx  = 0;
>> >> > pdata->properties->ngpio= INTEL_QUARK_MFD_NGPIO;
>> >> > pdata->properties->gpio_base=
>> INTEL_QUARK_MFD_GPIO_BASE;
>> >> > -   pdata->properties->irq  = pdev->irq;
>> >> > +   pdata->properties->irq[0]   = pdev->irq;
>> >> > +   pdata->properties->has_irq  = true;
>> >> > pdata->properties->irq_shared   = true;
>> >> >
>> >> > cell->platform_data = pdata; diff --git
>> >> > a/include/linux/platform_data/gpio-dwapb.h
>> >> > b/include/linux/platform_data/gpio-dwapb.h
>> >> > index 2dc7f4a..5a52d69 100644
>> >> > --- a/include/linux/platform_data/gpio-dwapb.h
>> >> > +++ b/include/linux/platform_data/gpio-dwapb.h
>> >> > @@ -19,7 +19,8 @@ struct dwapb_port_property {
>> >> > unsigned intidx;
>> >> > unsigned intngpio;
>> >>

Re: [PATCH 06/61] crypto: simplify getting .drvdata

2018-04-19 Thread Krzysztof Kozlowski

On Thu, Apr 19, 2018 at 4:05 PM, Wolfram Sang
 wrote:
> We should get drvdata from struct device directly. Going via
> platform_device is an unneeded step back and forth.
>
> Signed-off-by: Wolfram Sang 
> ---
>
> Build tested only. buildbot is happy. Please apply individually.
>
>  drivers/crypto/exynos-rng.c   | 6 ++
>  drivers/crypto/picoxcell_crypto.c | 6 ++
>  2 files changed, 4 insertions(+), 8 deletions(-)

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

Re: [PATCH] serial: imx: fix cached UCR2 read on software reset

2018-04-19 Thread Uwe Kleine-König

Hello Stefan,

On Thu, Apr 19, 2018 at 11:37:23PM +0200, Stefan Agner wrote:
> On 16.04.2018 17:35, Stefan Agner wrote:
> > To reset the UART the SRST needs to be cleared (low active). According
> > to the documentation the bit will remain active for 4 module clocks
> > until it is cleared (set to 1).
> > 
> > Hence the real register needs to be read in case the cached register
> > indicates that the SRST bit is zero.
> > 
> > This bug led to a wrong baud rate because the baud rate register got
> > restored before the reset completed in imx_flush_buffer.
> 
> Given that you reviewed my other patch rather quickly, you might have
> overlooked this one?

No, I didn't; still, the ping was justified. I didn't look into it at once
because I didn't feel like opening the refman.
 
> Since it is a regression, this should go into v4.17 still...

That's right,

Reviewed-by: Uwe Kleine-König 

I wonder what is different on your side that made it break. I didn't see
any breakage and tested that on a handful of different machines.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

Re: [v2 1/1] i2c: dev: prevent ZERO_SIZE_PTR deref in i2cdev_ioctl_rdwr()

2018-04-19 Thread Uwe Kleine-König

Hello,

On Thu, Apr 19, 2018 at 08:01:46PM +0300, Alexander Popov wrote:
> On 19.04.2018 16:49, Uwe Kleine-König wrote:
> >> @@ -280,6 +280,7 @@ static noinline int i2cdev_ioctl_rdwr(struct 
> >> i2c_client *client,
> >> */
> >>if (msgs[i].flags & I2C_M_RECV_LEN) {
> >>if (!(msgs[i].flags & I2C_M_RD) ||
> >> +  !msgs[i].len ||
> > 
> > I'd prefer
> > 
> > msgs[i].len > 0
> 
> Excuse me, it will be wrong. We stop if len is 0 to avoid the following
> ZERO_SIZE_PTR dereference.

right you are. I missed the negation.
 
> > here instead of
> > 
> > !msgs[i].len
> 
> I can change it to "msgs[i].len == 0". But is it really important?
> 
> I've carefully tested the current version with the original repro. It works 
> correctly.

I don't doubt it, and the code generated is maybe even the same. The
point I wanted to make is that

!len

is harder to read for a human than

len < 1

(or another suitable arithmetic expression). But feel free to disagree
and keep the code as is.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
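The three spellings discussed in the message above are interchangeable for
an unsigned length, which is why the choice is purely about readability.
The sketch below checks this exhaustively, assuming a 16-bit unsigned
length as used by the len field of struct i2c_msg; the function names are
made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Three ways to spell "length is zero" for an unsigned 16-bit value. */
bool len_is_zero_not(uint16_t len) { return !len; }
bool len_is_zero_eq(uint16_t len)  { return len == 0; }
bool len_is_zero_lt(uint16_t len)  { return len < 1; }

/* Returns true when all three spellings agree for every 16-bit length. */
bool spellings_agree(void)
{
	uint32_t v;

	for (v = 0; v <= UINT16_MAX; v++) {
		uint16_t len = (uint16_t)v;

		if (len_is_zero_not(len) != len_is_zero_eq(len) ||
		    len_is_zero_eq(len) != len_is_zero_lt(len))
			return false;
	}
	return true;
}
```

Note that "len < 1" only matches the other two because the type is
unsigned; for a signed length it would also accept negative values.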

[PATCH] Documentation: updates for new syscall stub naming convention

2018-04-19 Thread Dominik Brodowski

For v4.17-rc1, the naming of syscall stubs changed. Update stack
traces and similar instances in the documentation to avoid sources
of confusion.

Signed-off-by: Dominik Brodowski 

diff --git a/Documentation/admin-guide/bug-hunting.rst 
b/Documentation/admin-guide/bug-hunting.rst
index f278b289e260..cebff8e5c59f 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -30,7 +30,7 @@ Kernel bug reports often come with a stack dump like the one 
below::
 [] ? driver_detach+0x87/0x90
 [] ? bus_remove_driver+0x38/0x90
 [] ? usb_deregister+0x58/0xb0
-[] ? SyS_delete_module+0x130/0x1f0
+[] ? __se_sys_delete_module+0x130/0x1f0
 [] ? task_work_run+0x64/0x80
 [] ? exit_to_usermode_loop+0x85/0x90
 [] ? do_fast_syscall_32+0x80/0x130
diff --git a/Documentation/dev-tools/kasan.rst 
b/Documentation/dev-tools/kasan.rst
index f7a18f274357..0fe231401ae9 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -60,7 +60,7 @@ A typical out of bounds access report looks like this::
  init_module+0x9/0x47 [test_kasan]
  do_one_initcall+0x99/0x200
  load_module+0x2cb3/0x3b20
- SyS_finit_module+0x76/0x80
+ __se_sys_finit_module+0x76/0x80
  system_call_fastpath+0x12/0x17
 INFO: Slab 0xea0001a4ef00 objects=17 used=7 fp=0x8800693bd728 
flags=0x1004080
 INFO: Object 0x8800693bc558 @offset=1368 fp=0x8800693bc720
@@ -101,7 +101,7 @@ A typical out of bounds access report looks like this::
  [] ? __vunmap+0xec/0x160
  [] load_module+0x2cb3/0x3b20
  [] ? m_show+0x240/0x240
- [] SyS_finit_module+0x76/0x80
+ [] __se_sys_finit_module+0x76/0x80
  [] system_call_fastpath+0x12/0x17
 Memory state around the buggy address:
  8800693bc300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
index c2f6452e38ed..df3f4016137a 100644
--- a/Documentation/dev-tools/kcov.rst
+++ b/Documentation/dev-tools/kcov.rst
@@ -103,7 +103,7 @@ program using kcov:
 
 After piping through addr2line output of the program looks as follows::
 
-SyS_read
+__se_sys_read
 fs/read_write.c:562
 __fdget_pos
 fs/file.c:774
@@ -115,7 +115,7 @@ After piping through addr2line output of the program looks 
as follows::
 fs/file.c:760
 __fdget_pos
 fs/file.c:784
-SyS_read
+__se_sys_read
 fs/read_write.c:562
 
 If a program needs to collect coverage from several threads (independently),
diff --git a/Documentation/locking/lockstat.txt 
b/Documentation/locking/lockstat.txt
index 5786ad2cd5e6..346a67e72671 100644
--- a/Documentation/locking/lockstat.txt
+++ b/Documentation/locking/lockstat.txt
@@ -96,7 +96,7 @@ Look at the current lock statistics:
 12   &mm->mmap_sem 17  
[] vm_munmap+0x41/0x80
 13 ---
 14   &mm->mmap_sem  1  
[] dup_mmap+0x2a/0x3f0
-15   &mm->mmap_sem 60  
[] SyS_mprotect+0xe9/0x250
+15   &mm->mmap_sem 60  
[] __se_sys_mprotect+0xe9/0x250
 16   &mm->mmap_sem 41  
[] __do_page_fault+0x1d4/0x510
 17   &mm->mmap_sem 68  
[] vm_mmap_pgoff+0x87/0xd0
 18
diff --git a/Documentation/trace/histogram.txt 
b/Documentation/trace/histogram.txt
index 6e05510afc28..f36784deae99 100644
--- a/Documentation/trace/histogram.txt
+++ b/Documentation/trace/histogram.txt
@@ -598,7 +598,7 @@
  apparmor_cred_prepare+0x1f/0x50
  security_prepare_creds+0x16/0x20
  prepare_creds+0xdf/0x1a0
- SyS_capset+0xb5/0x200
+ __se_sys_capset+0xb5/0x200
  system_call_fastpath+0x12/0x6a
 } hitcount:  1  bytes_req: 32  bytes_alloc: 32
 .
@@ -609,7 +609,7 @@
  i915_gem_execbuffer2+0x6c/0x2c0 [i915]
  drm_ioctl+0x349/0x670 [drm]
  do_vfs_ioctl+0x2f0/0x4f0
- SyS_ioctl+0x81/0xa0
+ __se_sys_ioctl+0x81/0xa0
  system_call_fastpath+0x12/0x6a
 } hitcount:  17726  bytes_req:   13944120  bytes_alloc:   19593808
 { stacktrace:
@@ -618,7 +618,7 @@
  load_elf_binary+0x102/0x1650
  search_binary_handler+0x97/0x1d0
  do_execveat_common.isra.34+0x551/0x6e0
- SyS_execve+0x3a/0x50
+ __se_sys_execve+0x3a/0x50
  return_from_execve+0x0/0x23
 } hitcount:  33348  bytes_req:   17152128  bytes_alloc:   20226048
 { stacktrace:
@@ -629,7 +629,7 @@
  path_openat+0x31/0x5f0
  do_filp_open+0x3a/0x90
  do_sys_open+0x128/0x220
- SyS_open+0x1e/0x20
+ __se_sys_open+0x1e/0x20
  system_call_fastpath+0x12/0x6a
 } hitcount:4766422  bytes_req:95

[PATCH] perf: update to new syscall stub naming convention

2018-04-19 Thread Dominik Brodowski

For v4.17-rc1, the naming of syscall stubs changed. Update the
perf scripts/utils/tests which need to be aware of the syscall
stub naming accordingly.

Signed-off-by: Dominik Brodowski 

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
index 53d83d7e6a09..9a970e334cea 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -32,10 +32,10 @@ int arch__choose_best_symbol(struct symbol *syma,
if (*sym == '.')
sym++;
 
-   /* Avoid "SyS" kernel syscall aliases */
-   if (strlen(sym) >= 3 && !strncmp(sym, "SyS", 3))
+   /* Avoid "__se_sys" kernel syscall aliases */
+   if (strlen(sym) >= 8 && !strncmp(sym,  "__se_sys", 8))
return SYMBOL_B;
-   if (strlen(sym) >= 10 && !strncmp(sym, "compat_SyS", 10))
+   if (strlen(sym) >= 15 && !strncmp(sym, "__se_compat_sys", 15))
return SYMBOL_B;
 
return SYMBOL_A;
diff --git a/tools/perf/tests/bpf-script-example.c 
b/tools/perf/tests/bpf-script-example.c
index e4123c1b0e88..5839baa3d766 100644
--- a/tools/perf/tests/bpf-script-example.c
+++ b/tools/perf/tests/bpf-script-example.c
@@ -31,8 +31,8 @@ struct bpf_map_def SEC("maps") flip_table = {
.max_entries = 1,
 };
 
-SEC("func=SyS_epoll_pwait")
-int bpf_func__SyS_epoll_pwait(void *ctx)
+SEC("func=__se_sys_epoll_pwait")
+int bpf_funcse_sys_epoll_pwait(void *ctx)
 {
int ind =0;
int *flag = bpf_map_lookup_elem(&flip_table, &ind);
diff --git a/tools/perf/util/c++/clang-test.cpp 
b/tools/perf/util/c++/clang-test.cpp
index 7b042a5ebc68..67a39ac8626d 100644
--- a/tools/perf/util/c++/clang-test.cpp
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -41,7 +41,7 @@ int test__clang_to_IR(void)
if (!M)
return -1;
for (llvm::Function& F : *M)
-   if (F.getName() == "bpf_func__SyS_epoll_pwait")
+   if (F.getName() == "bpf_funcse_sys_epoll_pwait")
return 0;
return -1;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 62b2dd2253eb..32e156992dfc 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -113,10 +113,11 @@ int __weak arch__compare_symbol_names_n(const char 
*namea, const char *nameb,
 int __weak arch__choose_best_symbol(struct symbol *syma,
struct symbol *symb __maybe_unused)
 {
-   /* Avoid "SyS" kernel syscall aliases */
-   if (strlen(syma->name) >= 3 && !strncmp(syma->name, "SyS", 3))
+   /* Avoid "__se_sys" kernel syscall aliases */
+   if (strlen(syma->name) >= 8 && !strncmp(syma->name,  "__se_sys", 8))
return SYMBOL_B;
-   if (strlen(syma->name) >= 10 && !strncmp(syma->name, "compat_SyS", 10))
+   if (strlen(syma->name) >= 15 &&
+   !strncmp(syma->name, "__se_compat_sys", 15))
return SYMBOL_B;
 
return SYMBOL_A;
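The prefix tests in the patch above hand-count the prefix lengths (8 and
15). As a sketch only, the same check can derive the length from the
literal itself; has_prefix() and is_syscall_alias() are hypothetical
helper names and are not part of perf.

```c
#include <stdbool.h>
#include <string.h>

/* True when `name` begins with `prefix`. strncmp() stops at the first
 * differing character or at a terminating NUL, so a name shorter than
 * the prefix simply compares unequal. */
static bool has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Hypothetical helper mirroring the two prefix checks in the patch. */
bool is_syscall_alias(const char *name)
{
	return has_prefix(name, "__se_sys") ||
	       has_prefix(name, "__se_compat_sys");
}
```

Deriving the length from the literal avoids having to update a separate
constant whenever the prefix string changes, as happened here with the
move from "SyS" to "__se_sys".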

Re: [RESEND PATCH 1/1] drm/i915/glk: Add MODULE_FIRMWARE for Geminilake

2018-04-19 Thread Ian W MORRISON

On 18 April 2018 at 00:14, Joonas Lahtinen
 wrote:
> Quoting Jani Nikula (2018-04-17 12:02:52)
>> On Mon, 16 Apr 2018, "Srivatsa, Anusha"  wrote:
>> >>-Original Message-
>> >>From: Jani Nikula [mailto:jani.nik...@linux.intel.com]
>> >>Sent: Wednesday, April 11, 2018 5:27 AM
>> >>To: Ian W MORRISON 
>> >>Cc: Vivi, Rodrigo ; Srivatsa, Anusha
>> >>; Wajdeczko, Michal
>> >>; Greg KH ;
>> >>airl...@linux.ie; joonas.lahti...@linux.intel.com; 
>> >>linux-kernel@vger.kernel.org;
>> >>sta...@vger.kernel.org; intel-...@lists.freedesktop.org; dri-
>> >>de...@lists.freedesktop.org
>> >>Subject: Re: [RESEND PATCH 1/1] drm/i915/glk: Add MODULE_FIRMWARE for
>> >>Geminilake

In summary so far:

Jani:
> NAK on indiscriminate Cc: stable. There are zero guarantees that
> older kernels will work with whatever firmware you throw at them.
> Who tested the firmware with v4.12 and later? We only have the CI
> results against *current* drm-tip. We don't even know about v4.16.
> I'm not going to ack and take responsibility for the stable backports
> unless someone actually comes forward with credible Tested-bys.

Anusha:
> The stable kernel version is 4.12 and beyond.
> It is appropriate to add the CC: stable in my opinion

Joonas:
> And even then, some distros will be surprised of the new MODULE_FIRMWARE
> and will need to update the linux-firmware package, too.

I've performed backport testing and some additional analysis as follows:

The DMC firmware for GLK was initially included in 4.11
  (commit: dbb28b5c3d3cb945a63030fab8d3894cf335ce19).
Then the firmware version was upgraded to 1.03 in 4.12
  (commit: f4a791819ed00a749a90387aa139706a507aa690).
However MODULE_FIRMWARE for the GLK DMC firmware
was also removed in 4.12
  (commit: d9321a03efcda867b3a8c6327e01808516f0acd7)
together with the firmware version being bumped to 1.04
  (commit: aebfd1d37194e00d4c417e7be97efeb736cd9c04).

The patch below effectively reverts commit d9321a03 because the GLK
firmware is now available in the linux-firmware repository.

To test stable backports I've used Ubuntu 18.04 (Beta 2) userspace with
both Ubuntu (generic) and self-compiled mainline (patched) kernels.
The conclusion was that the patch works across 4.12 to 4.17-rc1 kernels,
additionally displaying a 'Possible missing firmware' message when
installing a kernel with the expected firmware missing.

The following are abridged backport test results:

Scenario: No DMC (glk_dmc_ver1_04.bin) firmware installed in
'/lib/firmware/i915'
  Test: Kernel installation ('grep -i dmc' output from 'apt install'):
4.12-generic and 4.15-generic:
  No output # as expected
4.12 to 4.17-rc1-patched:
  W: Possible missing firmware
/lib/firmware/i915/glk_dmc_ver1_04.bin for module i915
  Result: The effect of the patch is to add a 'Possible missing
firmware' message.
  Test: Booting ('grep -i dmc' output from 'dmesg'):
4.12-generic:
  No output # as expected
4.15-generic:
  i915 :00:02.0: Direct firmware load for
i915/glk_dmc_ver1_04.bin failed with error -2
  i915 :00:02.0: Failed to load DMC firmware
i915/glk_dmc_ver1_04.bin. Disabling runtime power management.
  i915 :00:02.0: DMC firmware homepage:
https://01.org/linuxgraphics/downloads/firmware
4.12-patched:
  No output # as expected
4.13 to 4.14-patched:
  i915 :00:02.0: Direct firmware load for
i915/glk_dmc_ver1_04.bin failed with error -2
  i915 :00:02.0: Failed to load DMC firmware
[https://01.org/linuxgraphics/downloads/firmware], disabling runtime
power management.
4.15 to 4.17-rc1-patched:
  i915 :00:02.0: Direct firmware load for
i915/glk_dmc_ver1_04.bin failed with error -2
  i915 :00:02.0: Failed to load DMC firmware
i915/glk_dmc_ver1_04.bin. Disabling runtime power management.
  i915 :00:02.0: DMC firmware homepage:
https://01.org/linuxgraphics/downloads/firmware
  Result: The effect of the patch does not change existing
(non-patched kernel) messages.

Scenario: DMC (glk_dmc_ver1_04.bin) firmware installed in '/lib/firmware/i915'
  Test: Kernel installation ('grep -i dmc' output from 'apt install'):
All kernels:
  No messages # as expected
  Result: The effect of the patch does not change existing messages.
  Test: Booting ('grep -i dmc' output from 'dmesg'):
4.12-generic:
  No output # as expected
4.15-generic:
  i915 :00:02.0: Direct firmware load for
i915/glk_dmc_ver1_04.bin failed with error -2
  i915 :00:02.0: Failed to load DMC firmware
i915/glk_dmc_ver1_04.bin. Disabling runtime power management.
  i915 :00:02.0: DMC firmware homepage:
https://01.org/linuxgraphics/downloads/firmware
4.12-patched:
  No output # as expected
4.13 to 4.17-rc1-patched:
  [drm] Finished loading DMC firmware i915/glk_dmc_ver1_04.bin (v1.4)
  Result: The effect of the patch is to remove the 'Failed to load' message.

Regards,
Ian

Re: [RFC/RFT patch 0/7] timekeeping: Unify clock MONOTONIC and clock BOOTTIME

2018-04-19 Thread Sergey Senozhatsky

On (04/20/18 06:37), David Herrmann wrote:
>
> I get lots of timer-errors on Arch-Linux booting current master, after
> a suspend/resume cycle. Just a selection of errors I see on resume:

Hello David,
Any chance you can revert the patches in question and test? I'm running
ARCH (4.17.0-rc1-dbg-00042-gaa03ddd9c434) and suspend/resume cycle does
not trigger any errors. Except for this one

kernel: do_IRQ: 0.55 No irq handler for vector

> systemd[1]: systemd-journald.service: Main process exited,
> code=dumped, status=6/ABRT
> rtkit-daemon[742]: The canary thread is apparently starving. Taking action.
> systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
> systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
> kernel: e1000e :00:1f.6: Failed to restore TIMINCA clock rate delta: -22
> 
> Lots of crashes with SIGABRT due to these.
> 
> I did not bisect it, but it sounds related to me. Also, user-space
> uses CLOCK_MONOTONIC for watchdog timers. That is, a process is
> required to respond to a watchdog-request in a given MONOTONIC
> time-frame. If this jumps during suspend/resume, watchdogs will fire
> immediately. I don't see how this can work with the new MONOTONIC
> behavior?

-ss

Re: [PATCH V1 4/4] qcom: spmi-wled: Add auto-calibration logic support

2018-04-19 Thread kgunda


On 2018-04-19 21:28, Bjorn Andersson wrote:

On Thu 19 Apr 03:45 PDT 2018, kgu...@codeaurora.org wrote:



On 2017-12-05 11:10, Bjorn Andersson wrote:
> On Thu 16 Nov 04:18 PST 2017, Kiran Gunda wrote:
>
> > The auto-calibration algorithm checks if the current WLED sink
> > configuration is valid. It tries enabling every sink and checks
> > if the OVP fault is observed. Based on this information it
> > detects and enables the valid sink configuration. Auto calibration
> > will be triggered when the OVP fault interrupts are seen frequently
> > thereby it tries to fix the sink configuration.
> >
>
> So it's not auto "calibration" it's auto "detection" of strings?
>
Hi Bjorn,
Sorry for late response. Please find my answers.



No worries, happy to hear back from you!


Thanks!

Correct. This is the auto detection, This is the name given by the
HW/systems team.


I think the name should be considered a "hardware bug", that we can work
around in software (give it a useful name and document what the original
name was).

I don't think this is the "hardware bug". Rather we can say HW doesn't
support it. Hence, we are implementing it as a SW feature to detect the
strings present on the display panel, if the user fails to give the
correct strings. As you suggested I will rename this to "auto detection"
instead of "auto calibration".


> When is this feature needed?
>
This feature is needed if the string configuration is given wrong in
the DT node by the user.


DT describes the hardware and for all other nodes it must do so
accurately.

But the user may not be aware of the strings present on the display panel
or may be using the same software on different devices which have
different strings present.

For cases where the hardware supports auto detection of functionality we
remove information from DT and rely on that logic to figure out the
hardware. We do not use it to reconfigure the hardware once we detect an
error. So when auto-detection is enabled it should always be used to
probe the hardware.

The auto string detection is not supported in any qcom hardware and i
don't think there is a plan to introduce in new hardware also.


Regards,
Bjorn


> > Signed-off-by: Kiran Gunda 
> > ---
> >  .../bindings/leds/backlight/qcom-spmi-wled.txt |   5 +
> >  drivers/video/backlight/qcom-spmi-wled.c   | 304
> > -
> >  2 files changed, 306 insertions(+), 3 deletions(-)
> >
> > diff --git
> > a/Documentation/devicetree/bindings/leds/backlight/qcom-spmi-wled.txt
> > b/Documentation/devicetree/bindings/leds/backlight/qcom-spmi-wled.txt
> > index d39ee93..f06c0cd 100644
> > ---
> > a/Documentation/devicetree/bindings/leds/backlight/qcom-spmi-wled.txt
> > +++
> > b/Documentation/devicetree/bindings/leds/backlight/qcom-spmi-wled.txt
> > @@ -94,6 +94,11 @@ The PMIC is connected to the host processor via
> > SPMI bus.
> >   Definition: Interrupt names associated with the interrupts.
> >   Currently supported interrupts are "sc-irq" and "ovp-irq".
> >
> > +- qcom,auto-calibration
>
> qcom,auto-string-detect?
>
ok. Will address in the next patch.
> > + Usage:  optional
> > + Value type: 
> > + Definition: Enables auto-calibration of the WLED sink configuration.
> > +
> >  Example:
> >
> >  qcom-wled@d800 {
> > diff --git a/drivers/video/backlight/qcom-spmi-wled.c
> > b/drivers/video/backlight/qcom-spmi-wled.c
> > index 8b2a77a..aee5c56 100644
> > --- a/drivers/video/backlight/qcom-spmi-wled.c
> > +++ b/drivers/video/backlight/qcom-spmi-wled.c
> > @@ -38,11 +38,14 @@
> >  #define  QCOM_WLED_CTRL_SC_FAULT_BIT BIT(2)
> >
> >  #define QCOM_WLED_CTRL_INT_RT_STS0x10
> > +#define  QCOM_WLED_CTRL_OVP_FLT_RT_STS_BIT   BIT(1)
>
> The use of BIT() makes this a mask and not a bit number, so if you just
> drop that you can afford to spell out the "FAULT" like the data sheet
> does. Perhaps even making it QCOM_WLED_CTRL_OVP_FAULT_STATUS ?
>
ok. Will change it in the next series.
> >
> >  #define QCOM_WLED_CTRL_MOD_ENABLE0x46
> >  #define  QCOM_WLED_CTRL_MOD_EN_MASK  BIT(7)
> >  #define  QCOM_WLED_CTRL_MODULE_EN_SHIFT  7
> >
> > +#define QCOM_WLED_CTRL_FDBK_OP   0x48
>
> This is called WLED_CTRL_FEEDBACK_CONTROL, why the need to make it
> unreadable?
>
Ok. Will address it in next series.
> > +
> >  #define QCOM_WLED_CTRL_SWITCH_FREQ   0x4c
> >  #define  QCOM_WLED_CTRL_SWITCH_FREQ_MASK GENMASK(3, 0)
> >
> > @@ -99,6 +102,7 @@ struct qcom_wled_config {
> >   int ovp_irq;
> >   bool en_cabc;
> >   bool ext_pfet_sc_pro_en;
> > + bool auto_calib_enabled;
> >  };
> >
> >  struct qcom_wled {
> > @@ -108,18 +112,25 @@ struct qcom_wled {
> >   struct mutex lock;
> >   struct qcom_wled_config cfg;
> >   ktime_t last_sc_event_time;
> > + ktime_t start_ovp_fault_time;
> >   u16 sink_addr;
> >   u16 ctrl_addr;
> > + u16 auto_calibration_ovp_count;
> >
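
The string detection discussed in this thread (enable each sink in turn and check whether an OVP fault is seen) can be sketched in plain C as below. This is only an illustration under stated assumptions: `NUM_SINKS`, `probe_sink()` and the simulated `connected` mask are hypothetical and stand in for the real register accesses, which are not shown in this part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SINKS 4	/* hypothetical number of WLED strings */

/*
 * Hypothetical probe: returns true if enabling only sink `i` raises an
 * OVP fault. An open (unconnected) string is simulated by a clear bit
 * in `connected`.
 */
static bool probe_sink(uint8_t connected, unsigned int i)
{
	return !(connected & (1u << i));
}

/* Try every sink on its own and keep the ones that do not fault. */
static uint8_t detect_valid_sinks(uint8_t connected)
{
	uint8_t valid = 0;
	unsigned int i;

	for (i = 0; i < NUM_SINKS; i++) {
		if (!probe_sink(connected, i))
			valid |= (uint8_t)(1u << i);
	}
	return valid;
}
```

With strings 0 and 2 populated, the sketch reports a sink mask of 0x5, which is the configuration a driver would then program instead of the one given in DT.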

Re: [PATCH] nvme: fc: provide a descriptive error

2018-04-19 Thread Hannes Reinecke

On 04/19/2018 07:43 PM, Johannes Thumshirn wrote:
> Provide a descriptive error in case an lport to rport association
> isn't found when creating the FC-NVME controller.
> 
> Currently it's very hard to debug the reason for a failed connect
> attempt without a look at the source.
> 
> Signed-off-by: Johannes Thumshirn 
> 
> ---
> This actually happened to Hannes and me because of a typo in a
> customer demo today, so yes things like this happen until we have a
> proper way to do auto-connect.
> ---
>  drivers/nvme/host/fc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 6cb26bcf6ec0..8b66879b4ebf 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3284,6 +3284,8 @@ nvme_fc_create_ctrl(struct device *dev, struct 
> nvmf_ctrl_options *opts)
>   }
>   spin_unlock_irqrestore(&nvme_fc_lock, flags);
>  
> + pr_warn("%s: %s - %s combination not found\n",
> + __func__, opts->traddr, opts->host_traddr);
>   return ERR_PTR(-ENOENT);
>  }
>  
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH] mm:memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create

2018-04-19 Thread Minchan Kim

On Thu, Apr 19, 2018 at 08:40:05AM +0200, Michal Hocko wrote:
> On Wed 18-04-18 11:58:00, David Rientjes wrote:
> > On Wed, 18 Apr 2018, Michal Hocko wrote:
> > 
> > > > Okay, no problem. However, I don't feel we need ratelimit at this 
> > > > moment.
> > > > We can do when we got real report. Let's add just one line warning.
> > > > However, I have no talent to write a poem to express with one line.
> > > > Could you help me?
> > > 
> > > What about
> > >   pr_info("Failed to create memcg slab cache. Report if you see floods of 
> > > these\n");
> > >  

Thank you, Michal. However, hmm, "floods" is very vague to me. 100 times per
sec? 10 times per hour? I guess we need a clearer guideline for when users
should report, if we really want to do this.


> > 
> > Um, there's nothing actionable here for the user.  Even if the message 
> > directed them to a specific email address, what would you ask the user for 
> > in response if they show a kernel log with 100 of these?
> 
> We would have to think of a better way to create shadow memcg caches.
> 
> > Probably ask 
> > them to use sysrq at the time it happens to get meminfo.  But any user 
> > initiated sysrq is going to reveal very different state of memory compared 
> > to when the kmalloc() actually failed.
> 
> Not really.
> 
> > If this really needs a warning, I think it only needs to be done once and 
> > reveal the state of memory similar to how slub emits oom warnings.  But as 
> > the changelog indicates, the system is oom and we couldn't reclaim.  We 
> > can expect this happens a lot on systems with memory pressure.  What is 
> > the warning revealing that would be actionable?
> 
> That it actually happens in real workloads and we want to know what
> those workloads are. This code is quite old and yet this is the first
> time somebody complains. So it is most probably rare. Maybe because most
> workloads don't create many memcgs dynamically while low on memory.
> And maybe that will change in future. In any case, having a large splat
> of meminfo for GFP_NOWAIT is not really helpful. It will tell us what we
> know already - the memory is low and the reclaim was prohibited. We just
> need to know that this happens out there.

The workload was an experiment creating a memcg per app on an embedded
device, but I am not considering kmemcg at this moment, so I can even live
with disabling kmemcg. Based on that, I cannot say whether it's a real
workload or not.

Judging from the replies in this thread, adding such a one-line warning is
arguable, so if you want it strongly, could you handle it yourself?
Sorry, but I have no interest in arguing about it.

Thanks.

Re: [PATCH v2 4/4] tpm: Move eventlog declarations to its own header

2018-04-19 Thread Jarkko Sakkinen

On Thu, Apr 12, 2018 at 12:13:50PM +0200, Thiebaud Weksteen wrote:
> Reduce the size of tpm.h by moving eventlog declarations to a separate
> header.
> 
> Signed-off-by: Thiebaud Weksteen 
> Suggested-by: Jarkko Sakkinen 

Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 

/Jarkko

Re: [PATCH v2 3/4] tpm: Move shared eventlog functions to common.c

2018-04-19 Thread Jarkko Sakkinen

On Thu, Apr 12, 2018 at 12:13:49PM +0200, Thiebaud Weksteen wrote:
> Functions and structures specific to TPM1 are renamed from tpm* to tpm1*.
> 
> Signed-off-by: Thiebaud Weksteen 
> Suggested-by: Jarkko Sakkinen 

Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 

/Jarkko

[PATCH] iommu/vt-d: fix shift-out-of-bounds in bug checking

2018-04-19 Thread changbin . du

From: Changbin Du 

It allows flushing more than 4GB of device TLBs, so the mask should be
64 bits wide. UBSAN reported this fault as below.

[3.760024] 

[3.768440] UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1348:3
[3.774864] shift exponent 64 is too large for 32-bit type 'int'
[3.780853] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U
4.17.0-rc1+ #89
[3.788661] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 
01/26/2016
[3.796034] Call Trace:
[3.798472]  
[3.800479]  dump_stack+0x90/0xfb
[3.803787]  ubsan_epilogue+0x9/0x40
[3.807353]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[3.812916]  ? qi_flush_dev_iotlb+0x124/0x180
[3.817261]  qi_flush_dev_iotlb+0x124/0x180
[3.821437]  iommu_flush_dev_iotlb+0x94/0xf0
[3.825698]  iommu_flush_iova+0x10b/0x1c0
[3.829699]  ? fq_ring_free+0x1d0/0x1d0
[3.833527]  iova_domain_flush+0x25/0x40
[3.837448]  fq_flush_timeout+0x55/0x160
[3.841368]  ? fq_ring_free+0x1d0/0x1d0
[3.845200]  ? fq_ring_free+0x1d0/0x1d0
[3.849034]  call_timer_fn+0xbe/0x310
[3.852696]  ? fq_ring_free+0x1d0/0x1d0
[3.856530]  run_timer_softirq+0x223/0x6e0
[3.860625]  ? sched_clock+0x5/0x10
[3.864108]  ? sched_clock+0x5/0x10
[3.867594]  __do_softirq+0x1b5/0x6f5
[3.871250]  irq_exit+0xd4/0x130
[3.874470]  smp_apic_timer_interrupt+0xb8/0x2f0
[3.879075]  apic_timer_interrupt+0xf/0x20
[3.883159]  
[3.885255] RIP: 0010:poll_idle+0x60/0xe7
[3.889252] RSP: 0018:b1b201943e30 EFLAGS: 0246 ORIG_RAX: 
ff13
[3.896802] RAX: 8020 RBX: 008e RCX: 001f
[3.903918] RDX:  RSI: 2819aa06 RDI: 
[3.911031] RBP: 9e93c6b33280 R08: 0010f717d567 R09: 0010d205
[3.918146] R10: b1b201943df8 R11: 0001 R12: e01b169d
[3.925260] R13:  R14: b12aa400 R15: 
[3.932382]  cpuidle_enter_state+0xb4/0x470
[3.936558]  do_idle+0x222/0x310
[3.939779]  cpu_startup_entry+0x78/0x90
[3.943693]  start_secondary+0x205/0x2e0
[3.947607]  secondary_startup_64+0xa5/0xb0
[3.951783] 


Signed-off-by: Changbin Du 
---
 drivers/iommu/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index accf5838..e4ae600 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1345,7 +1345,7 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 
sid, u16 qdep,
struct qi_desc desc;
 
if (mask) {
-   BUG_ON(addr & ((1 << (VTD_PAGE_SHIFT + mask)) - 1));
+   BUG_ON(addr & ((1ULL << (VTD_PAGE_SHIFT + mask)) - 1));
addr |= (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1;
desc.high = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
} else
-- 
2.7.4
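
The difference between the two constants in the one-line fix can be illustrated outside the kernel. The following is a small user-space sketch, not the driver code: `align_mask()` and `is_misaligned()` are hypothetical helper names, and the explicit handling of shift counts of 64 or more is an assumption of the sketch (in C such a shift is undefined even for a 64-bit type).

```c
#include <stdint.h>

#define VTD_PAGE_SHIFT 12	/* 4KiB VT-d page */

/*
 * Build the low-bit alignment mask for 2^mask pages from a 64-bit
 * constant. With a plain `1` the shift would be done in 32-bit int and
 * overflow once VTD_PAGE_SHIFT + mask reaches 31 or more.
 */
static uint64_t align_mask(unsigned int mask)
{
	unsigned int shift = VTD_PAGE_SHIFT + mask;

	if (shift >= 64)	/* sketch-only guard, not part of the patch */
		return UINT64_MAX;
	return (1ULL << shift) - 1;
}

/* Nonzero if addr is not aligned to 2^(VTD_PAGE_SHIFT + mask) bytes. */
static int is_misaligned(uint64_t addr, unsigned int mask)
{
	return (addr & align_mask(mask)) != 0;
}
```

For example, `mask = 20` describes a 4GB region: the mask computed with `1ULL` is 0xffffffff, whereas the same expression in 32-bit `int` cannot represent it.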

Re: [PATCH v2 2/4] tpm: Move eventlog files to a subdirectory

2018-04-19 Thread Jarkko Sakkinen

On Thu, Apr 12, 2018 at 12:13:48PM +0200, Thiebaud Weksteen wrote:
> Signed-off-by: Thiebaud Weksteen 
> Suggested-by: Jarkko Sakkinen 

Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 

/Jarkko

Re: [PATCH v2 1/4] tpm: Add explicit endianness cast

2018-04-19 Thread Jarkko Sakkinen

On Thu, Apr 12, 2018 at 12:13:47PM +0200, Thiebaud Weksteen wrote:
> Signed-off-by: Thiebaud Weksteen 

Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 

/Jarkko

[PATCH] arm64: avoid potential infinity loop in dump_backtrace

2018-04-19 Thread Ji Zhang

When we dump the backtrace of some tasks there is a potential infinite
loop if the content of the stack changes, whether the change happens
because the task is running or because of other unexpected cases.

This patch adds a stronger check on the frame pointer and limits the
number of stacks a backtrace may span, to avoid the infinite loop.

Signed-off-by: Ji Zhang 
---
 arch/arm64/include/asm/stacktrace.h | 25 +
 arch/arm64/kernel/stacktrace.c  |  8 
 arch/arm64/kernel/traps.c   |  1 +
 3 files changed, 34 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace.h 
b/arch/arm64/include/asm/stacktrace.h
index 902f9ed..f235b86 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -24,9 +24,18 @@
 #include 
 #include 
 
+#ifndef CONFIG_VMAP_STACK
+#define MAX_NR_STACKS  2
+#elif !defined(CONFIG_ARM_SDE_INTERFACE)
+#define MAX_NR_STACKS  3
+#else
+#define MAX_NR_STACKS  4
+#endif
+
 struct stackframe {
unsigned long fp;
unsigned long pc;
+   int nr_stacks;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
int graph;
 #endif
@@ -92,4 +101,20 @@ static inline bool on_accessible_stack(struct task_struct 
*tsk, unsigned long sp
return false;
 }
 
+
+static inline bool on_same_stack(struct task_struct *tsk,
+   unsigned long sp1, unsigned long sp2)
+{
+   if (on_task_stack(tsk, sp1) && on_task_stack(tsk, sp2))
+   return true;
+   if (on_irq_stack(sp1) && on_irq_stack(sp2))
+   return true;
+   if (on_overflow_stack(sp1) && on_overflow_stack(sp2))
+   return true;
+   if (on_sdei_stack(sp1) && on_sdei_stack(sp2))
+   return true;
+
+   return false;
+}
+
 #endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index d5718a0..d75f59d 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -43,6 +43,7 @@
 int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 {
unsigned long fp = frame->fp;
+   bool same_stack;
 
if (fp & 0xf)
return -EINVAL;
@@ -56,6 +57,13 @@ int notrace unwind_frame(struct task_struct *tsk, struct 
stackframe *frame)
frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
 
+   same_stack = on_same_stack(tsk, fp, frame->fp);
+
+   if (fp <= frame->fp && same_stack)
+   return -EINVAL;
+   if (!same_stack && ++frame->nr_stacks > MAX_NR_STACKS)
+   return -EINVAL;
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk->ret_stack &&
(frame->pc == (unsigned long)return_to_handler)) {
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index ba964da..ee0403d 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -121,6 +121,7 @@ void dump_backtrace(struct pt_regs *regs, struct 
task_struct *tsk)
frame.fp = thread_saved_fp(tsk);
frame.pc = thread_saved_pc(tsk);
}
+   frame.nr_stacks = 1;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = tsk->curr_ret_stack;
 #endif
-- 
1.9.1
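
The frame pointer check added to unwind_frame() can be modelled in user space. The sketch below is a toy, not the arm64 unwinder: it assumes a single stack held in an array, uses array indices instead of addresses, and `walk_frames()` is a hypothetical name. It only shows why requiring each caller record to sit strictly higher than the current one bounds the walk; the separate cap on stack switches (`MAX_NR_STACKS`) is not modelled.

```c
#include <stddef.h>

/*
 * Toy model: `mem` is one stack. A frame record at index fp holds the
 * caller's fp in mem[fp] and the return address in mem[fp + 1].
 * An fp of 0 terminates the chain.
 * Returns the number of frames walked, or -1 on a corrupt chain.
 */
static int walk_frames(const unsigned long *mem, size_t len, unsigned long fp)
{
	int depth = 0;

	while (fp != 0) {
		unsigned long next;

		if (len < 2 || fp > len - 2)
			return -1;	/* record lies outside the stack */
		next = mem[fp];
		/* On one stack the caller's record must sit strictly higher. */
		if (next != 0 && next <= fp)
			return -1;	/* loop or backwards link: give up */
		depth++;
		fp = next;
	}
	return depth;
}
```

Because fp strictly increases on every iteration and is bounded by the stack size, the loop terminates even if the stack contents change underneath the walker.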

Re: [PATCH 2/7] i2c: i2c-mux-gpio: move header to platform_data

2018-04-19 Thread Peter Korsgaard

> "WS" == Wolfram Sang  writes:

WS> This header only contains platform_data. Move it to the proper directory.
WS> Signed-off-by: Wolfram Sang 

Thanks,

Acked-by: Peter Korsgaard 

--
Bye, Peter Korsgaard

[PATCH v6 0/2] PCI: mediatek: Fixups for the IRQ handle routine and MT7622's class code

2018-04-19 Thread honghui.zhang

From: Honghui Zhang 

Two fixups for mediatek's host bridge:
The first patch fixes up the class type and vendor ID for MT7622.
The second patch fixes up the IRQ handling routine by using the irq_chip
solution to avoid IRQ reentry, which may exist for both MT2712 and MT7622.

Change since v5:
 - Make the comments consistent with the code modification in the first patch.
 - Use writew to perform a 16-bit write.
 - Use the irq_chip solution to fix the IRQ issue.

The v5 patchset could be found in:
 https://patchwork.kernel.org/patch/10133303
 https://patchwork.kernel.org/patch/10133305

Change since v4:
 - Only set up the vendor ID for MT7622, ignoring the device ID since mediatek's
   host bridge driver does not care about the device ID.

Change since v3:
 - Setup the class type and vendor ID at the beginning of startup instead
   of in a quirk.
 - Add mediatek's vendor ID, it could be found in:
   https://pcisig.com/membership/member-companies?combine=&page=4

Change since v2:
 - Move the initialization of the iterator before the loop to fix an
   INTx IRQ issue in the first patch

Change since v1:
 - Add the second patch.
 - Make the first patch's commit message more standard.
Honghui Zhang (2):
  PCI: mediatek: Set up vendor ID and class type for MT7622
  PCI: mediatek: Using chained IRQ to setup IRQ handle

 drivers/pci/host/pcie-mediatek.c | 220 +++
 include/linux/pci_ids.h  |   2 +
 2 files changed, 133 insertions(+), 89 deletions(-)

-- 
2.6.4

[PATCH 2/2] PCI: mediatek: Using chained IRQ to setup IRQ handle

2018-04-19 Thread honghui.zhang

From: Honghui Zhang 

Use the irq_chip solution to set up IRQs, for consistency with the IRQ framework.

Signed-off-by: Honghui Zhang 
---
 drivers/pci/host/pcie-mediatek.c | 192 +--
 1 file changed, 105 insertions(+), 87 deletions(-)

diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
index c3dc549..1d9c6f1 100644
--- a/drivers/pci/host/pcie-mediatek.c
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -11,8 +11,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -130,14 +132,12 @@ struct mtk_pcie_port;
 /**
  * struct mtk_pcie_soc - differentiate between host generations
  * @need_fix_class_id: whether this host's class ID needed to be fixed or not
- * @has_msi: whether this host supports MSI interrupts or not
  * @ops: pointer to configuration access functions
  * @startup: pointer to controller setting functions
  * @setup_irq: pointer to initialize IRQ functions
  */
 struct mtk_pcie_soc {
bool need_fix_class_id;
-   bool has_msi;
struct pci_ops *ops;
int (*startup)(struct mtk_pcie_port *port);
int (*setup_irq)(struct mtk_pcie_port *port, struct device_node *node);
@@ -161,7 +161,9 @@ struct mtk_pcie_soc {
  * @lane: lane count
  * @slot: port slot
  * @irq_domain: legacy INTx IRQ domain
+ * @inner_domain: inner IRQ domain
  * @msi_domain: MSI IRQ domain
+ * @lock: protect the msi_irq_in_use bitmap
  * @msi_irq_in_use: bit map for assigned MSI IRQ
  */
 struct mtk_pcie_port {
@@ -179,7 +181,9 @@ struct mtk_pcie_port {
u32 lane;
u32 slot;
struct irq_domain *irq_domain;
+   struct irq_domain *inner_domain;
struct irq_domain *msi_domain;
+   struct mutex lock;
DECLARE_BITMAP(msi_irq_in_use, MTK_MSI_IRQS_NUM);
 };
 
@@ -446,103 +450,122 @@ static int mtk_pcie_startup_port_v2(struct 
mtk_pcie_port *port)
return 0;
 }
 
-static int mtk_pcie_msi_alloc(struct mtk_pcie_port *port)
+static void mtk_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 {
-   int msi;
+   struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data);
+   phys_addr_t addr;
 
-   msi = find_first_zero_bit(port->msi_irq_in_use, MTK_MSI_IRQS_NUM);
-   if (msi < MTK_MSI_IRQS_NUM)
-   set_bit(msi, port->msi_irq_in_use);
-   else
-   return -ENOSPC;
+   /* MT2712/MT7622 only support 32-bit MSI addresses */
+   addr = virt_to_phys(port->base + PCIE_MSI_VECTOR);
+   msg->address_hi = 0;
+   msg->address_lo = lower_32_bits(addr);
 
-   return msi;
+   msg->data = data->hwirq;
+
+   dev_dbg(port->pcie->dev, "msi#%d address_hi %#x address_lo %#x\n",
+   (int)data->hwirq, msg->address_hi, msg->address_lo);
 }
 
-static void mtk_pcie_msi_free(struct mtk_pcie_port *port, unsigned long hwirq)
+static int mtk_msi_set_affinity(struct irq_data *irq_data,
+  const struct cpumask *mask, bool force)
 {
-   clear_bit(hwirq, port->msi_irq_in_use);
+   return -EINVAL;
 }
 
-static int mtk_pcie_msi_setup_irq(struct msi_controller *chip,
- struct pci_dev *pdev, struct msi_desc *desc)
-{
-   struct mtk_pcie_port *port;
-   struct msi_msg msg;
-   unsigned int irq;
-   int hwirq;
-   phys_addr_t msg_addr;
+static struct irq_chip mtk_msi_bottom_irq_chip = {
+   .name   = "MTK MSI",
+   .irq_compose_msi_msg= mtk_compose_msi_msg,
+   .irq_set_affinity   = mtk_msi_set_affinity,
+   .irq_mask   = pci_msi_mask_irq,
+   .irq_unmask = pci_msi_unmask_irq,
+};
 
-   port = mtk_pcie_find_port(pdev->bus, pdev->devfn);
-   if (!port)
-   return -EINVAL;
+static int mtk_pcie_irq_domain_alloc(struct irq_domain *domain, unsigned int 
virq,
+unsigned int nr_irqs, void *args)
+{
+   struct mtk_pcie_port *port = domain->host_data;
+   unsigned long bit;
 
-   hwirq = mtk_pcie_msi_alloc(port);
-   if (hwirq < 0)
-   return hwirq;
+   WARN_ON(nr_irqs != 1);
+   mutex_lock(&port->lock);
 
-   irq = irq_create_mapping(port->msi_domain, hwirq);
-   if (!irq) {
-   mtk_pcie_msi_free(port, hwirq);
-   return -EINVAL;
+   bit = find_first_zero_bit(port->msi_irq_in_use, MTK_MSI_IRQS_NUM);
+   if (bit >= MTK_MSI_IRQS_NUM) {
+   mutex_unlock(&port->lock);
+   return -ENOSPC;
}
 
-   chip->dev = &pdev->dev;
-
-   irq_set_msi_desc(irq, desc);
+   __set_bit(bit, port->msi_irq_in_use);
 
-   /* MT2712/MT7622 only support 32-bit MSI addresses */
-   msg_addr = virt_to_phys(port->base + PCIE_MSI_VECTOR);
-   msg.address_hi = 0;
-   msg.address_lo = lower_32_bits(msg_addr);
-   msg.data = hwirq;
+   mutex_unlock(&port->lock);
 
-   pci_write_msi

[PATCH v6 1/2] PCI: mediatek: Set up vendor ID and class type for MT7622

2018-04-19 Thread honghui.zhang

From: Honghui Zhang 

MT7622's hardware default values of vendor ID and class type are not correct.
Fix that by setting up the correct values before link-up with the Endpoint.

Signed-off-by: Honghui Zhang 
---
 drivers/pci/host/pcie-mediatek.c | 30 +++---
 include/linux/pci_ids.h  |  2 ++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
index a8b20c5..c3dc549 100644
--- a/drivers/pci/host/pcie-mediatek.c
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -66,6 +66,10 @@
 
 /* PCIe V2 per-port registers */
 #define PCIE_MSI_VECTOR0x0c0
+
+#define PCIE_CONF_VEND_ID  0x100
+#define PCIE_CONF_CLASS_ID 0x106
+
 #define PCIE_INT_MASK  0x420
 #define INTX_MASK  GENMASK(19, 16)
 #define INTX_SHIFT 16
@@ -125,12 +129,14 @@ struct mtk_pcie_port;
 
 /**
  * struct mtk_pcie_soc - differentiate between host generations
+ * @need_fix_class_id: whether this host's class ID needed to be fixed or not
  * @has_msi: whether this host supports MSI interrupts or not
  * @ops: pointer to configuration access functions
  * @startup: pointer to controller setting functions
  * @setup_irq: pointer to initialize IRQ functions
  */
 struct mtk_pcie_soc {
+   bool need_fix_class_id;
bool has_msi;
struct pci_ops *ops;
int (*startup)(struct mtk_pcie_port *port);
@@ -375,6 +381,7 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port 
*port)
 {
struct mtk_pcie *pcie = port->pcie;
struct resource *mem = &pcie->mem;
+   const struct mtk_pcie_soc *soc = port->pcie->soc;
u32 val;
size_t size;
int err;
@@ -403,6 +410,15 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port 
*port)
   PCIE_MAC_SRSTB | PCIE_CRSTB;
writel(val, port->base + PCIE_RST_CTRL);
 
+   /* Set up vendor ID and class code */
+   if (soc->need_fix_class_id) {
+   val = PCI_VENDOR_ID_MEDIATEK;
+   writew(val, port->base + PCIE_CONF_VEND_ID);
+
+   val = PCI_CLASS_BRIDGE_PCI;
+   writew(val, port->base + PCIE_CONF_CLASS_ID);
+   }
+
/* 100ms timeout value should be enough for Gen1/2 training */
err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_V2, val,
 !!(val & PCIE_PORT_LINKUP_V2), 20,
@@ -1142,7 +1158,15 @@ static const struct mtk_pcie_soc mtk_pcie_soc_v1 = {
.startup = mtk_pcie_startup_port,
 };
 
-static const struct mtk_pcie_soc mtk_pcie_soc_v2 = {
+static const struct mtk_pcie_soc mtk_pcie_soc_mt2712 = {
+   .has_msi = true,
+   .ops = &mtk_pcie_ops_v2,
+   .startup = mtk_pcie_startup_port_v2,
+   .setup_irq = mtk_pcie_setup_irq,
+};
+
+static const struct mtk_pcie_soc mtk_pcie_soc_mt7622 = {
+   .need_fix_class_id = true,
.has_msi = true,
.ops = &mtk_pcie_ops_v2,
.startup = mtk_pcie_startup_port_v2,
@@ -1152,8 +1176,8 @@ static const struct mtk_pcie_soc mtk_pcie_soc_v2 = {
 static const struct of_device_id mtk_pcie_ids[] = {
{ .compatible = "mediatek,mt2701-pcie", .data = &mtk_pcie_soc_v1 },
{ .compatible = "mediatek,mt7623-pcie", .data = &mtk_pcie_soc_v1 },
-   { .compatible = "mediatek,mt2712-pcie", .data = &mtk_pcie_soc_v2 },
-   { .compatible = "mediatek,mt7622-pcie", .data = &mtk_pcie_soc_v2 },
+   { .compatible = "mediatek,mt2712-pcie", .data = &mtk_pcie_soc_mt2712 },
+   { .compatible = "mediatek,mt7622-pcie", .data = &mtk_pcie_soc_mt7622 },
{},
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index a6b3066..9d4fca5 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2115,6 +2115,8 @@
 
 #define PCI_VENDOR_ID_MYRICOM  0x14c1
 
+#define PCI_VENDOR_ID_MEDIATEK 0x14c3
+
 #define PCI_VENDOR_ID_TITAN0x14D2
 #define PCI_DEVICE_ID_TITAN_010L   0x8001
 #define PCI_DEVICE_ID_TITAN_100L   0x8010
-- 
2.6.4

Re: DOS by unprivileged user

2018-04-19 Thread Mike Galbraith

On Thu, 2018-04-19 at 21:13 +0200, Ferry Toth wrote:
> It appears any ordinary user can easily create a DOS on linux.
> 
> One sure way to reproduce this is to open gitk on the linux kernel repo 
> (SIC) on a machine with 8GB RAM 16 GB swap on a HDD with btrfs and quad core 
> + hyperthreading. But it will be easy enough to get the same effect with more
> RAM, other fs etc.
> 
> In this case gitk allocates more and more memory (until my system freezes 
> 6.5GB of 7.5GB available), the system starts swapping or writing to tmp files
> (can't investigate as there is no time until it freezes) and the io wait 
> goes to 100% on all cores. At this point it is impossible to login from 
> remote and local keyboard and mouse are frozen. Hard reset is the only way 
> out at this point.

datapoint: my i4790/ext4 box running master.yesterday booted mem=8G
became highly unpleasant to use, but I retained control, and the all
cores going to 100% thing did not happen at any time.

I didn't try constraining on the gitk user, just turned it loose a few
times to see if it managed to render box effectively dead.  It failed
to kill my box, but (expectedly) did make it suck rocks.

-Mike

Re: [PATCH] tpm: moves the delay_msec increment after sleep in tpm_transmit()

2018-04-19 Thread Jarkko Sakkinen

On Tue, Apr 10, 2018 at 03:31:09PM +0300, Jarkko Sakkinen wrote:
> On Mon, 2018-04-09 at 10:29 -0400, Mimi Zohar wrote:
> > If this change is acceptable, do you want to make the change or should Nayna
> > repost the patch?
> 
> No need. I'll move on to testing.

Tested-by: Jarkko Sakkinen 
Reviewed-by: Jarkko Sakkinen 

/Jarkko

Re: [PATCH 2/2] cpufreq: brcmstb-avs-cpufreq: prefer SCMI cpufreq if supported

2018-04-19 Thread Viresh Kumar

On 19-04-18, 11:37, Sudeep Holla wrote:
> 
> 
> On 19/04/18 05:16, Viresh Kumar wrote:
> > On 18-04-18, 08:56, Markus Mayer wrote:
> >> From: Jim Quinlan 
> >>
> >> If the SCMI cpufreq driver is supported, we bail, so that the new
> >> approach can be used.
> >>
> >> Signed-off-by: Jim Quinlan 
> >> Signed-off-by: Markus Mayer 
> >> ---
> >>  drivers/cpufreq/brcmstb-avs-cpufreq.c | 16 
> >>  1 file changed, 16 insertions(+)
> >>
> >> diff --git a/drivers/cpufreq/brcmstb-avs-cpufreq.c 
> >> b/drivers/cpufreq/brcmstb-avs-cpufreq.c
> >> index b07559b9ed99..b4861a730162 100644
> >> --- a/drivers/cpufreq/brcmstb-avs-cpufreq.c
> >> +++ b/drivers/cpufreq/brcmstb-avs-cpufreq.c
> >> @@ -164,6 +164,8 @@
> >>  #define BRCM_AVS_CPU_INTR "brcm,avs-cpu-l2-intr"
> >>  #define BRCM_AVS_HOST_INTR"sw_intr"
> >>  
> >> +#define ARM_SCMI_COMPAT   "arm,scmi"
> >> +
> >>  struct pmap {
> >>unsigned int mode;
> >>unsigned int p1;
> >> @@ -511,6 +513,20 @@ static int brcm_avs_prepare_init(struct 
> >> platform_device *pdev)
> >>struct device *dev;
> >>int host_irq, ret;
> >>  
> >> +  /*
> >> +   * If the SCMI cpufreq driver is supported, we bail, so that the more
> >> +   * modern approach can be used.
> >> +   */
> >> +  if (IS_ENABLED(CONFIG_ARM_SCMI_PROTOCOL)) {
> >> +  struct device_node *np;
> >> +
> >> +  np = of_find_compatible_node(NULL, NULL, ARM_SCMI_COMPAT);
> >> +  if (np) {
> >> +  of_node_put(np);
> >> +  return -ENXIO;
> >> +  }
> >> +  }
> >> +
> > 
> > What about adding !CONFIG_ARM_SCMI_PROTOCOL in Kconfig dependency and don't
> > compile the driver at all ?
> > 
> 
> Unfortunately, that may not be good idea with single image needing both
> configs to be enabled.

Sure, but looking at the above code, it looked like they don't need the other
config if SCMI is enabled.

-- 
viresh

Re: [RFC/RFT patch 0/7] timekeeping: Unify clock MONOTONIC and clock BOOTTIME

2018-04-19 Thread David Herrmann

Hey

On Tue, Mar 13, 2018 at 7:11 PM, John Stultz  wrote:
> On Mon, Mar 12, 2018 at 11:36 PM, Ingo Molnar  wrote:
>> Ok, I have edited all the changelogs accordingly (and also flipped around the
>> 'clock MONOTONIC' language to the more readable 'the MONOTONIC clock' 
>> variant),
>> the resulting titles are (in order):
>>
>>  72199320d49d: timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock
>>  d6ed449afdb3: timekeeping: Make the MONOTONIC clock behave like the 
>> BOOTTIME clock
>>  f2d6fdbfd238: Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior
>>  d6c7270e913d: timekeeping: Remove boot time specific code
>>  7250a4047aa6: posix-timers: Unify MONOTONIC and BOOTTIME clock behavior
>>  127bfa5f4342: hrtimer: Unify MONOTONIC and BOOTTIME clock behavior
>>  92af4dcb4e1c: tracing: Unify the "boot" and "mono" tracing clocks
>>
>> I'll push these out after testing.
>
> I'm still anxious about userspace effects given how much I've seen the
> current behavior documented, and wouldn't have pushed for this myself (I'm
> a worrier), but at least I'm not seeing any failures in initial
> testing w/ kselftest so far.

I get lots of timer errors on Arch Linux when booting current master, after
a suspend/resume cycle. Just a selection of errors I see on resume:

systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
rtkit-daemon[742]: The canary thread is apparently starving. Taking action.
systemd[1]: systemd-udevd.service: Watchdog timeout (limit 3min)!
systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
kernel: e1000e :00:1f.6: Failed to restore TIMINCA clock rate delta: -22

Lots of crashes with SIGABRT due to these.

I did not bisect it, but it sounds related to me. Also, user-space
uses CLOCK_MONOTONIC for watchdog timers. That is, a process is
required to respond to a watchdog-request in a given MONOTONIC
time-frame. If this jumps during suspend/resume, watchdogs will fire
immediately. I don't see how this can work with the new MONOTONIC
behavior?

Thanks
David
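
To make the watchdog concern above concrete, here is a minimal userspace sketch of a deadline check built on CLOCK_MONOTONIC. The helper names (elapsed_ns, watchdog_expired, monotonic_now) and the way the limit is passed in are assumptions for illustration only; this is not how systemd implements its watchdog.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical helper: nanoseconds elapsed between two clock readings. */
static int64_t elapsed_ns(const struct timespec *start,
			  const struct timespec *now)
{
	return (int64_t)(now->tv_sec - start->tv_sec) * NSEC_PER_SEC +
	       (now->tv_nsec - start->tv_nsec);
}

/* Hypothetical check: true if the last keep-alive is older than the limit. */
static bool watchdog_expired(const struct timespec *last_ping,
			     const struct timespec *now, int64_t limit_sec)
{
	return elapsed_ns(last_ping, now) > limit_sec * NSEC_PER_SEC;
}

/* Read the clock that the message above says watchdogs rely on. */
static struct timespec monotonic_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts;
}
```

If the clock does not advance while the machine is suspended, the difference between last_ping and the reading taken after resume stays small. If the clock also counts suspended time, an hour of suspend makes a 180 second limit (the "3min" in the log above) look exceeded on the very first check after resume.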

Re: [greybus-dev] [PATCH 47/61] staging: greybus: simplify getting .drvdata

2018-04-19 Thread Viresh Kumar

On 19-04-18, 16:06, Wolfram Sang wrote:
> We should get drvdata from struct device directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang 
> ---
> 
> Build tested only. buildbot is happy. Please apply individually.
> 
>  drivers/staging/greybus/arche-platform.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh

Re: [PATCH 08/61] dmaengine: dw: simplify getting .drvdata

2018-04-19 Thread Viresh Kumar

On 19-04-18, 16:05, Wolfram Sang wrote:
> We should get drvdata from struct device directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang 
> ---
> 
> Build tested only. buildbot is happy. Please apply individually.
> 
>  drivers/dma/dw/platform.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh

Re: [PATCH] f2fs: sepearte hot/cold in free nid

2018-04-19 Thread Chao Yu

On 2018/4/20 11:37, Jaegeuk Kim wrote:
> On 04/20, Chao Yu wrote:
>> As most indirect node, dindirect node, and xattr node won't be updated
>> after they are created, but inode node and other direct node will change
>> more frequently, so store their nat entries mixedly in whole nat table
>> will suffer:
>> - fragment nat table soon due to different update rate
>> - more nat block update due to fragmented nat table
>>
>> In order to solve above issue, we're trying to separate whole nat table to
>> two part:
>> a. Hot free nid area:
>>  - range: [nid #0, nid #x)
>>  - store node block address for
>>* inode node
>>* other direct node
>> b. Cold free nid area:
>>  - range: [nid #x, max nid)
>>  - store node block address for
>>* indirect node
>>* dindirect node
>>* xattr node
>>
>> Allocation strategy example:
>>
>> Free nid: '-'
>> Used nid: '='
>>
>> 1. Initial status:
>> Free Nids:   |---|
>>              ^   ^   ^   ^
>> Alloc Range: |---|   |---|
>>              hot_start   hot_end   cold_start  cold_end
>>
>> 2. Free nids have run out:
>> Free Nids:   |===-===|
>>              ^   ^   ^   ^
>> Alloc Range: |===|   |===|
>>              hot_start   hot_end   cold_start  cold_end
>>
>> 3. Expand hot/cold area range:
>> Free Nids:   |===-===|
>>              ^   ^   ^   ^
>> Alloc Range: |===|   |===|
>>              hot_start   hot_end cold_start  cold_end
>>
>> 4. Hot free nids have run out:
>> Free Nids:   |===-===|
>>              ^   ^   ^   ^
>> Alloc Range: |===|   |===|
>>              hot_start   hot_end cold_start  cold_end
>>
>> 5. Expand hot area range, hot/cold area boundary has been fixed:
>> Free Nids:   |===-===|
>>              ^   ^   ^
>> Alloc Range: |===|===|
>>              hot_start   hot_end(cold_start)   cold_end
>>
>>
>> Run xfstests with generic/*:
>>
>> before:
>> node_write:  169660
>> cp_count:    60118
>> node/cp:     2.82
>>
>> after:
>> node_write:  159145
>> cp_count:    84501
>> node/cp:     2.64
> 
> Nice try, though I don't see much benefit from this huge patch. I guess we may
> be able to find a more efficient way to address this issue rather than changing
> the whole stable code.

IMO, building on this, we can later add more allocation policies for managing
free nid resources and get more benefit.

If you are worried about code stability, we can queue this patch in the dev-test
branch and test it for a longer time.
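
For readers following along, the split described in the changelog (hot nids from [nid #0, nid #x), cold nids from [nid #x, max nid)) can be sketched as two cursors over one nid space. This is only a simplified userspace model: the struct and function names are hypothetical, nids are never returned to the pool, and the range expansion steps from the diagrams are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t nid_t;

enum nid_temp { NID_HOT, NID_COLD };

/* Hypothetical allocator state: two cursors over one nid space. */
struct nid_areas {
	nid_t hot_next;		/* next candidate in [hot_next, boundary) */
	nid_t boundary;		/* "nid #x": first nid of the cold area */
	nid_t cold_next;	/* next candidate in [cold_next, max_nid) */
	nid_t max_nid;		/* one past the last valid nid */
};

/* Hand out the next nid from the area matching the node temperature. */
static bool alloc_nid(struct nid_areas *a, enum nid_temp temp, nid_t *out)
{
	if (temp == NID_HOT) {
		if (a->hot_next >= a->boundary)
			return false;	/* hot area exhausted */
		*out = a->hot_next++;
		return true;
	}
	if (a->cold_next >= a->max_nid)
		return false;		/* cold area exhausted */
	*out = a->cold_next++;
	return true;
}
```

In this model inode and direct node allocations would pass NID_HOT, while indirect, dindirect and xattr node allocations would pass NID_COLD, so their nat entries end up in different nat blocks.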

> 
> How about getting a free nid in the list from head or tail separately?

I don't think this will help on an image that has been in use for a long time:
the nat table will be fragmented anyway, so we won't know whether a free nid at
the head or at the tail comes from a hot nat block or a cold nat block.

Anyway, I will have a try.

Thanks,

> 
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/checkpoint.c |   4 -
>>  fs/f2fs/debug.c  |   6 +-
>>  fs/f2fs/f2fs.h   |  19 +++-
>>  fs/f2fs/inode.c  |   2 +-
>>  fs/f2fs/namei.c  |   2 +-
>>  fs/f2fs/node.c   | 302 
>> ---
>>  fs/f2fs/node.h   |  17 +--
>>  fs/f2fs/segment.c|   8 +-
>>  fs/f2fs/shrinker.c   |   3 +-
>>  fs/f2fs/xattr.c  |  10 +-
>>  10 files changed, 221 insertions(+), 152 deletions(-)
>>
>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>> index 96785ffc6181..c17feec72c74 100644
>> --- a/fs/f2fs/checkpoint.c
>> +++ b/fs/f2fs/checkpoint.c
>> @@ -1029,14 +1029,10 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
>>  static void __prepare_cp_block(struct f2fs_sb_info *sbi)
>>  {
>>  struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
>> -struct f2fs_nm_info *nm_i = NM_I(sbi);
>> -nid_t last_nid = nm_i->next_scan_nid;
>>  
>> -next_free_nid(sbi, &last_nid);
>>  ckpt->valid_block_count = cpu_to_le64(valid_user_blocks(sbi));
>>  ckpt->valid_node_cou

Re: [PATCH] IB/core: Make ib_mad_client_id atomic

2018-04-19 Thread Doug Ledford

On Wed, 2018-04-18 at 16:24 +0200, Håkon Bugge wrote:
> Two kernel threads may get the same value for agent.hi_tid, if the
> agents are registered for different ports. As of now, this works, as
> the agent list is per port.
> 
> It is however confusing and not future robust. Hence, making it
> atomic.
> 

People sometimes underestimate the performance penalty of atomic ops. 
Every atomic op is the equivalent of a spin_lock/spin_unlock pair.  This
is why two atomics are worse than taking a spin_lock, doing what you
have to do, and releasing the spin_lock.  Is this really what you want
for a "confusing, let's make it robust" issue?
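
As a point of reference for the trade-off described above, here is a small userspace C11 sketch of the two ways to hand out unique IDs: one atomic read-modify-write per ID, or a plain counter guarded by a minimal spinlock. The names are hypothetical and this is not the ib_mad code; it only illustrates that the lock and unlock steps are themselves atomic operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Variant 1: a single atomic counter hands out unique IDs. */
static atomic_uint_fast32_t next_id_atomic;

static uint32_t alloc_id_atomic(void)
{
	/* atomic_fetch_add returns the previous value, so add 1 for the new ID. */
	return (uint32_t)atomic_fetch_add(&next_id_atomic, 1) + 1;
}

/* Variant 2: a plain counter protected by a minimal spinlock. */
static atomic_flag id_lock = ATOMIC_FLAG_INIT;
static uint32_t next_id_locked;

static uint32_t alloc_id_locked(void)
{
	uint32_t id;

	/* Acquire: one atomic test-and-set per attempt. */
	while (atomic_flag_test_and_set_explicit(&id_lock, memory_order_acquire))
		;	/* spin until the lock is free */
	id = ++next_id_locked;
	/* Release: one more atomic store. */
	atomic_flag_clear_explicit(&id_lock, memory_order_release);
	return id;
}
```

For a single counter update the atomic variant does less work. The locked variant starts to pay off once several fields have to change together under the same lock.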

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer

2018-04-19 Thread Jaegeuk Kim

On 04/20, Chao Yu wrote:
> On 2018/4/20 11:19, Jaegeuk Kim wrote:
> > On 04/18, Chao Yu wrote:
> >> Thread A                Thread B                Thread C
> >> - f2fs_remount
> >>  - stop_gc_thread
> >>                         - f2fs_sbi_store
> >>                                                 - issue_discard_thread
> >>    sbi->gc_thread = NULL;
> >>                           sbi->gc_thread->gc_wake = 1
> >>                                                   access sbi->gc_thread->gc_urgent
> > 
> > Do we simply need a lock for this?
> 
> The code would become more complicated, since every existing and future field
> accessed through the sbi->gc_thread pointer has to be handled, and it would
> add unneeded lock overhead, right?
> 
> So let's just allocate memory during fill_super?

No, the case is when stopping the thread. We can keep the gc_thread and indicate
its state as "disabled". Then, we need to handle other paths with the state?
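
A rough userspace sketch of that idea: the context is allocated once and stays valid, and a state flag replaces the NULL check. All names here are hypothetical stand-ins rather than the f2fs structures, and the sketch leaves out the synchronization a real multi-threaded path would still need around the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the gc thread context. */
struct gc_ctx {
	bool running;		/* state flag instead of a NULL pointer */
	unsigned int gc_wake;
	unsigned int gc_urgent;
};

/* Hypothetical stand-in for the superblock info. */
struct sb_info {
	struct gc_ctx *gc;	/* allocated once, valid until teardown */
};

static int init_gc_ctx(struct sb_info *sbi)
{
	sbi->gc = calloc(1, sizeof(*sbi->gc));
	return sbi->gc ? 0 : -1;
}

static void start_gc(struct sb_info *sbi)
{
	sbi->gc->running = true;
}

/* Stopping only flips the state; the memory stays in place. */
static void stop_gc(struct sb_info *sbi)
{
	sbi->gc->running = false;
}

/* Other paths test the state rather than the pointer. */
static bool wake_gc(struct sb_info *sbi)
{
	if (!sbi->gc->running)
		return false;
	sbi->gc->gc_wake = 1;
	return true;
}

static void destroy_gc_ctx(struct sb_info *sbi)
{
	free(sbi->gc);
	sbi->gc = NULL;
}
```

The open question from the message above shows up in wake_gc(): every path that used to test the pointer now has to test the state instead.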

> 
> Thanks,
> 
> > 
> >>
> >> Previously, we allocated memory for sbi->gc_thread based on the background
> >> gc thread mount option, and the memory could be released when that mount
> >> option was turned off. However, several places still access the gc_thread
> >> pointer without considering the race condition, resulting in a NULL pointer
> >> dereference.
> >>
> >> In order to fix this issue, keep gc_thread structure valid in sbi all
> >> the time instead of alloc/free it dynamically.
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/debug.c   |  3 +--
> >>  fs/f2fs/f2fs.h|  7 +++
> >>  fs/f2fs/gc.c  | 58 
> >> +--
> >>  fs/f2fs/segment.c |  4 ++--
> >>  fs/f2fs/super.c   | 13 +++--
> >>  fs/f2fs/sysfs.c   |  8 
> >>  6 files changed, 60 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> >> index 715beb85e9db..7bb036a3bb81 100644
> >> --- a/fs/f2fs/debug.c
> >> +++ b/fs/f2fs/debug.c
> >> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
> >>si->cache_mem = 0;
> >>  
> >>/* build gc */
> >> -  if (sbi->gc_thread)
> >> -  si->cache_mem += sizeof(struct f2fs_gc_kthread);
> >> +  si->cache_mem += sizeof(struct f2fs_gc_kthread);
> >>  
> >>/* build merge flush thread */
> >>if (SM_I(sbi)->fcc_info)
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index 567c6bb57ae3..c553f63199e8 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct f2fs_sb_info *sbi)
> >>return (struct sit_info *)(SM_I(sbi)->sit_info);
> >>  }
> >>  
> >> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
> >> +{
> >> +  return (struct f2fs_gc_kthread *)(sbi->gc_thread);
> >> +}
> >> +
> >>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
> >>  {
> >>return (struct free_segmap_info *)(SM_I(sbi)->free_info);
> >> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
> >>  /*
> >>   * gc.c
> >>   */
> >> +int init_gc_context(struct f2fs_sb_info *sbi);
> >> +void destroy_gc_context(struct f2fs_sb_info * sbi);
> >>  int start_gc_thread(struct f2fs_sb_info *sbi);
> >>  void stop_gc_thread(struct f2fs_sb_info *sbi);
> >>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
> >> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >> index da89ca16a55d..7d310e454b77 100644
> >> --- a/fs/f2fs/gc.c
> >> +++ b/fs/f2fs/gc.c
> >> @@ -26,8 +26,8 @@
> >>  static int gc_thread_func(void *data)
> >>  {
> >>struct f2fs_sb_info *sbi = data;
> >> -  struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> >> -  wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
> >> +  struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> >> +  wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
> >>unsigned int wait_ms;
> >>  
> >>wait_ms = gc_th->min_sleep_time;
> >> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
> >>return 0;
> >>  }
> >>  
> >> -int start_gc_thread(struct f2fs_sb_info *sbi)
> >> +int init_gc_context(struct f2fs_sb_info *sbi)
> >>  {
> >>struct f2fs_gc_kthread *gc_th;
> >> -  dev_t dev = sbi->sb->s_bdev->bd_dev;
> >> -  int err = 0;
> >>  
> >>gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
> >> -  if (!gc_th) {
> >> -  err = -ENOMEM;
> >> -  goto out;
> >> -  }
> >> +  if (!gc_th)
> >> +  return -ENOMEM;
> >> +
> >> +  gc_th->f2fs_gc_task = NULL;
> >>  
> >>gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
> >>gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
> >> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
> >>gc_th->atomic_file[FG_GC] = 0;
> >>  
> >>sbi->gc_thread = gc_th;
> >> -  init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> >> -  sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> >> +
> >> +  return 0;
> >> +}
> >> +
> >> +void destroy_gc_co

Re: [RFC] vhost: introduce mdev based hardware vhost backend

2018-04-19 Thread Jason Wang




On 2018-04-20 02:40, Michael S. Tsirkin wrote:
> On Tue, Apr 10, 2018 at 03:25:45PM +0800, Jason Wang wrote:
> > > > > One problem is that, different virtio ring compatible devices
> > > > > may have different device interfaces. That is to say, we will
> > > > > need different drivers in QEMU. It could be troublesome. And
> > > > > that's what this patch trying to fix. The idea behind this
> > > > > patch is very simple: mdev is a standard way to emulate device
> > > > > in kernel.
> > > > So you just move the abstraction layer from qemu to kernel, and you still
> > > > need different drivers in kernel for different device interfaces of
> > > > accelerators. This looks even more complex than leaving it in qemu. As you
> > > > said, another idea is to implement userspace vhost backend for accelerators
> > > > which seems easier and could co-work with other parts of qemu without
> > > > inventing new type of messages.
> > > I'm not quite sure. Do you think it's acceptable to
> > > add various vendor specific hardware drivers in QEMU?
> > >
> >
> > I don't object but we need to figure out the advantages of doing it in qemu
> > too.
> >
> > Thanks
>
> To be frank kernel is exactly where device drivers belong.  DPDK did
> move them to userspace but that's merely a requirement for data path.
> *If* you can have them in kernel that is best:
> - update kernel and there's no need to rebuild userspace

Well, you still need to rebuild userspace since a new vhost backend is
required which relies vhost protocol through mdev API. And I believe
upgrading a userspace package is considered more lightweight than
upgrading the kernel. With mdev, we're likely to repeat the story of the
vhost API: dealing with features/versions and endlessly inventing new
APIs for new features. And you will still need to rebuild the userspace.

> - apps can be written in any language no need to maintain multiple
>   libraries or add wrappers

This is not a big issue, considering it's not a generic network driver
but an mdev driver; the only possible user is a VM.

> - security concerns are much smaller (ok people are trying to
>   raise the bar with IOMMUs and such, but it's already pretty
>   good even without)

Well, I think not; kernel bugs are much more serious than userspace
ones. And I bet the kernel driver itself won't be small.

> The biggest issue is that you let userspace poke at the
> device which is also allowed by the IOMMU to poke at
> kernel memory (needed for kernel driver to work).

I don't quite get it. The userspace driver could be built on top of VFIO
for sure, so kernel memory would be perfectly isolated in this case.

> Yes, maybe if device is not buggy it's all fine, but
> it's better if we do not have to trust the device
> otherwise the security picture becomes more murky.
>
> I suggested attaching a PASID to (some) queues - see my old post "using
> PASIDs to enable a safe variant of direct ring access".
>
> Then using IOMMU with VFIO to limit access through queue to correct
> ranges of memory.

Well, a userspace driver could benefit from this too. And we can even go
further by using nested IO page tables to share the IOVA address space
between devices and a VM.

Thanks

RE: [RFC] vhost: introduce mdev based hardware vhost backend

2018-04-19 Thread Liang, Cunming



> -----Original Message-----
> From: Bie, Tiwei
> Sent: Friday, April 20, 2018 11:28 AM
> To: Michael S. Tsirkin 
> Cc: Jason Wang ; alex.william...@redhat.com;
> ddut...@redhat.com; Duyck, Alexander H ;
> virtio-...@lists.oasis-open.org; linux-kernel@vger.kernel.org;
> k...@vger.kernel.org; virtualizat...@lists.linux-foundation.org;
> net...@vger.kernel.org; Daly, Dan ; Liang, Cunming
> ; Wang, Zhihong ; Tan,
> Jianfeng ; Wang, Xiao W ;
> Tian, Kevin 
> Subject: Re: [RFC] vhost: introduce mdev based hardware vhost backend
> 
> On Thu, Apr 19, 2018 at 09:40:23PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Apr 10, 2018 at 03:25:45PM +0800, Jason Wang wrote:
> > > > > > One problem is that, different virtio ring compatible devices
> > > > > > may have different device interfaces. That is to say, we will
> > > > > > need different drivers in QEMU. It could be troublesome. And
> > > > > > that's what this patch trying to fix. The idea behind this
> > > > > > patch is very simple: mdev is a standard way to emulate device
> > > > > > in kernel.
> > > > > So you just move the abstraction layer from qemu to kernel, and
> > > > > you still need different drivers in kernel for different device
> > > > > interfaces of accelerators. This looks even more complex than
> > > > > leaving it in qemu. As you said, another idea is to implement
> > > > > userspace vhost backend for accelerators which seems easier and
> > > > > > could co-work with other parts of qemu without inventing new type
> > > > > > of messages.
> > > > I'm not quite sure. Do you think it's acceptable to add various
> > > > vendor specific hardware drivers in QEMU?
> > > >
> > >
> > > I don't object but we need to figure out the advantages of doing it
> > > in qemu too.
> > >
> > > Thanks
> >
> > To be frank kernel is exactly where device drivers belong.  DPDK did
> > move them to userspace but that's merely a requirement for data path.
> > *If* you can have them in kernel that is best:
> > - update kernel and there's no need to rebuild userspace
> > - apps can be written in any language no need to maintain multiple
> >   libraries or add wrappers
> > - security concerns are much smaller (ok people are trying to
> >   raise the bar with IOMMUs and such, but it's already pretty
> >   good even without)
> >
> > The biggest issue is that you let userspace poke at the device which
> > is also allowed by the IOMMU to poke at kernel memory (needed for
> > kernel driver to work).
> 
> I think the device won't and shouldn't be allowed to poke at kernel memory.
> Its kernel driver needs some kernel memory to work. But the device doesn't
> have the access to them. Instead, the device only has the access to:
> 
> (1) the entire memory of the VM (if vIOMMU isn't used) or
> (2) the memory belongs to the guest virtio device (if
> vIOMMU is being used).
> 
> Below is the reason:
> 
> For the first case, we should program the IOMMU for the hardware device based
> on the info in the memory table which is the entire memory of the VM.
> 
> For the second case, we should program the IOMMU for the hardware device
> based on the info in the shadow page table of the vIOMMU.
> 
> So the memory can be accessed by the device is limited, it should be safe
> especially for the second case.
> 
> My concern is that, in this RFC, we don't program the IOMMU for the mdev
> device in the userspace via the VFIO API directly. Instead, we pass the memory
> table to the kernel driver via the mdev device (BAR0) and ask the driver to
> do the IOMMU programming. Some people may not like it. The main reason why we
> don't program IOMMU via VFIO API in userspace directly is that, currently
> IOMMU drivers don't support mdev bus.
> 
> >
> > Yes, maybe if device is not buggy it's all fine, but it's better if we
> > do not have to trust the device otherwise the security picture becomes
> > more murky.
> >
> > I suggested attaching a PASID to (some) queues - see my old post
> > "using PASIDs to enable a safe variant of direct ring access".
> 
Ideally, the device stays bound to its normal driver in the host while a few
queues, each attached to a PASID, are allocated on demand. Through the vhost
mdev transport channel, the data path capability of those queues (presented as
a device) can be exposed to the qemu vhost adaptor as a vDPA instance. That
avoids the limit on the number of VFs and provides vhost data path acceleration
at a finer granularity.

> It's pretty cool. We also have some similar ideas.
> Cunming will talk more about this.
> 
> Best regards,
> Tiwei Bie
> 
> >
> > Then using IOMMU with VFIO to limit access through queue to correct
> > ranges of memory.
> >
> >
> > --
> > MST

Re: [RFC] vhost: introduce mdev based hardware vhost backend

2018-04-19 Thread Michael S. Tsirkin

On Fri, Apr 20, 2018 at 11:28:07AM +0800, Tiwei Bie wrote:
> On Thu, Apr 19, 2018 at 09:40:23PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Apr 10, 2018 at 03:25:45PM +0800, Jason Wang wrote:
> > > > > > One problem is that, different virtio ring compatible devices
> > > > > > may have different device interfaces. That is to say, we will
> > > > > > need different drivers in QEMU. It could be troublesome. And
> > > > > > that's what this patch trying to fix. The idea behind this
> > > > > > patch is very simple: mdev is a standard way to emulate device
> > > > > > in kernel.
> > > > > So you just move the abstraction layer from qemu to kernel, and you
> > > > > still need different drivers in kernel for different device
> > > > > interfaces of accelerators. This looks even more complex than
> > > > > leaving it in qemu. As you said, another idea is to implement
> > > > > userspace vhost backend for accelerators which seems easier and
> > > > > could co-work with other parts of qemu without inventing new type
> > > > > of messages.
> > > > I'm not quite sure. Do you think it's acceptable to
> > > > add various vendor specific hardware drivers in QEMU?
> > > > 
> > > 
> > > I don't object but we need to figure out the advantages of doing it
> > > in qemu too.
> > > 
> > > Thanks
> > 
> > To be frank kernel is exactly where device drivers belong.  DPDK did
> > move them to userspace but that's merely a requirement for data path.
> > *If* you can have them in kernel that is best:
> > - update kernel and there's no need to rebuild userspace
> > - apps can be written in any language no need to maintain multiple
> >   libraries or add wrappers
> > - security concerns are much smaller (ok people are trying to
> >   raise the bar with IOMMUs and such, but it's already pretty
> >   good even without)
> > 
> > The biggest issue is that you let userspace poke at the
> > device which is also allowed by the IOMMU to poke at
> > kernel memory (needed for kernel driver to work).
> 
> I think the device won't and shouldn't be allowed to
> poke at kernel memory. Its kernel driver needs some
> kernel memory to work. But the device doesn't have
> the access to them. Instead, the device only has the
> access to:
> 
> (1) the entire memory of the VM (if vIOMMU isn't used)
> or
> (2) the memory belongs to the guest virtio device (if
> vIOMMU is being used).
> 
> Below is the reason:
> 
> For the first case, we should program the IOMMU for
> the hardware device based on the info in the memory
> table which is the entire memory of the VM.
> 
> For the second case, we should program the IOMMU for
> the hardware device based on the info in the shadow
> page table of the vIOMMU.
> 
> So the memory can be accessed by the device is limited,
> it should be safe especially for the second case.
> 
> My concern is that, in this RFC, we don't program the
> IOMMU for the mdev device in the userspace via the VFIO
> API directly. Instead, we pass the memory table to the
> kernel driver via the mdev device (BAR0) and ask the
> driver to do the IOMMU programming. Some people may not
> like it. The main reason why we don't program IOMMU via
> VFIO API in userspace directly is that, currently IOMMU
> drivers don't support mdev bus.

But it is a pci device after all, isn't it?
IOMMU drivers certainly support that ...

Another issue with this approach is that internal
kernel issues leak out to the interface.
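
To illustrate the access restriction Tiwei describes in the quoted text (the device may only reach memory that was programmed into the IOMMU from the VM's memory table), here is a toy userspace model. The struct and function names are hypothetical, and a real IOMMU works on page-granular mappings rather than a linear table scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory-table entry: one guest region mapped for DMA. */
struct mem_region {
	uint64_t iova;	/* start address as seen by the device */
	uint64_t size;	/* length in bytes */
};

/* True if [addr, addr + len) lies entirely inside one mapped region. */
static bool dma_allowed(const struct mem_region *tbl, size_t n,
			uint64_t addr, uint64_t len)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t start = tbl[i].iova;

		/* Written without addr + len so the check cannot overflow. */
		if (addr >= start && len <= tbl[i].size &&
		    addr - start <= tbl[i].size - len)
			return true;
	}
	return false;
}
```

Anything outside the table, including kernel memory that was never added to it, fails the check in this model.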

> > 
> > Yes, maybe if device is not buggy it's all fine, but
> > it's better if we do not have to trust the device
> > otherwise the security picture becomes more murky.
> > 
> > I suggested attaching a PASID to (some) queues - see my old post "using
> > PASIDs to enable a safe variant of direct ring access".
> 
> It's pretty cool. We also have some similar ideas.
> Cunming will talk more about this.
> 
> Best regards,
> Tiwei Bie

An extra benefit to this could be that requests with PASID
undergo an extra level of translation.
We could use it to avoid the need for shadowing on Intel.



Something like this:
- expose to guest a standard virtio device (no pasid support)
- back it by virtio device with pasid support on the host
  by attaching same pasid to all queues

now - guest will build 1 level of page tables

we build first level page tables for requests with pasid
and point the IOMMU to use the guest supplied page tables
for the second level of translation.

Now we do need to forward invalidations but we no
longer need to set the CM bit and shadow valid entries.



> > 
> > Then using IOMMU with VFIO to limit access through queue to correct
> > ranges of memory.
> > 
> > 
> > -- 
> > MST

Re: [PATCH v1 5/7] soc: mediatek: add a fixed wait for SRAM stable

2018-04-19 Thread Sean Wang

On Thu, 2018-04-19 at 12:33 +0200, Matthias Brugger wrote:
> 
> On 04/03/2018 09:15 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > MT7622_POWER_DOMAIN_WB doesn't send an ACK when its managed SRAM becomes
> > stable, which is not like the behavior the other power domains should
> > have. Therefore, it's necessary for such a power domain to have a fixed
> > and well-predefined duration to wait until its managed SRAM can be allowed
> > to access by all functions running on the top.
> > 
> > Signed-off-by: Sean Wang 
> > Cc: Matthias Brugger 
> > Cc: Ulf Hansson 
> > Cc: Weiyi Lu 
> > ---
> >  drivers/soc/mediatek/mtk-scpsys.c | 17 -
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-scpsys.c b/drivers/soc/mediatek/mtk-scpsys.c
> > index f9b7248..19aceb8 100644
> > --- a/drivers/soc/mediatek/mtk-scpsys.c
> > +++ b/drivers/soc/mediatek/mtk-scpsys.c
> > @@ -121,6 +121,7 @@ struct scp_domain_data {
> > u32 bus_prot_mask;
> > enum clk_id clk_id[MAX_CLKS];
> > bool active_wakeup;
> > +   u32 us_sram_fwait;
> 
> Before adding more and more fields to scp_domain_data which get checked in
> if's, I'd prefer to add a caps field used for bus_prot_mask, active_wakeup
> in a first patch and add the cap FORCE_WAIT in a second patch.
> 
> Can you help to implement this Sean, or shall I give it a try?
> 

Sure, I'm willing to do it; then we can see whether you like the result.

thanks!
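
A rough sketch of what such a caps field could look like, written as plain userspace C so it can be tried outside the kernel. The bit names, the struct and the helper are hypothetical placeholders, not the final driver interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the names are illustrative only. */
#define SCPD_CAP_ACTIVE_WAKEUP	(1u << 0)
#define SCPD_CAP_FWAIT_SRAM	(1u << 1)

/* Hypothetical per-domain data with one caps word instead of several bools. */
struct scp_domain_caps_example {
	const char *name;
	uint32_t caps;
};

/* True only if every bit in cap is set for this domain. */
static bool scpd_has_cap(const struct scp_domain_caps_example *d, uint32_t cap)
{
	return (d->caps & cap) == cap;
}
```

With this shape, the SRAM power-on path would test scpd_has_cap(d, SCPD_CAP_FWAIT_SRAM) to choose between polling for the ACK and the fixed wait, and a new quirk only needs a new bit rather than a new struct member.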

> Regards,
> Matthias
> 
> >  };
> >  
> >  struct scp;
> > @@ -234,11 +235,16 @@ static int scpsys_power_on(struct generic_pm_domain *genpd)
> > val &= ~scpd->data->sram_pdn_bits;
> > writel(val, ctl_addr);
> >  
> > -   /* wait until SRAM_PDN_ACK all 0 */
> > -   ret = readl_poll_timeout(ctl_addr, tmp, (tmp & pdn_ack) == 0,
> > -MTK_POLL_DELAY_US, MTK_POLL_TIMEOUT);
> > -   if (ret < 0)
> > -   goto err_pwr_ack;
> > +   /* Either wait until SRAM_PDN_ACK all 0 or have a force wait */
> > +   if (!scpd->data->us_sram_fwait) {
> > +   ret = readl_poll_timeout(ctl_addr, tmp, (tmp & pdn_ack) == 0,
> > +MTK_POLL_DELAY_US, MTK_POLL_TIMEOUT);
> > +   if (ret < 0)
> > +   goto err_pwr_ack;
> > +   } else {
> > +   usleep_range(scpd->data->us_sram_fwait,
> > +scpd->data->us_sram_fwait + 100);
> > +   };
> >  
> > if (scpd->data->bus_prot_mask) {
> > ret = mtk_infracfg_clear_bus_protection(scp->infracfg,
> > @@ -783,6 +789,7 @@ static const struct scp_domain_data scp_domain_data_mt7622[] = {
> > .clk_id = {CLK_NONE},
> > .bus_prot_mask = MT7622_TOP_AXI_PROT_EN_WB,
> > .active_wakeup = true,
> > +   .us_sram_fwait = 12000,
> > },
> >  };
> >  
> >

Re: [PATCH v1 4/7] soc: mediatek: reuse regmap_read_poll_timeout helpers

2018-04-19 Thread Sean Wang

On Thu, 2018-04-19 at 12:23 +0200, Matthias Brugger wrote:
> 
> On 04/03/2018 09:15 AM, sean.w...@mediatek.com wrote:
> > From: Sean Wang 
> > 
> > Reuse the common helpers regmap_read_poll_timeout provided by Linux core
> > instead of an open-coded handling.
> > 
> > Signed-off-by: Sean Wang 
> > Cc: Matthias Brugger 
> > Cc: Ulf Hansson 
> > Cc: Weiyi Lu 
> > ---
> >  drivers/soc/mediatek/mtk-infracfg.c | 45 
> > +
> >  1 file changed, 10 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/soc/mediatek/mtk-infracfg.c b/drivers/soc/mediatek/mtk-infracfg.c
> > index 8c310de..b849aa5 100644
> > --- a/drivers/soc/mediatek/mtk-infracfg.c
> > +++ b/drivers/soc/mediatek/mtk-infracfg.c
> > @@ -12,6 +12,7 @@
> >   */
> >  
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -37,7 +38,6 @@
> >  int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
> > bool reg_update)
> >  {
> > -   unsigned long expired;
> > u32 val;
> > int ret;
> >  
> > @@ -47,22 +47,11 @@ int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
> > else
> > regmap_write(infracfg, INFRA_TOPAXI_PROTECTEN_SET, mask);
> >  
> > -   expired = jiffies + HZ;
> > +   ret = regmap_read_poll_timeout(infracfg, INFRA_TOPAXI_PROTECTSTA1,
> > +  val, (val & mask) == mask, 10,
> > +  jiffies_to_usecs(HZ));
> 
> To align with the changes in scpsys, please define MTK_POLL_DELAY_US and
> MTK_POLL_TIMEOUT. I'm not really fan of passing macros as function arguments.
> 

Agreed. I will improve it.

thanks!
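
For illustration, here is a userspace model of a poll loop driven by the two named constants. The values mirror the literals in the patch (10 us between reads, one second in total); poll_masked, read_fn and the fake register are hypothetical helpers, and elapsed time is modelled by counting delay steps instead of sleeping. The timeout return value of -ETIMEDOUT is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MTK_POLL_DELAY_US	10
#define MTK_POLL_TIMEOUT	1000000	/* one second in microseconds */

/* Hypothetical register read callback; returns 0 on success. */
typedef int (*read_fn)(void *ctx, uint32_t *val);

/* Poll until (val & mask) == want or the modelled time budget runs out. */
static int poll_masked(read_fn rd, void *ctx, uint32_t mask, uint32_t want,
		       uint32_t *val)
{
	uint64_t waited_us = 0;
	int ret;

	for (;;) {
		ret = rd(ctx, val);
		if (ret)
			return ret;		/* propagate read errors */
		if ((*val & mask) == want)
			return 0;
		if (waited_us >= MTK_POLL_TIMEOUT)
			return -ETIMEDOUT;
		waited_us += MTK_POLL_DELAY_US;	/* stands in for a sleep */
	}
}

/* Fake register: reads 0 for reads_left reads, then ready_value. */
struct fake_reg {
	unsigned int reads_left;
	uint32_t ready_value;
};

static int fake_read(void *ctx, uint32_t *val)
{
	struct fake_reg *r = ctx;

	if (r->reads_left) {
		r->reads_left--;
		*val = 0;
	} else {
		*val = r->ready_value;
	}
	return 0;
}
```

Setting bus protection corresponds to want == mask, and clearing it corresponds to want == 0, so both functions in the patch could share the same two constants.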

> Other than that, the patch looks good.
> 
> Thanks a lot,
> Matthias
> 
> >  
> > -   while (1) {
> > -   ret = regmap_read(infracfg, INFRA_TOPAXI_PROTECTSTA1, &val);
> > -   if (ret)
> > -   return ret;
> > -
> > -   if ((val & mask) == mask)
> > -   break;
> > -
> > -   cpu_relax();
> > -   if (time_after(jiffies, expired))
> > -   return -EIO;
> > -   }
> > -
> > -   return 0;
> > +   return ret;
> >  }
> >  
> >  /**
> > @@ -80,30 +69,16 @@ int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
> >  int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask,
> > bool reg_update)
> >  {
> > -   unsigned long expired;
> > int ret;
> > +   u32 val;
> >  
> > if (reg_update)
> > regmap_update_bits(infracfg, INFRA_TOPAXI_PROTECTEN, mask, 0);
> > else
> > regmap_write(infracfg, INFRA_TOPAXI_PROTECTEN_CLR, mask);
> >  
> > -   expired = jiffies + HZ;
> > -
> > -   while (1) {
> > -   u32 val;
> > -
> > -   ret = regmap_read(infracfg, INFRA_TOPAXI_PROTECTSTA1, &val);
> > -   if (ret)
> > -   return ret;
> > -
> > -   if (!(val & mask))
> > -   break;
> > -
> > -   cpu_relax();
> > -   if (time_after(jiffies, expired))
> > -   return -EIO;
> > -   }
> > -
> > -   return 0;
> > +   ret = regmap_read_poll_timeout(infracfg, INFRA_TOPAXI_PROTECTSTA1,
> > +  val, !(val & mask), 10,
> > +  jiffies_to_usecs(HZ));
> > +   return ret;
> >  }
> >

Re: [PATCH] f2fs: sepearte hot/cold in free nid

2018-04-19 Thread Jaegeuk Kim

On 04/20, Chao Yu wrote:
> As most indirect node, dindirect node, and xattr node won't be updated
> after they are created, but inode node and other direct node will change
> more frequently, so store their nat entries mixedly in whole nat table
> will suffer:
> - fragment nat table soon due to different update rate
> - more nat block update due to fragmented nat table
> 
> In order to solve above issue, we're trying to separate whole nat table to
> two part:
> a. Hot free nid area:
>  - range: [nid #0, nid #x)
>  - store node block address for
>* inode node
>* other direct node
> b. Cold free nid area:
>  - range: [nid #x, max nid)
>  - store node block address for
>* indirect node
>* dindirect node
>* xattr node
> 
> Allocation strategy example:
> 
> Free nid: '-'
> Used nid: '='
> 
> 1. Initial status:
> Free Nids:    |---|
>               ^   ^   ^   ^
> Alloc Range:  |---|   |---|
>               hot_start   hot_end   cold_start  cold_end
> 
> 2. Free nids have run out:
> Free Nids:    |===-===|
>               ^   ^   ^   ^
> Alloc Range:  |===|   |===|
>               hot_start   hot_end   cold_start  cold_end
> 
> 3. Expand hot/cold area range:
> Free Nids:    |===-===|
>               ^   ^   ^   ^
> Alloc Range:  |===|   |===|
>               hot_start   hot_end cold_start  cold_end
> 
> 4. Hot free nids have run out:
> Free Nids:    |===-===|
>               ^   ^   ^   ^
> Alloc Range:  |===|   |===|
>               hot_start   hot_end cold_start  cold_end
> 
> 5. Expand hot area range, hot/cold area boundary has been fixed:
> Free Nids:    |===-===|
>               ^   ^   ^
> Alloc Range:  |===|===|
>               hot_start   hot_end(cold_start)   cold_end
> 
> Run xfstests with generic/*:
> 
> before:
> node_write:   169660
> cp_count:     60118
> node/cp:      2.82
> 
> after:
> node_write:   159145
> cp_count:     84501
> node/cp:      2.64

Nice try, though I don't see much benefit from this huge patch. I guess we may
be able to find a more efficient way to address this issue rather than changing
the whole stable code.

How about getting a free nid in the list from head or tail separately?
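
One way to read that suggestion, as a minimal userspace sketch: keep a single free-nid list and let hot allocations take from the head while cold allocations take from the tail. The array-backed list and the function names are hypothetical and only model the policy, not the f2fs free nid list itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t nid_t;

/* Hypothetical free-nid list, modelled as a window over a sorted array. */
struct free_nid_list {
	const nid_t *nids;	/* free nids in ascending order */
	size_t head;		/* index of the first unused entry */
	size_t tail;		/* one past the last unused entry */
};

/* Hot nodes (inode, direct node) take the lowest free nid. */
static bool take_hot(struct free_nid_list *l, nid_t *out)
{
	if (l->head == l->tail)
		return false;
	*out = l->nids[l->head++];
	return true;
}

/* Cold nodes (indirect, dindirect, xattr) take the highest free nid. */
static bool take_cold(struct free_nid_list *l, nid_t *out)
{
	if (l->head == l->tail)
		return false;
	*out = l->nids[--l->tail];
	return true;
}
```

This keeps the existing single list and needs no boundary bookkeeping, at the cost of giving no guarantee about which nat block a given free nid belongs to once the table is fragmented.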

> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/checkpoint.c |   4 -
>  fs/f2fs/debug.c  |   6 +-
>  fs/f2fs/f2fs.h   |  19 +++-
>  fs/f2fs/inode.c  |   2 +-
>  fs/f2fs/namei.c  |   2 +-
>  fs/f2fs/node.c   | 302 
> ---
>  fs/f2fs/node.h   |  17 +--
>  fs/f2fs/segment.c|   8 +-
>  fs/f2fs/shrinker.c   |   3 +-
>  fs/f2fs/xattr.c  |  10 +-
>  10 files changed, 221 insertions(+), 152 deletions(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 96785ffc6181..c17feec72c74 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1029,14 +1029,10 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
>  static void __prepare_cp_block(struct f2fs_sb_info *sbi)
>  {
>   struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
> - struct f2fs_nm_info *nm_i = NM_I(sbi);
> - nid_t last_nid = nm_i->next_scan_nid;
>  
> - next_free_nid(sbi, &last_nid);
>   ckpt->valid_block_count = cpu_to_le64(valid_user_blocks(sbi));
>   ckpt->valid_node_count = cpu_to_le32(valid_node_count(sbi));
>   ckpt->valid_inode_count = cpu_to_le32(valid_inode_count(sbi));
> - ckpt->next_free_nid = cpu_to_le32(last_nid);
>  }
>  
>  /*
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index 7bb036a3bb81..b13c1d4f110f 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -100,7 +100,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
>   si->dirty_nats = NM_I(sbi)->dirty_nat_cnt;
>   si->sits = MAIN_SEGS(sbi);
>   si->dirty_sits = SIT_I(sbi)->dirty_sentries;
> - si->free_nids = NM_I(sbi)->n

Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers

On Thu, Apr 19, 2018 at 07:44:40PM -0700, Eric Biggers wrote:
> On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> > On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> > 
> > > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > > without corresponding fill_super() is safe. We have so far crashed with
> > > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> > 
> > Consider the case when fill_super() returns an error immediately.
> > It is exactly the same situation.  And ->kill_sb() *is* called in cases
> > when fill_super() has failed.  Always had been - it's much less boilerplate
> > that way.
> > 
> > deactivate_locked_super() on that failure exit is the least painful
> > variant, unfortunately.
> > 
> > Filesystems with ->kill_sb() instances that rely upon something
> > done between sget() and the first failure exit after it need to be fixed.
> > And yes, that should've been spotted back then.  Sorry.
> > 
> > Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> > are safe and those are the majority.  Looking through the rest uncovers
> > some bugs; so far all I've seen were already there.  Note that normally
> > we have something like
> > > static void affs_kill_sb(struct super_block *sb)
> > > {
> > > 	struct affs_sb_info *sbi = AFFS_SB(sb);
> > > 	kill_block_super(sb);
> > > 	if (sbi) {
> > > 		affs_free_bitmap(sb);
> > > 		affs_brelse(sbi->s_root_bh);
> > > 		kfree(sbi->s_prefix);
> > > 		mutex_destroy(&sbi->s_bmlock);
> > > 		kfree(sbi);
> > > 	}
> > > }
> > which basically does one of the safe ones augmented with something that
> > takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> > Not everyone does, though:
> > 
> > jffs2_fill_super():
> > c = kzalloc(sizeof(*c), GFP_KERNEL);
> > if (!c)
> > return -ENOMEM;
> > in the very beginning.  So we can return from it with NULL ->s_fs_info.
> > Now, consider
> > struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> > if (!(sb->s_flags & MS_RDONLY))
> > jffs2_stop_garbage_collect_thread(c);
> > in jffs2_kill_sb() and
> > void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> > {
> > int wait = 0;
> > spin_lock(&c->erase_completion_lock);
> > if (c->gc_task) {
> > 
> > IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> > and eat an oops.  Always had been there, always hard to hit without
> > fault injectors and fortunately trivial to fix.
> > 
> > Similar in nfs_kill_super() calling nfs_free_server().
> > > Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close()
> > > calls.
> > > Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(),
> > > cifs_kill_sb() (all trivial to fix)
> > 
> > Aha... nfsd_umount() is a new regression.
> > 
> > orangefs: old, trivial to fix.
> > 
> > > cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
> > > can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had
> > > been there.
> > 
> > > AFAICS, after discarding the instances that do the right thing we are left
> > > with: hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super,
> > > btrfs_kill_super, cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount,
> > > orangefs_kill_sb, proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> > > 
> > > Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are
> > > regressions.  So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue
> > > you've spotted
> > > in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> > catching those - they are also dependent upon a specific allocation failing
> > at the right time).
> > 
> 
> Fix for the kernfs bug is now queued in vfs/for-linus:
> 
> #syz fix: kernfs: deal with early sget() failures
> 

But, there is still a related bug: when mounting sysfs, if register_shrinker()
fails in sget_userns(), then kernfs_kill_sb() gets called, which frees the
'struct kernfs_super_info'.  But, the 'struct kernfs_super_info' is also freed
in kernfs_mount_ns() by:

	sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags,
			 &init_user_ns, info);
	if (IS_ERR(sb) || sb->s_fs_info != info)
		kfree(info);
	if (IS_ERR(sb))
		return ERR_CAST(sb);

I guess the problem is that sget_userns() shouldn't take ownership of the 'info'
if it returns an error -- but, it actually does if register_shrinker() fails,
resulting in a double free.
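
To make the ownership problem concrete, here is a small user-space model of
the two paths. It is only a sketch under stated assumptions: sb_model,
info_model, sget_model() and mount_model() are hypothetical stand-ins for
the kernel code, "released" counts how often the info would have been
kfree()d, and the caller_keeps_on_error switch is one possible way to avoid
the problem, not necessarily the fix that will be applied:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct kernfs_super_info and struct super_block. */
struct info_model {
	int released;		/* number of times the info was "freed" */
};

struct sb_model {
	struct info_model *s_fs_info;
};

static void release_info(struct info_model *info)
{
	assert(info);
	info->released++;	/* models kfree(info) without really freeing */
}

/* Models ->kill_sb(): frees whatever s_fs_info points to. */
static void kill_sb_model(struct sb_model *sb)
{
	if (sb->s_fs_info)
		release_info(sb->s_fs_info);
	sb->s_fs_info = NULL;
}

/*
 * Models sget_userns(): the set() callback stores info in the superblock,
 * then a later step (register_shrinker()) may fail.
 * Returns 0 on success, -1 on failure.
 */
static int sget_model(struct sb_model *sb, struct info_model *info,
		      int shrinker_fails, int caller_keeps_on_error)
{
	sb->s_fs_info = info;
	if (shrinker_fails) {
		if (caller_keeps_on_error)
			sb->s_fs_info = NULL;	/* hand ownership back first */
		kill_sb_model(sb);
		return -1;
	}
	return 0;
}

/* Models the caller: frees info on error or if another sb was reused. */
static int mount_model(struct info_model *info, int shrinker_fails,
		       int caller_keeps_on_error)
{
	struct sb_model sb = { NULL };
	int err = sget_model(&sb, info, shrinker_fails, caller_keeps_on_error);

	if (err || sb.s_fs_info != info)
		release_info(info);
	return err;
}
```

In this model, a failing shrinker registration without the switch leaves
released at 2 (the double free), while handing ownership back before the
teardown leaves it at 1.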

Here is a reproducer and the KASAN splat.  This is on Linus' tree (87ef12027b9b)
with vfs/for-linus merged in.

#define _GNU_SOURCE
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main()
{
int fd, i;
char buf[16];

unshare(CLO

[PATCH] ACPI / scan: Fix regression related to X-Gene UARTs

2018-04-19 Thread Mark Salter

Commit e361d1f85855 ("ACPI / scan: Fix enumeration for special UART
devices") caused a regression on some X-Gene based platforms (Mustang
and M400) with an invalid DSDT. The DSDT makes it appear that the UART
device is also a slave device attached to itself. With the above commit,
the UART won't be enumerated by ACPI scan (slave serial devices shouldn't
be). So check for the X-Gene UART device and skip the slave device check on it.

Signed-off-by: Mark Salter 
---
 drivers/acpi/scan.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index cc234e6a6297..1dcdd0122862 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1551,6 +1551,14 @@ static bool acpi_device_enumeration_by_parent(struct acpi_device *device)
 fwnode_property_present(&device->fwnode, "baud")))
return true;
 
+   /*
+* Firmware on some arm64 X-Gene platforms will make the UART
+* device appear as both a UART and a slave of that UART. Just
+* bail out here for X-Gene UARTs.
+*/
+   if (!strcmp(acpi_device_hid(device), "APMC0D08"))
+   return false;
+
INIT_LIST_HEAD(&resource_list);
acpi_dev_get_resources(device, &resource_list,
   acpi_check_serial_bus_slave,
-- 
2.14.3

Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write

2018-04-19 Thread Jaegeuk Kim

On 04/20, Chao Yu wrote:
> On 2018/4/20 11:12, Jaegeuk Kim wrote:
> > On 04/18, Chao Yu wrote:
> > >> f2fs doesn't allow abuse of the atomic write class interface, so besides
> > >> limiting the total memory usage of in-mem pages, we need to limit
> > >> atomic-write usage as well when the filesystem is seriously fragmented,
> > >> otherwise we may run into an infinite loop during foreground GC because
> > >> target blocks in the victim segment belong to a file that has been
> > >> atomic-opened for a long time.
> > >>
> > >> Now, we will detect failures due to atomic write in foreground GC; if
> > >> the count exceeds a threshold, we will drop all atomic written data in
> > >> the cache. By this, I expect it can keep our system running safely and
> > >> prevent a DoS attack.
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/f2fs.h|  1 +
> >>  fs/f2fs/file.c|  5 +
> >>  fs/f2fs/gc.c  | 27 +++
> >>  fs/f2fs/gc.h  |  3 +++
> >>  fs/f2fs/segment.c |  1 +
> >>  fs/f2fs/segment.h |  2 ++
> >>  6 files changed, 35 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index c1c3a1d11186..3453288d6a71 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -2249,6 +2249,7 @@ enum {
> >>FI_EXTRA_ATTR,  /* indicate file has extra attribute */
> >>FI_PROJ_INHERIT,/* indicate file inherits projectid */
> >>FI_PIN_FILE,/* indicate file should not be gced */
> >> +  FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been 
> >> dropped */
> >>  };
> >>  
> >>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> >> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >> index 7c90ded5a431..cddd9aee1bb2 100644
> >> --- a/fs/f2fs/file.c
> >> +++ b/fs/f2fs/file.c
> >> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file 
> >> *filp)
> >>  skip_flush:
> >>set_inode_flag(inode, FI_HOT_DATA);
> >>set_inode_flag(inode, FI_ATOMIC_FILE);
> >> +  clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >>f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
> >>  
> >>F2FS_I(inode)->inmem_task = current;
> >> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file 
> >> *filp)
> >>ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
> >>}
> >>  err_out:
> >> +  if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
> >> +  clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> >> +  ret = -EINVAL;
> >> +  }
> >>up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
> >>inode_unlock(inode);
> >>mnt_drop_write_file(filp);
> >> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >> index bfb7a4a3a929..495876ca62b6 100644
> >> --- a/fs/f2fs/gc.c
> >> +++ b/fs/f2fs/gc.c
> >> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
> >>gc_th->gc_urgent = 0;
> >>gc_th->gc_wake= 0;
> >>  
> >> +  gc_th->atomic_file = 0;
> >> +
> >>sbi->gc_thread = gc_th;
> >>init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> >>sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> >> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct 
> >> f2fs_summary *sum,
> >>   * This can be used to move blocks, aka LBAs, directly on disk.
> >>   */
> >>  static void move_data_block(struct inode *inode, block_t bidx,
> >> -  unsigned int segno, int off)
> >> +  int gc_type, unsigned int segno, int off)
> >>  {
> >>struct f2fs_io_info fio = {
> >>.sbi = F2FS_I_SB(inode),
> >> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, 
> >> block_t bidx,
> >>if (!check_valid_map(F2FS_I_SB(inode), segno, off))
> >>goto out;
> >>  
> >> -  if (f2fs_is_atomic_file(inode))
> >> +  if (f2fs_is_atomic_file(inode)) {
> >> +  F2FS_I_SB(inode)->gc_thread->atomic_file++;
> >>goto out;
> >> +  }
> >>  
> >>if (f2fs_is_pinned_file(inode)) {
> >>f2fs_pin_file_control(inode, true);
> >> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, 
> >> block_t bidx, int gc_type,
> >>if (!check_valid_map(F2FS_I_SB(inode), segno, off))
> >>goto out;
> >>  
> >> -  if (f2fs_is_atomic_file(inode))
> >> +  if (f2fs_is_atomic_file(inode)) {
> >> +  F2FS_I_SB(inode)->gc_thread->atomic_file++;
> >>goto out;
> >> +  }
> >>if (f2fs_is_pinned_file(inode)) {
> >>if (gc_type == FG_GC)
> >>f2fs_pin_file_control(inode, true);
> >> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, 
> >> struct f2fs_summary *sum,
> >>start_bidx = start_bidx_of_node(nofs, inode)
> >>+ ofs_in_node;
> >>if (f2fs_encrypted_file(inode))
> >> -  move_data_block(inode, start_bidx, segno, off);
> >> +  move_da

Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer

2018-04-19 Thread Chao Yu

On 2018/4/20 11:19, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> Thread A               Thread B                       Thread C
>> - f2fs_remount
>>  - stop_gc_thread
>>                        - f2fs_sbi_store
>>                                                       - issue_discard_thread
>>    sbi->gc_thread = NULL;
>>                          sbi->gc_thread->gc_wake = 1
>>                                                         access sbi->gc_thread->gc_urgent
> 
> Do we simply need a lock for this?

The code would become more complicated if every existing and newly added field
reached through the sbi->gc_thread pointer had to be handled under a lock, and
it would add unneeded lock overhead, right?

So let's just allocate memory during fill_super?

Thanks,

> 
>>
>> Previously, we allocated memory for sbi->gc_thread based on the background
>> gc thread mount option, and the memory could be released if we turned off
>> that mount option, but there are still several places that access the
>> gc_thread pointer without considering the race condition, resulting in a
>> NULL pointer dereference.
>>
>> In order to fix this issue, keep gc_thread structure valid in sbi all
>> the time instead of alloc/free it dynamically.
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/debug.c   |  3 +--
>>  fs/f2fs/f2fs.h|  7 +++
>>  fs/f2fs/gc.c  | 58 
>> +--
>>  fs/f2fs/segment.c |  4 ++--
>>  fs/f2fs/super.c   | 13 +++--
>>  fs/f2fs/sysfs.c   |  8 
>>  6 files changed, 60 insertions(+), 33 deletions(-)
>>
>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>> index 715beb85e9db..7bb036a3bb81 100644
>> --- a/fs/f2fs/debug.c
>> +++ b/fs/f2fs/debug.c
>> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
>>  si->cache_mem = 0;
>>  
>>  /* build gc */
>> -if (sbi->gc_thread)
>> -si->cache_mem += sizeof(struct f2fs_gc_kthread);
>> +si->cache_mem += sizeof(struct f2fs_gc_kthread);
>>  
>>  /* build merge flush thread */
>>  if (SM_I(sbi)->fcc_info)
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 567c6bb57ae3..c553f63199e8 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct 
>> f2fs_sb_info *sbi)
>>  return (struct sit_info *)(SM_I(sbi)->sit_info);
>>  }
>>  
>> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
>> +{
>> +return (struct f2fs_gc_kthread *)(sbi->gc_thread);
>> +}
>> +
>>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
>>  {
>>  return (struct free_segmap_info *)(SM_I(sbi)->free_info);
>> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t 
>> pos, size_t len);
>>  /*
>>   * gc.c
>>   */
>> +int init_gc_context(struct f2fs_sb_info *sbi);
>> +void destroy_gc_context(struct f2fs_sb_info * sbi);
>>  int start_gc_thread(struct f2fs_sb_info *sbi);
>>  void stop_gc_thread(struct f2fs_sb_info *sbi);
>>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index da89ca16a55d..7d310e454b77 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -26,8 +26,8 @@
>>  static int gc_thread_func(void *data)
>>  {
>>  struct f2fs_sb_info *sbi = data;
>> -struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
>> -wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
>> +struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>> +wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
>>  unsigned int wait_ms;
>>  
>>  wait_ms = gc_th->min_sleep_time;
>> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
>>  return 0;
>>  }
>>  
>> -int start_gc_thread(struct f2fs_sb_info *sbi)
>> +int init_gc_context(struct f2fs_sb_info *sbi)
>>  {
>>  struct f2fs_gc_kthread *gc_th;
>> -dev_t dev = sbi->sb->s_bdev->bd_dev;
>> -int err = 0;
>>  
>>  gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
>> -if (!gc_th) {
>> -err = -ENOMEM;
>> -goto out;
>> -}
>> +if (!gc_th)
>> +return -ENOMEM;
>> +
>> +gc_th->f2fs_gc_task = NULL;
>>  
>>  gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
>>  gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
>> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  gc_th->atomic_file[FG_GC] = 0;
>>  
>>  sbi->gc_thread = gc_th;
>> -init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>> -sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>> +
>> +return 0;
>> +}
>> +
>> +void destroy_gc_context(struct f2fs_sb_info *sbi)
>> +{
>> +kfree(GC_I(sbi));
>> +sbi->gc_thread = NULL;
>> +}
>> +
>> +int start_gc_thread(struct f2fs_sb_info *sbi)
>> +{
>> +struct f2fs_gc_kthread *gc_th = GC_I(sbi);
>> +dev_t dev = sbi->sb->s_bdev->bd_dev;
>> +int err = 0;
>> +
>> +init_waitqueue_head(&gc_th->gc_wait_queue_head);
>> +gc_th->f2fs_gc_task

Re: [RFC] vhost: introduce mdev based hardware vhost backend

2018-04-19 Thread Tiwei Bie

On Thu, Apr 19, 2018 at 09:40:23PM +0300, Michael S. Tsirkin wrote:
> On Tue, Apr 10, 2018 at 03:25:45PM +0800, Jason Wang wrote:
> > > > > One problem is that, different virtio ring compatible devices
> > > > > may have different device interfaces. That is to say, we will
> > > > > need different drivers in QEMU. It could be troublesome. And
> > > > > that's what this patch trying to fix. The idea behind this
> > > > > patch is very simple: mdev is a standard way to emulate device
> > > > > in kernel.
> > > > > So you just move the abstraction layer from qemu to kernel, and you
> > > > > still need different drivers in kernel for different device interfaces
> > > > > of accelerators. This looks even more complex than leaving it in qemu.
> > > > > As you said, another idea is to implement userspace vhost backend for
> > > > > accelerators which seems easier and could co-work with other parts of
> > > > > qemu without inventing new type of messages.
> > > I'm not quite sure. Do you think it's acceptable to
> > > add various vendor specific hardware drivers in QEMU?
> > > 
> > 
> > I don't object but we need to figure out the advantages of doing it in qemu
> > too.
> > 
> > Thanks
> 
> To be frank kernel is exactly where device drivers belong.  DPDK did
> move them to userspace but that's merely a requirement for data path.
> *If* you can have them in kernel that is best:
> - update kernel and there's no need to rebuild userspace
> - apps can be written in any language no need to maintain multiple
>   libraries or add wrappers
> - security concerns are much smaller (ok people are trying to
>   raise the bar with IOMMUs and such, but it's already pretty
>   good even without)
> 
> The biggest issue is that you let userspace poke at the
> device which is also allowed by the IOMMU to poke at
> kernel memory (needed for kernel driver to work).

I think the device won't and shouldn't be allowed to
poke at kernel memory. Its kernel driver needs some
kernel memory to work, but the device doesn't have
access to it. Instead, the device only has
access to:

(1) the entire memory of the VM (if vIOMMU isn't used)
or
(2) the memory belonging to the guest virtio device (if
vIOMMU is being used).

Below is the reason:

For the first case, we should program the IOMMU for
the hardware device based on the info in the memory
table which is the entire memory of the VM.

For the second case, we should program the IOMMU for
the hardware device based on the info in the shadow
page table of the vIOMMU.

So the memory that can be accessed by the device is limited;
it should be safe, especially in the second case.
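
The restriction can be modelled in a few lines. This is only a toy
sketch, not VFIO or IOMMU driver code: struct mem_region and dma_allowed()
are hypothetical names, and the table stands in for whatever was programmed
into the IOMMU (the VM memory table in case (1), or the vIOMMU shadow page
table in case (2)). A device access is accepted only if it lies entirely
inside one mapped region:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of the table programmed into the IOMMU. */
struct mem_region {
	uint64_t iova;	/* start address as seen by the device */
	uint64_t size;	/* length of the mapping in bytes */
};

/*
 * Return true if the device access [addr, addr + len) lies completely
 * inside a single mapped region. The comparison is written so that it
 * cannot overflow for large addr or len values.
 */
static bool dma_allowed(const struct mem_region *tbl, size_t n,
			uint64_t addr, uint64_t len)
{
	size_t i;

	assert(tbl || n == 0);
	for (i = 0; i < n; i++) {
		if (len <= tbl[i].size &&
		    addr >= tbl[i].iova &&
		    addr - tbl[i].iova <= tbl[i].size - len)
			return true;
	}
	return false;
}
```

Anything outside the table, kernel memory included, is rejected in this
model, which is the property argued for above.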

My concern is that, in this RFC, we don't program the
IOMMU for the mdev device in userspace via the VFIO
API directly. Instead, we pass the memory table to the
kernel driver via the mdev device (BAR0) and ask the
driver to do the IOMMU programming. Some people may not
like it. The main reason why we don't program the IOMMU via
the VFIO API in userspace directly is that, currently, IOMMU
drivers don't support the mdev bus.

> 
> Yes, maybe if device is not buggy it's all fine, but
> it's better if we do not have to trust the device
> otherwise the security picture becomes more murky.
> 
> I suggested attaching a PASID to (some) queues - see my old post "using
> PASIDs to enable a safe variant of direct ring access".

It's pretty cool. We also have some similar ideas.
Cunming will talk more about this.

Best regards,
Tiwei Bie

> 
> Then using IOMMU with VFIO to limit access through queue to correct
> ranges of memory.
> 
> 
> -- 
> MST

Re: [PATCH 4/5] f2fs: show GC failure info in debugfs

2018-04-19 Thread Chao Yu

On 2018/4/20 11:15, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> This patch shows GC failure information in debugfs; for now it just
>> shows the count of failures caused by atomic write.
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/debug.c |  5 +
>>  fs/f2fs/f2fs.h  |  1 +
>>  fs/f2fs/gc.c| 13 +++--
>>  fs/f2fs/gc.h|  2 +-
>>  4 files changed, 14 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>> index a66107b5cfff..715beb85e9db 100644
>> --- a/fs/f2fs/debug.c
>> +++ b/fs/f2fs/debug.c
>> @@ -104,6 +104,8 @@ static void update_general_status(struct f2fs_sb_info 
>> *sbi)
>>  si->avail_nids = NM_I(sbi)->available_nids;
>>  si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
>>  si->bg_gc = sbi->bg_gc;
>> +si->bg_atomic = sbi->gc_thread->atomic_file[BG_GC];
>> +si->fg_atomic = sbi->gc_thread->atomic_file[FG_GC];
> 
> Need to change the naming like skipped_atomic_files?

OK

> 
>>  si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
>>  * 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
>>  / 2;
>> @@ -342,6 +344,9 @@ static int stat_show(struct seq_file *s, void *v)
>>  si->bg_data_blks);
>>  seq_printf(s, "  - node blocks : %d (%d)\n", si->node_blks,
>>  si->bg_node_blks);
>> +seq_printf(s, "Failure : atomic write %d (%d)\n",
> 
> It's not failure.

Alright... just skip..

> 
>> +si->bg_atomic + si->fg_atomic,
>> +si->bg_atomic);
>>  seq_puts(s, "\nExtent Cache:\n");
>>  seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
>>  si->hit_largest, si->hit_cached,
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 3453288d6a71..567c6bb57ae3 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -3003,6 +3003,7 @@ struct f2fs_stat_info {
>>  int bg_node_segs, bg_data_segs;
>>  int tot_blks, data_blks, node_blks;
>>  int bg_data_blks, bg_node_blks;
>> +unsigned int bg_atomic, fg_atomic;
>>  int curseg[NR_CURSEG_TYPE];
>>  int cursec[NR_CURSEG_TYPE];
>>  int curzone[NR_CURSEG_TYPE];
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index 495876ca62b6..da89ca16a55d 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -135,7 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  gc_th->gc_urgent = 0;
>>  gc_th->gc_wake= 0;
>>  
>> -gc_th->atomic_file = 0;
>> +gc_th->atomic_file[BG_GC] = 0;
>> +gc_th->atomic_file[FG_GC] = 0;
> 
> Need to merge the previous patch with this.

Let me merge them. :)

Thanks,

> 
>>  
>>  sbi->gc_thread = gc_th;
>>  init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>> @@ -633,7 +634,7 @@ static void move_data_block(struct inode *inode, block_t 
>> bidx,
>>  goto out;
>>  
>>  if (f2fs_is_atomic_file(inode)) {
>> -F2FS_I_SB(inode)->gc_thread->atomic_file++;
>> +F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>>  goto out;
>>  }
>>  
>> @@ -742,7 +743,7 @@ static void move_data_page(struct inode *inode, block_t 
>> bidx, int gc_type,
>>  goto out;
>>  
>>  if (f2fs_is_atomic_file(inode)) {
>> -F2FS_I_SB(inode)->gc_thread->atomic_file++;
>> +F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>>  goto out;
>>  }
>>  if (f2fs_is_pinned_file(inode)) {
>> @@ -1024,7 +1025,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  .ilist = LIST_HEAD_INIT(gc_list.ilist),
>>  .iroot = RADIX_TREE_INIT(GFP_NOFS),
>>  };
>> -unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
>> +unsigned int last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>>  unsigned int skipped_round = 0, round = 0;
>>  
>>  trace_f2fs_gc_begin(sbi->sb, sync, background,
>> @@ -1078,9 +1079,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>>  total_freed += seg_freed;
>>  
>>  if (gc_type == FG_GC) {
>> -if (sbi->gc_thread->atomic_file > last_atomic_file)
>> +if (sbi->gc_thread->atomic_file[FG_GC] > last_atomic_file)
>>  skipped_round++;
>> -last_atomic_file = sbi->gc_thread->atomic_file;
>> +last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>>  round++;
>>  }
>>  
>> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
>> index bc1d21d46ae7..a6cffe6b249b 100644
>> --- a/fs/f2fs/gc.h
>> +++ b/fs/f2fs/gc.h
>> @@ -41,7 +41,7 @@ struct f2fs_gc_kthread {
>>  unsigned int gc_wake;
>>  
>>  /* for stuck statistic */
>> -unsigned int atomic_file;
>> +unsigned int atomic_file[2];
>>  };
>>  
>>  struct gc_inode_list {
>> -- 
>> 2.15.0.55.gc2ece9dc4de6
> 
> .
>

Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write

2018-04-19 Thread Chao Yu

On 2018/4/20 11:12, Jaegeuk Kim wrote:
> On 04/18, Chao Yu wrote:
>> f2fs doesn't allow abuse of the atomic write class interface, so besides
>> limiting the total memory usage of in-mem pages, we need to limit
>> atomic-write usage as well when the filesystem is seriously fragmented,
>> otherwise we may run into an infinite loop during foreground GC because
>> target blocks in the victim segment belong to a file that has been
>> atomic-opened for a long time.
>>
>> Now, we will detect failures due to atomic write in foreground GC; if
>> the count exceeds a threshold, we will drop all atomic written data in
>> the cache. By this, I expect it can keep our system running safely and
>> prevent a DoS attack.
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/f2fs.h|  1 +
>>  fs/f2fs/file.c|  5 +
>>  fs/f2fs/gc.c  | 27 +++
>>  fs/f2fs/gc.h  |  3 +++
>>  fs/f2fs/segment.c |  1 +
>>  fs/f2fs/segment.h |  2 ++
>>  6 files changed, 35 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index c1c3a1d11186..3453288d6a71 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -2249,6 +2249,7 @@ enum {
>>  FI_EXTRA_ATTR,  /* indicate file has extra attribute */
>>  FI_PROJ_INHERIT,/* indicate file inherits projectid */
>>  FI_PIN_FILE,/* indicate file should not be gced */
>> +FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been 
>> dropped */
>>  };
>>  
>>  static inline void __mark_inode_dirty_flag(struct inode *inode,
>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>> index 7c90ded5a431..cddd9aee1bb2 100644
>> --- a/fs/f2fs/file.c
>> +++ b/fs/f2fs/file.c
>> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file 
>> *filp)
>>  skip_flush:
>>  set_inode_flag(inode, FI_HOT_DATA);
>>  set_inode_flag(inode, FI_ATOMIC_FILE);
>> +clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>>  f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>>  
>>  F2FS_I(inode)->inmem_task = current;
>> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file 
>> *filp)
>>  ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>>  }
>>  err_out:
>> +if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
>> +clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>> +ret = -EINVAL;
>> +}
>>  up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>>  inode_unlock(inode);
>>  mnt_drop_write_file(filp);
>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>> index bfb7a4a3a929..495876ca62b6 100644
>> --- a/fs/f2fs/gc.c
>> +++ b/fs/f2fs/gc.c
>> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>>  gc_th->gc_urgent = 0;
>>  gc_th->gc_wake= 0;
>>  
>> +gc_th->atomic_file = 0;
>> +
>>  sbi->gc_thread = gc_th;
>>  init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>>  sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct 
>> f2fs_summary *sum,
>>   * This can be used to move blocks, aka LBAs, directly on disk.
>>   */
>>  static void move_data_block(struct inode *inode, block_t bidx,
>> -unsigned int segno, int off)
>> +int gc_type, unsigned int segno, int off)
>>  {
>>  struct f2fs_io_info fio = {
>>  .sbi = F2FS_I_SB(inode),
>> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, 
>> block_t bidx,
>>  if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>  goto out;
>>  
>> -if (f2fs_is_atomic_file(inode))
>> +if (f2fs_is_atomic_file(inode)) {
>> +F2FS_I_SB(inode)->gc_thread->atomic_file++;
>>  goto out;
>> +}
>>  
>>  if (f2fs_is_pinned_file(inode)) {
>>  f2fs_pin_file_control(inode, true);
>> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t 
>> bidx, int gc_type,
>>  if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>  goto out;
>>  
>> -if (f2fs_is_atomic_file(inode))
>> +if (f2fs_is_atomic_file(inode)) {
>> +F2FS_I_SB(inode)->gc_thread->atomic_file++;
>>  goto out;
>> +}
>>  if (f2fs_is_pinned_file(inode)) {
>>  if (gc_type == FG_GC)
>>  f2fs_pin_file_control(inode, true);
>> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, 
>> struct f2fs_summary *sum,
>>  start_bidx = start_bidx_of_node(nofs, inode)
>>  + ofs_in_node;
>>  if (f2fs_encrypted_file(inode))
>> -move_data_block(inode, start_bidx, segno, off);
>> +move_data_block(inode, start_bidx, gc_type,
>> +segno, off);
>>  else
>>

Re: [PATCH 5/5] f2fs: fix to avoid race during access gc_thread pointer

2018-04-19 Thread Jaegeuk Kim

On 04/18, Chao Yu wrote:
> Thread A               Thread B                       Thread C
> - f2fs_remount
>  - stop_gc_thread
>                        - f2fs_sbi_store
>                                                       - issue_discard_thread
>    sbi->gc_thread = NULL;
>                          sbi->gc_thread->gc_wake = 1
>                                                         access sbi->gc_thread->gc_urgent

Do we simply need a lock for this?

> 
> Previously, we allocated memory for sbi->gc_thread based on the background
> gc thread mount option, and the memory could be released if we turned off
> that mount option, but there are still several places that access the
> gc_thread pointer without considering the race condition, resulting in a
> NULL pointer dereference.
> 
> In order to fix this issue, keep gc_thread structure valid in sbi all
> the time instead of alloc/free it dynamically.
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/debug.c   |  3 +--
>  fs/f2fs/f2fs.h|  7 +++
>  fs/f2fs/gc.c  | 58 
> +--
>  fs/f2fs/segment.c |  4 ++--
>  fs/f2fs/super.c   | 13 +++--
>  fs/f2fs/sysfs.c   |  8 
>  6 files changed, 60 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index 715beb85e9db..7bb036a3bb81 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -223,8 +223,7 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
>   si->cache_mem = 0;
>  
>   /* build gc */
> - if (sbi->gc_thread)
> - si->cache_mem += sizeof(struct f2fs_gc_kthread);
> + si->cache_mem += sizeof(struct f2fs_gc_kthread);
>  
>   /* build merge flush thread */
>   if (SM_I(sbi)->fcc_info)
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 567c6bb57ae3..c553f63199e8 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1412,6 +1412,11 @@ static inline struct sit_info *SIT_I(struct 
> f2fs_sb_info *sbi)
>   return (struct sit_info *)(SM_I(sbi)->sit_info);
>  }
>  
> +static inline struct f2fs_gc_kthread *GC_I(struct f2fs_sb_info *sbi)
> +{
> + return (struct f2fs_gc_kthread *)(sbi->gc_thread);
> +}
> +
>  static inline struct free_segmap_info *FREE_I(struct f2fs_sb_info *sbi)
>  {
>   return (struct free_segmap_info *)(SM_I(sbi)->free_info);
> @@ -2954,6 +2959,8 @@ bool f2fs_overwrite_io(struct inode *inode, loff_t pos, 
> size_t len);
>  /*
>   * gc.c
>   */
> +int init_gc_context(struct f2fs_sb_info *sbi);
> +void destroy_gc_context(struct f2fs_sb_info * sbi);
>  int start_gc_thread(struct f2fs_sb_info *sbi);
>  void stop_gc_thread(struct f2fs_sb_info *sbi);
>  block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index da89ca16a55d..7d310e454b77 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -26,8 +26,8 @@
>  static int gc_thread_func(void *data)
>  {
>   struct f2fs_sb_info *sbi = data;
> - struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
> - wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
> + struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> + wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
>   unsigned int wait_ms;
>  
>   wait_ms = gc_th->min_sleep_time;
> @@ -114,17 +114,15 @@ static int gc_thread_func(void *data)
>   return 0;
>  }
>  
> -int start_gc_thread(struct f2fs_sb_info *sbi)
> +int init_gc_context(struct f2fs_sb_info *sbi)
>  {
>   struct f2fs_gc_kthread *gc_th;
> - dev_t dev = sbi->sb->s_bdev->bd_dev;
> - int err = 0;
>  
>   gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
> - if (!gc_th) {
> - err = -ENOMEM;
> - goto out;
> - }
> + if (!gc_th)
> + return -ENOMEM;
> +
> + gc_th->f2fs_gc_task = NULL;
>  
>   gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
>   gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
> @@ -139,26 +137,41 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>   gc_th->atomic_file[FG_GC] = 0;
>  
>   sbi->gc_thread = gc_th;
> - init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> - sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> +
> + return 0;
> +}
> +
> +void destroy_gc_context(struct f2fs_sb_info *sbi)
> +{
> + kfree(GC_I(sbi));
> + sbi->gc_thread = NULL;
> +}
> +
> +int start_gc_thread(struct f2fs_sb_info *sbi)
> +{
> + struct f2fs_gc_kthread *gc_th = GC_I(sbi);
> + dev_t dev = sbi->sb->s_bdev->bd_dev;
> + int err = 0;
> +
> + init_waitqueue_head(&gc_th->gc_wait_queue_head);
> + gc_th->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
>   "f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
>   if (IS_ERR(gc_th->f2fs_gc_task)) {
>   err = PTR_ERR(gc_th->f2fs_gc_task);
> - kfree(gc_th);
> - sbi->gc_thread = NULL;
> + gc_th->f2fs_gc_task = NULL;
>   }
> -out:
> +
>

Re: [PATCH 4/5] f2fs: show GC failure info in debugfs

2018-04-19 Thread Jaegeuk Kim

On 04/18, Chao Yu wrote:
> This patch shows GC failure information in debugfs; for now it only
> shows the count of failures caused by atomic write.
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/debug.c |  5 +
>  fs/f2fs/f2fs.h  |  1 +
>  fs/f2fs/gc.c| 13 +++--
>  fs/f2fs/gc.h|  2 +-
>  4 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index a66107b5cfff..715beb85e9db 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -104,6 +104,8 @@ static void update_general_status(struct f2fs_sb_info 
> *sbi)
>   si->avail_nids = NM_I(sbi)->available_nids;
>   si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
>   si->bg_gc = sbi->bg_gc;
> + si->bg_atomic = sbi->gc_thread->atomic_file[BG_GC];
> + si->fg_atomic = sbi->gc_thread->atomic_file[FG_GC];

Need to change the naming like skipped_atomic_files?

>   si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
>   * 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
>   / 2;
> @@ -342,6 +344,9 @@ static int stat_show(struct seq_file *s, void *v)
>   si->bg_data_blks);
>   seq_printf(s, "  - node blocks : %d (%d)\n", si->node_blks,
>   si->bg_node_blks);
> + seq_printf(s, "Failure : atomic write %d (%d)\n",

It's not failure.

> + si->bg_atomic + si->fg_atomic,
> + si->bg_atomic);
>   seq_puts(s, "\nExtent Cache:\n");
>   seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
>   si->hit_largest, si->hit_cached,
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 3453288d6a71..567c6bb57ae3 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -3003,6 +3003,7 @@ struct f2fs_stat_info {
>   int bg_node_segs, bg_data_segs;
>   int tot_blks, data_blks, node_blks;
>   int bg_data_blks, bg_node_blks;
> + unsigned int bg_atomic, fg_atomic;
>   int curseg[NR_CURSEG_TYPE];
>   int cursec[NR_CURSEG_TYPE];
>   int curzone[NR_CURSEG_TYPE];
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 495876ca62b6..da89ca16a55d 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -135,7 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>   gc_th->gc_urgent = 0;
>   gc_th->gc_wake= 0;
>  
> - gc_th->atomic_file = 0;
> + gc_th->atomic_file[BG_GC] = 0;
> + gc_th->atomic_file[FG_GC] = 0;

Need to merge the previous patch with this.

>  
>   sbi->gc_thread = gc_th;
>   init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
> @@ -633,7 +634,7 @@ static void move_data_block(struct inode *inode, block_t 
> bidx,
>   goto out;
>  
>   if (f2fs_is_atomic_file(inode)) {
> - F2FS_I_SB(inode)->gc_thread->atomic_file++;
> + F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>   goto out;
>   }
>  
> @@ -742,7 +743,7 @@ static void move_data_page(struct inode *inode, block_t 
> bidx, int gc_type,
>   goto out;
>  
>   if (f2fs_is_atomic_file(inode)) {
> - F2FS_I_SB(inode)->gc_thread->atomic_file++;
> + F2FS_I_SB(inode)->gc_thread->atomic_file[gc_type]++;
>   goto out;
>   }
>   if (f2fs_is_pinned_file(inode)) {
> @@ -1024,7 +1025,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>   .ilist = LIST_HEAD_INIT(gc_list.ilist),
>   .iroot = RADIX_TREE_INIT(GFP_NOFS),
>   };
> - unsigned int last_atomic_file = sbi->gc_thread->atomic_file;
> + unsigned int last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>   unsigned int skipped_round = 0, round = 0;
>  
>   trace_f2fs_gc_begin(sbi->sb, sync, background,
> @@ -1078,9 +1079,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>   total_freed += seg_freed;
>  
>   if (gc_type == FG_GC) {
> - if (sbi->gc_thread->atomic_file > last_atomic_file)
> + if (sbi->gc_thread->atomic_file[FG_GC] > last_atomic_file)
>   skipped_round++;
> - last_atomic_file = sbi->gc_thread->atomic_file;
> + last_atomic_file = sbi->gc_thread->atomic_file[FG_GC];
>   round++;
>   }
>  
> diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
> index bc1d21d46ae7..a6cffe6b249b 100644
> --- a/fs/f2fs/gc.h
> +++ b/fs/f2fs/gc.h
> @@ -41,7 +41,7 @@ struct f2fs_gc_kthread {
>   unsigned int gc_wake;
>  
>   /* for stuck statistic */
> - unsigned int atomic_file;
> + unsigned int atomic_file[2];
>  };
>  
>  struct gc_inode_list {
> -- 
> 2.15.0.55.gc2ece9dc4de6
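
The counting scheme in the quoted patch can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct gc_stats`, `struct gc_rounds` and the helper names are hypothetical stand-ins, not f2fs symbols; only the `BG_GC`/`FG_GC` indexing and the `last_atomic_file`/`skipped_round`/`round` bookkeeping follow the diff above.

```c
#include <assert.h>

/* Index values mirror how atomic_file[] is indexed by gc_type in the patch. */
enum gc_type { BG_GC = 0, FG_GC = 1, NR_GC_TYPE };

/* Hypothetical stand-in for the counters kept in struct f2fs_gc_kthread. */
struct gc_stats {
	unsigned int atomic_file[NR_GC_TYPE];
};

/* Hypothetical per-invocation bookkeeping, modelled on f2fs_gc() locals. */
struct gc_rounds {
	unsigned int last_atomic_file;
	unsigned int skipped_round;
	unsigned int round;
};

/* Record that GC skipped a block because its inode is an atomic file. */
static void gc_note_atomic_skip(struct gc_stats *st, enum gc_type type)
{
	assert(type == BG_GC || type == FG_GC);
	st->atomic_file[type]++;
}

/* Snapshot the foreground counter before the GC loop starts. */
static void gc_rounds_begin(struct gc_rounds *r, const struct gc_stats *st)
{
	r->last_atomic_file = st->atomic_file[FG_GC];
	r->skipped_round = 0;
	r->round = 0;
}

/* After one victim segment: a foreground round counts as skipped if the
 * foreground counter grew while that segment was being processed. */
static void gc_rounds_end_one(struct gc_rounds *r, const struct gc_stats *st,
			      enum gc_type type)
{
	if (type != FG_GC)
		return;
	if (st->atomic_file[FG_GC] > r->last_atomic_file)
		r->skipped_round++;
	r->last_atomic_file = st->atomic_file[FG_GC];
	r->round++;
}
```

Keeping the background count separate means background skips never inflate `skipped_round`, while debugfs can still report the total as the sum of both entries.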

Re: [f2fs-dev] [PATCH] f2fs: sepearte hot/cold in free nid

2018-04-19 Thread Chao Yu

On 2018/4/20 10:30, heyunlei wrote:
> 
> 
>> -Original Message-
>> From: Chao Yu [mailto:yuch...@huawei.com]
>> Sent: Friday, April 20, 2018 9:53 AM
>> To: jaeg...@kernel.org
>> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
>> Subject: [f2fs-dev] [PATCH] f2fs: sepearte hot/cold in free nid
>>
>> Most indirect nodes, dindirect nodes, and xattr nodes are not updated
>> after they are created, while inode nodes and other direct nodes change
>> more frequently. Storing their nat entries mixed together across the
>> whole nat table therefore has two costs:
>> - the nat table fragments quickly because of the different update rates
>> - the fragmented nat table causes more nat block updates
>>
> 
> BTW, should we enable this patch:  f2fs: reuse nids more aggressively?
> 
> I think it will decrease nat area fragment and will decrease io of nat?

For a fragmented nat table, there is no difference between reusing an
obsolete nid and allocating a nid from the next nat block.

IMO, in order to decrease nat block writes, we need to add a smarter
allocation algorithm, like a filesystem does, but first I'd like to separate
hot entries from cold ones.
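
As a rough illustration of that separation, here is a minimal user-space sketch. It assumes a fixed boundary and a simple bump allocator, whereas the patch expands the ranges dynamically; `struct nid_areas`, `enum node_kind` and the function names are hypothetical, not f2fs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t nid_t;

/* Hypothetical classification of the node types named in the patch. */
enum node_kind {
	NODE_INODE,
	NODE_DIRECT,
	NODE_INDIRECT,
	NODE_DINDIRECT,
	NODE_XATTR,
};

/* Hypothetical allocator state: hot nids come from [0, boundary),
 * cold nids from [boundary, max_nid). */
struct nid_areas {
	nid_t boundary;
	nid_t max_nid;
	nid_t next_hot;
	nid_t next_cold;
};

/* Inodes and direct nodes are updated often, so they count as hot. */
static bool node_is_hot(enum node_kind kind)
{
	return kind == NODE_INODE || kind == NODE_DIRECT;
}

static void nid_areas_init(struct nid_areas *a, nid_t boundary, nid_t max_nid)
{
	assert(boundary <= max_nid);
	a->boundary = boundary;
	a->max_nid = max_nid;
	a->next_hot = 0;
	a->next_cold = boundary;
}

/* Hand out the next nid from the area matching the node kind.
 * Returns false when that area is exhausted. */
static bool nid_alloc(struct nid_areas *a, enum node_kind kind, nid_t *out)
{
	if (node_is_hot(kind)) {
		if (a->next_hot >= a->boundary)
			return false;
		*out = a->next_hot++;
	} else {
		if (a->next_cold >= a->max_nid)
			return false;
		*out = a->next_cold++;
	}
	return true;
}
```

The sketch ignores reserved nids and freeing; it only shows that frequently and rarely updated entries end up in disjoint nid ranges, and therefore in different nat blocks.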

Thanks,

Re: [PATCH 3/5] f2fs: avoid stucking GC due to atomic write

2018-04-19 Thread Jaegeuk Kim

On 04/18, Chao Yu wrote:
> f2fs doesn't allow abuse of the atomic write interface, so besides
> limiting the total memory used by in-mem pages, we also need to limit
> atomic-write usage when the filesystem is seriously fragmented.
> Otherwise we may run into an infinite loop during foreground GC because
> target blocks in the victim segment belong to a file that has been
> atomic-opened for a long time.
> 
> Now we detect failures due to atomic write in foreground GC; if the
> count exceeds a threshold, we drop all atomic-written data in cache.
> I expect this to keep the system running safely and prevent a DoS
> attack.
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/f2fs.h|  1 +
>  fs/f2fs/file.c|  5 +
>  fs/f2fs/gc.c  | 27 +++
>  fs/f2fs/gc.h  |  3 +++
>  fs/f2fs/segment.c |  1 +
>  fs/f2fs/segment.h |  2 ++
>  6 files changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index c1c3a1d11186..3453288d6a71 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -2249,6 +2249,7 @@ enum {
>   FI_EXTRA_ATTR,  /* indicate file has extra attribute */
>   FI_PROJ_INHERIT,/* indicate file inherits projectid */
>   FI_PIN_FILE,/* indicate file should not be gced */
> + FI_ATOMIC_REVOKE_REQUEST,/* indicate atomic committed data has been 
> dropped */
>  };
>  
>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 7c90ded5a431..cddd9aee1bb2 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1698,6 +1698,7 @@ static int f2fs_ioc_start_atomic_write(struct file 
> *filp)
>  skip_flush:
>   set_inode_flag(inode, FI_HOT_DATA);
>   set_inode_flag(inode, FI_ATOMIC_FILE);
> + clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
>   f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>  
>   F2FS_I(inode)->inmem_task = current;
> @@ -1746,6 +1747,10 @@ static int f2fs_ioc_commit_atomic_write(struct file 
> *filp)
>   ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
>   }
>  err_out:
> + if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
> + clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
> + ret = -EINVAL;
> + }
>   up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
>   inode_unlock(inode);
>   mnt_drop_write_file(filp);
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index bfb7a4a3a929..495876ca62b6 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -135,6 +135,8 @@ int start_gc_thread(struct f2fs_sb_info *sbi)
>   gc_th->gc_urgent = 0;
>   gc_th->gc_wake= 0;
>  
> + gc_th->atomic_file = 0;
> +
>   sbi->gc_thread = gc_th;
>   init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
>   sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
> @@ -603,7 +605,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct 
> f2fs_summary *sum,
>   * This can be used to move blocks, aka LBAs, directly on disk.
>   */
>  static void move_data_block(struct inode *inode, block_t bidx,
> - unsigned int segno, int off)
> + int gc_type, unsigned int segno, int off)
>  {
>   struct f2fs_io_info fio = {
>   .sbi = F2FS_I_SB(inode),
> @@ -630,8 +632,10 @@ static void move_data_block(struct inode *inode, block_t 
> bidx,
>   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>   goto out;
>  
> - if (f2fs_is_atomic_file(inode))
> + if (f2fs_is_atomic_file(inode)) {
> + F2FS_I_SB(inode)->gc_thread->atomic_file++;
>   goto out;
> + }
>  
>   if (f2fs_is_pinned_file(inode)) {
>   f2fs_pin_file_control(inode, true);
> @@ -737,8 +741,10 @@ static void move_data_page(struct inode *inode, block_t 
> bidx, int gc_type,
>   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>   goto out;
>  
> - if (f2fs_is_atomic_file(inode))
> + if (f2fs_is_atomic_file(inode)) {
> + F2FS_I_SB(inode)->gc_thread->atomic_file++;
>   goto out;
> + }
>   if (f2fs_is_pinned_file(inode)) {
>   if (gc_type == FG_GC)
>   f2fs_pin_file_control(inode, true);
> @@ -900,7 +906,8 @@ static void gc_data_segment(struct f2fs_sb_info *sbi, 
> struct f2fs_summary *sum,
>   start_bidx = start_bidx_of_node(nofs, inode)
>   + ofs_in_node;
>   if (f2fs_encrypted_file(inode))
> - move_data_block(inode, start_bidx, segno, off);
> + move_data_block(inode, start_bidx, gc_type,
> + segno, off);
>   else
>   move_data_page(inode, start_bidx, gc_type,
>

Re: [PATCH] virtio_ring: switch to dma_XX barriers for rpmsg

2018-04-19 Thread Jason Wang




On 2018-04-20 01:35, Michael S. Tsirkin wrote:

virtio is using barriers to order memory accesses, thus
dma_wmb/rmb is a good match.

Build-tested on x86: Before

[mst@tuck linux]$ size drivers/virtio/virtio_ring.o
   text    data     bss     dec     hex filename
  11392     820       0   12212    2fb4 drivers/virtio/virtio_ring.o

After
[mst@tuck linux]$ size drivers/virtio/virtio_ring.o
   text    data     bss     dec     hex filename
  11284     820       0   12104    2f48 drivers/virtio/virtio_ring.o

Cc: Ohad Ben-Cohen 
Cc: Bjorn Andersson 
Cc: linux-remotep...@vger.kernel.org
Signed-off-by: Michael S. Tsirkin 
---

It's good in theory, but could one of RPMSG maintainers please review
and ack this patch? Or even better test it?

All these barriers are useless on Intel anyway ...

  include/linux/virtio_ring.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index bbf3252..fab0213 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -35,7 +35,7 @@ static inline void virtio_rmb(bool weak_barriers)
if (weak_barriers)
virt_rmb();
else
-   rmb();
+   dma_rmb();
  }
  
  static inline void virtio_wmb(bool weak_barriers)

@@ -43,7 +43,7 @@ static inline void virtio_wmb(bool weak_barriers)
if (weak_barriers)
virt_wmb();
else
-   wmb();
+   dma_wmb();
  }
  
  static inline void virtio_store_mb(bool weak_barriers,


Acked-by: Jason Wang
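
For readers unfamiliar with how such write/read barrier pairs are used, the following is a user-space analogy only. It uses C11 fences, which are not the kernel's dma_wmb()/dma_rmb() and say nothing about device-visible memory; `struct slot` and its helpers are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-entry "ring": a payload plus an index that publishes it. */
struct slot {
	uint32_t payload;
	atomic_uint avail_idx;
};

static void slot_init(struct slot *s)
{
	assert(s != NULL);
	s->payload = 0;
	atomic_init(&s->avail_idx, 0);
}

/* Producer: write the payload, then publish the new index.
 * The release fence plays the role a write barrier plays in the driver. */
static void slot_publish(struct slot *s, uint32_t value)
{
	unsigned int idx = atomic_load_explicit(&s->avail_idx,
						memory_order_relaxed);

	s->payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->avail_idx, idx + 1, memory_order_relaxed);
}

/* Consumer: read the index, then the payload.
 * The acquire fence plays the role a read barrier plays in the driver. */
static bool slot_consume(struct slot *s, unsigned int *seen, uint32_t *out)
{
	unsigned int idx = atomic_load_explicit(&s->avail_idx,
						memory_order_relaxed);

	if (idx == *seen)
		return false;	/* nothing new has been published */
	atomic_thread_fence(memory_order_acquire);
	*out = s->payload;
	*seen = idx;
	return true;
}
```

The point of the pairing is that a consumer which observes the new index is also guaranteed to observe the payload written before it; which barrier flavour is sufficient for a given transport is exactly what the patch asks the rpmsg maintainers to confirm.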

Re: [PATCH v8 15/18] mm, fs, dax: handle layout changes to pinned dax mappings

2018-04-19 Thread Dan Williams

On Thu, Apr 19, 2018 at 3:44 AM, Jan Kara  wrote:
> On Fri 13-04-18 15:03:51, Dan Williams wrote:
>> On Mon, Apr 9, 2018 at 9:51 AM, Dan Williams  
>> wrote:
>> > On Mon, Apr 9, 2018 at 9:49 AM, Jan Kara  wrote:
>> >> On Sat 07-04-18 12:38:24, Dan Williams wrote:
>> > [..]
>> >>> I wonder if this can be trivially solved by using srcu. I.e. we don't
>> >>> need to wait for a global quiescent state, just a
>> >>> get_user_pages_fast() quiescent state. ...or is that an abuse of the
>> >>> srcu api?
>> >>
>> >> Well, I'd rather use the percpu rwsemaphore (linux/percpu-rwsem.h) than
>> >> SRCU. It is a more-or-less standard locking mechanism rather than relying
>> >> on implementation properties of SRCU which is a data structure protection
>> >> method. And the overhead of percpu rwsemaphore for your use case should be
>> >> about the same as that of SRCU.
>> >
>> > I was just about to ask that. Yes, it seems they would share similar
>> > properties and it would be better to use the explicit implementation
>> > rather than a side effect of srcu.
>>
>> ...unfortunately:
>>
>>  BUG: sleeping function called from invalid context at
>> ./include/linux/percpu-rwsem.h:34
>>  [..]
>>  Call Trace:
>>   dump_stack+0x85/0xcb
>>   ___might_sleep+0x15b/0x240
>>   dax_layout_lock+0x18/0x80
>>   get_user_pages_fast+0xf8/0x140
>>
>> ...and thinking about it more srcu is a better fit. We don't need the
>> 100% exclusion provided by an rwsem we only need the guarantee that
>> all cpus that might have been running get_user_pages_fast() have
>> finished it at least once.
>>
>> In my tests synchronize_srcu is a bit slower than unpatched for the
>> trivial 100 truncate test, but certainly not the 200x latency you were
>> seeing with synchronize_rcu.
>>
>> Elapsed time:
>> 0.006149178 unpatched
>> 0.009426360 srcu
>
> Hum, right. Yesterday I was looking into KSM for a different reason and
> I've noticed it also does writeprotect pages and deals with races with GUP.
> And what KSM relies on is:
>
> write_protect_page()
>   ...
>   entry = ptep_clear_flush(vma, pvmw.address, pvmw.pte);
>   /*
>* Check that no O_DIRECT or similar I/O is in progress on the
>* page
>*/
>   if (page_mapcount(page) + 1 + swapped != page_count(page)) {
> page used -> bail

Slick.

>   }
>
> And this really works because gup_pte_range() does:
>
>   page = pte_page(pte);
>   head = compound_head(page);
>
>   if (!page_cache_get_speculative(head))
> goto pte_unmap;
>
>   if (unlikely(pte_val(pte) != pte_val(*ptep))) {
> bail

Need to add a similar check to __gup_device_huge_pmd.

>   }
>
> So either write_protect_page() page sees the elevated reference or
> gup_pte_range() bails because it will see the pte changed.
>
> In the truncate path things are a bit different but in principle the same
> should work - once truncate blocks page faults and unmaps pages from page
> tables, we can be sure GUP will not grab the page anymore or we'll see
> elevated page count. So IMO there's no need for any additional locking
> against the GUP path (but a comment explaining this is highly desirable I
> guess).

Yes, those "pte_val(pte) != pte_val(*ptep)" checks should be
documented for the same reason we require comments on rmb/wmb pairs.
I'll take a look, thanks Jan.
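
The handshake Jan describes can be modelled in user space roughly as below. This is a simplified sketch under stated assumptions: `struct model_page`, `struct model_pte` and the `model_*` functions are hypothetical, a cleared entry is represented by 0, and a successful write-protect is modelled as leaving the entry cleared (closer to the truncate case than to KSM, which installs a read-only pte).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical page: refcount stands in for page_count(),
 * mapcount for page_mapcount(). */
struct model_page {
	atomic_int refcount;
	int mapcount;
};

/* Hypothetical page table entry; 0 means "cleared". */
struct model_pte {
	_Atomic unsigned long val;
};

static void model_put_page(struct model_page *page)
{
	int prev = atomic_fetch_sub(&page->refcount, 1);

	assert(prev > 0);
}

/* GUP-fast side: take a speculative reference, then re-read the entry.
 * If the entry changed underneath us, drop the reference and bail. */
static bool model_gup_fast(struct model_pte *ptep, struct model_page *page)
{
	unsigned long pte = atomic_load(&ptep->val);

	if (pte == 0)
		return false;
	atomic_fetch_add(&page->refcount, 1);
	if (atomic_load(&ptep->val) != pte) {
		model_put_page(page);
		return false;
	}
	return true;
}

/* Write-protect side: clear the entry first, then look for extra
 * references. The "+ 1" is the reference held by the caller itself. */
static bool model_write_protect(struct model_pte *ptep,
				struct model_page *page)
{
	unsigned long old = atomic_exchange(&ptep->val, 0);

	if (page->mapcount + 1 != atomic_load(&page->refcount)) {
		atomic_store(&ptep->val, old);	/* page in use: restore, bail */
		return false;
	}
	return true;
}
```

Because each side modifies its own variable before reading the other's, at least one of them notices the race: either the write-protect path sees the elevated count, or the GUP path sees the changed entry.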

Re: general protection fault in kernfs_kill_sb

2018-04-19 Thread Eric Biggers

On Mon, Apr 02, 2018 at 03:34:15PM +0100, Al Viro wrote:
> On Mon, Apr 02, 2018 at 07:40:22PM +0900, Tetsuo Handa wrote:
> 
> > That commit assumes that calling kill_sb() from deactivate_locked_super(s)
> > without corresponding fill_super() is safe. We have so far crashed with
> > rpc_mount() and kernfs_mount_ns(). Is that really safe?
> 
>   Consider the case when fill_super() returns an error immediately.
> It is exactly the same situation.  And ->kill_sb() *is* called in cases
> when fill_super() has failed.  Always had been - it's much less boilerplate
> that way.
> 
>   deactivate_locked_super() on that failure exit is the least painful
> variant, unfortunately.
> 
>   Filesystems with ->kill_sb() instances that rely upon something
> done between sget() and the first failure exit after it need to be fixed.
> And yes, that should've been spotted back then.  Sorry.
> 
> Fortunately, we don't have many of those - kill_{block,litter,anon}_super()
> are safe and those are the majority.  Looking through the rest uncovers
> some bugs; so far all I've seen were already there.  Note that normally
> we have something like
> static void affs_kill_sb(struct super_block *sb)
> {
> struct affs_sb_info *sbi = AFFS_SB(sb);
> kill_block_super(sb);
> if (sbi) {
> affs_free_bitmap(sb);
> affs_brelse(sbi->s_root_bh);
> kfree(sbi->s_prefix);
> mutex_destroy(&sbi->s_bmlock);
> kfree(sbi);
> }
> }
> which basically does one of the safe ones augmented with something that
> takes care *not* to assume that e.g. ->s_fs_info has been allocated.
> Not everyone does, though:
> 
> jffs2_fill_super():
> c = kzalloc(sizeof(*c), GFP_KERNEL);
> if (!c)
> return -ENOMEM;
> in the very beginning.  So we can return from it with NULL ->s_fs_info.
> Now, consider
> struct jffs2_sb_info *c = JFFS2_SB_INFO(sb);
> if (!(sb->s_flags & MS_RDONLY))
> jffs2_stop_garbage_collect_thread(c);
> in jffs2_kill_sb() and
> void jffs2_stop_garbage_collect_thread(struct jffs2_sb_info *c)
> {
> int wait = 0;
> spin_lock(&c->erase_completion_lock);
> if (c->gc_task) {
> 
> IOW, fail that kzalloc() (or, indeed, an allocation in register_shrinker())
> and eat an oops.  Always had been there, always hard to hit without
> fault injectors and fortunately trivial to fix.
> 
> Similar in nfs_kill_super() calling nfs_free_server().
> Similar in v9fs_kill_super() with v9fs_session_cancel()/v9fs_session_close() 
> calls.
> Similar in hypfs_kill_super(), afs_kill_super(), btrfs_kill_super(), 
> cifs_kill_sb()
> (all trivial to fix)
> 
> Aha... nfsd_umount() is a new regression.
> 
> orangefs: old, trivial to fix.
> 
> cgroup_kill_sb(): old, hopefully easy to fix.  Note that kernfs_root_from_sb()
> can bloody well return NULL, making cgroup_root_from_kf() oops.  Always had 
> been
> there.
> 
> AFAICS, after discarding the instances that do the right thing we are left 
> with:
> hypfs_kill_super, rdt_kill_sb, v9fs_kill_super, afs_kill_super, 
> btrfs_kill_super,
> cifs_kill_sb, jffs2_kill_sb, nfs_kill_super, nfsd_umount, orangefs_kill_sb,
> proc_kill_sb, sysfs_kill_sb, cgroup_kill_sb, rpc_kill_sb.
> 
> Out of those, nfsd_umount(), proc_kill_sb() and rpc_kill_sb() are regressions.
> So are rdt_kill_sb() and sysfs_kill_sb() (victims of the issue you've spotted
> in kernfs_kill_sb()).  The rest are old (and I wonder if syzbot had been
> catching those - they are also dependent upon a specific allocation failing
> at the right time).
> 

Fix for the kernfs bug is now queued in vfs/for-linus:

#syz fix: kernfs: deal with early sget() failures

syzkaller just recently (3 weeks ago) gained the ability to mount filesystem
images, so that's the main reason for the increase in filesystem bug reports.
Each time syzkaller is updated to cover more code, bugs are found.

- Eric
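
The safe pattern Al describes (a ->kill_sb() that does not assume fill_super() got far enough to allocate ->s_fs_info) can be sketched in user space like this. All `demo_*` names are hypothetical; nothing here is a real VFS interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-filesystem private data. */
struct demo_sb_info {
	char *prefix;
	bool gc_running;
};

/* Hypothetical superblock with an optional private pointer. */
struct demo_super_block {
	struct demo_sb_info *s_fs_info;
};

/* fill_super-like setup; fail_alloc simulates the first allocation
 * failing, in which case s_fs_info is left NULL. */
static int demo_fill_super(struct demo_super_block *sb, bool fail_alloc)
{
	struct demo_sb_info *sbi;

	assert(sb != NULL);
	sbi = fail_alloc ? NULL : calloc(1, sizeof(*sbi));
	if (!sbi)
		return -ENOMEM;
	sbi->prefix = malloc(8);	/* may be NULL; free(NULL) is fine */
	sbi->gc_running = true;
	sb->s_fs_info = sbi;
	return 0;
}

/* kill_sb-like teardown that tolerates a superblock whose setup
 * failed before s_fs_info was allocated. */
static void demo_kill_sb(struct demo_super_block *sb)
{
	struct demo_sb_info *sbi = sb->s_fs_info;

	if (!sbi)
		return;		/* nothing was set up, nothing to undo */
	sbi->gc_running = false;
	free(sbi->prefix);
	free(sbi);
	sb->s_fs_info = NULL;
}
```

The buggy instances listed above correspond to dropping the `if (!sbi)` check and dereferencing the pointer unconditionally.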

Re: [v2] prctl: Deprecate non PR_SET_MM_MAP operations

2018-04-19 Thread Sergey Senozhatsky

On (04/05/18 21:26), Cyrill Gorcunov wrote:
[..]
> -
>  #ifdef CONFIG_CHECKPOINT_RESTORE
>   if (opt == PR_SET_MM_MAP || opt == PR_SET_MM_MAP_SIZE)
>   return prctl_set_mm_map(opt, (const void __user *)addr, arg4);
>  #endif
>  
> - if (!capable(CAP_SYS_RESOURCE))
> - return -EPERM;
> -
> - if (opt == PR_SET_MM_EXE_FILE)
> - return prctl_set_mm_exe_file(mm, (unsigned int)addr);
> -
> - if (opt == PR_SET_MM_AUXV)
> - return prctl_set_auxv(mm, addr, arg4);

Then validate_prctl_map() and prctl_set_mm_exe_file() can be moved
under CONFIG_CHECKPOINT_RESTORE ifdef.

---

 kernel/sys.c | 126 +--
 1 file changed, 63 insertions(+), 63 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 6bdffe264303..86e5ef1a5612 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1815,68 +1815,7 @@ SYSCALL_DEFINE1(umask, int, mask)
return mask;
 }
 
-static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
-{
-   struct fd exe;
-   struct file *old_exe, *exe_file;
-   struct inode *inode;
-   int err;
-
-   exe = fdget(fd);
-   if (!exe.file)
-   return -EBADF;
-
-   inode = file_inode(exe.file);
-
-   /*
-* Because the original mm->exe_file points to executable file, make
-* sure that this one is executable as well, to avoid breaking an
-* overall picture.
-*/
-   err = -EACCES;
-   if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
-   goto exit;
-
-   err = inode_permission(inode, MAY_EXEC);
-   if (err)
-   goto exit;
-
-   /*
-* Forbid mm->exe_file change if old file still mapped.
-*/
-   exe_file = get_mm_exe_file(mm);
-   err = -EBUSY;
-   if (exe_file) {
-   struct vm_area_struct *vma;
-
-   down_read(&mm->mmap_sem);
-   for (vma = mm->mmap; vma; vma = vma->vm_next) {
-   if (!vma->vm_file)
-   continue;
-   if (path_equal(&vma->vm_file->f_path,
-  &exe_file->f_path))
-   goto exit_err;
-   }
-
-   up_read(&mm->mmap_sem);
-   fput(exe_file);
-   }
-
-   err = 0;
-   /* set the new file, lockless */
-   get_file(exe.file);
-   old_exe = xchg(&mm->exe_file, exe.file);
-   if (old_exe)
-   fput(old_exe);
-exit:
-   fdput(exe);
-   return err;
-exit_err:
-   up_read(&mm->mmap_sem);
-   fput(exe_file);
-   goto exit;
-}
-
+#ifdef CONFIG_CHECKPOINT_RESTORE
 /*
  * WARNING: we don't require any capability here so be very careful
  * in what is allowed for modification from userspace.
@@ -1968,7 +1907,68 @@ static int validate_prctl_map(struct prctl_mm_map 
*prctl_map)
return error;
 }
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
+{
+   struct fd exe;
+   struct file *old_exe, *exe_file;
+   struct inode *inode;
+   int err;
+
+   exe = fdget(fd);
+   if (!exe.file)
+   return -EBADF;
+
+   inode = file_inode(exe.file);
+
+   /*
+* Because the original mm->exe_file points to executable file, make
+* sure that this one is executable as well, to avoid breaking an
+* overall picture.
+*/
+   err = -EACCES;
+   if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
+   goto exit;
+
+   err = inode_permission(inode, MAY_EXEC);
+   if (err)
+   goto exit;
+
+   /*
+* Forbid mm->exe_file change if old file still mapped.
+*/
+   exe_file = get_mm_exe_file(mm);
+   err = -EBUSY;
+   if (exe_file) {
+   struct vm_area_struct *vma;
+
+   down_read(&mm->mmap_sem);
+   for (vma = mm->mmap; vma; vma = vma->vm_next) {
+   if (!vma->vm_file)
+   continue;
+   if (path_equal(&vma->vm_file->f_path,
+  &exe_file->f_path))
+   goto exit_err;
+   }
+
+   up_read(&mm->mmap_sem);
+   fput(exe_file);
+   }
+
+   err = 0;
+   /* set the new file, lockless */
+   get_file(exe.file);
+   old_exe = xchg(&mm->exe_file, exe.file);
+   if (old_exe)
+   fput(old_exe);
+exit:
+   fdput(exe);
+   return err;
+exit_err:
+   up_read(&mm->mmap_sem);
+   fput(exe_file);
+   goto exit;
+}
+
 static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long 
data_size)
 {
struct prctl_mm_map prctl_map = { .exe_fd = (u32)-1, };

Re: [PATCH v5 4/4] zram: introduce zram memory tracking

2018-04-19 Thread Minchan Kim

On Fri, Apr 20, 2018 at 11:18:34AM +0900, Sergey Senozhatsky wrote:
> On (04/20/18 11:09), Minchan Kim wrote:
> [..]
> > > hm, OK, can we get this info into the changelog?  
> > 
> > No problem. I will add as follows,
> > 
> > "I used the feature a few years ago to find memory hoggers in userspace
> > to notice them what memory they have wasted without touch for a long time.
> > With it, they could reduce unnecessary memory space. However, at that time,
> > I hacked up zram for the feature but now I need the feature again so
> > I decided it would be better to upstream rather than keeping it alone.
> > I hope I submit the userspace tool to use the feature soon"
> 
> Shall we then just wait until you resubmit the "complete" patch set: zram
> tracking + the user space tool which would parse the tracking output?

tl;dr: I think the userspace tool is just ancillary, not a must.

Although my main purpose is to find idle memory hoggers, I don't think
a userspace tool for that is required to merge this feature, because
someone might want to use it for other things regardless of the tool.

Examples that come to mind are seeing what the swap write pattern looks
like, how sparse the swap writes are, and so on. :)

RE: [f2fs-dev] [PATCH] f2fs: sepearte hot/cold in free nid

2018-04-19 Thread heyunlei



>-Original Message-
>From: Chao Yu [mailto:yuch...@huawei.com]
>Sent: Friday, April 20, 2018 9:53 AM
>To: jaeg...@kernel.org
>Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
>Subject: [f2fs-dev] [PATCH] f2fs: sepearte hot/cold in free nid
>
>Most indirect nodes, dindirect nodes, and xattr nodes are not updated
>after they are created, while inode nodes and other direct nodes change
>more frequently. Storing their nat entries mixed together across the
>whole nat table therefore has two costs:
>- the nat table fragments quickly because of the different update rates
>- the fragmented nat table causes more nat block updates
>

BTW, should we enable this patch:  f2fs: reuse nids more aggressively?

I think it will decrease nat area fragment and will decrease io of nat?

>To solve the above issue, we separate the whole nat table into
>two parts:
>a. Hot free nid area:
> - range: [nid #0, nid #x)
> - store node block address for
>   * inode node
>   * other direct node
>b. Cold free nid area:
> - range: [nid #x, max nid)
> - store node block address for
>   * indirect node
>   * dindirect node
>   * xattr node
>
>Allocation strategy example:
>
>Free nid: '-'
>Used nid: '='
>
>1. Initial status:
>Free Nids:     |---|
>               ^   ^   ^   ^
>Alloc Range:   |---|   |---|
>               hot_start   hot_end   cold_start  cold_end
>
>2. Free nids have run out:
>Free Nids:     |===-===|
>               ^   ^   ^   ^
>Alloc Range:   |===|   |===|
>               hot_start   hot_end   cold_start  cold_end
>
>3. Expand hot/cold area range:
>Free Nids:     |===-===|
>               ^   ^   ^   ^
>Alloc Range:   |===|   |===|
>               hot_start   hot_end   cold_start  cold_end
>
>4. Hot free nids have run out:
>Free Nids:     |===-===|
>               ^   ^   ^   ^
>Alloc Range:   |===|   |===|
>               hot_start   hot_end   cold_start  cold_end
>
>5. Expand hot area range, hot/cold area boundary has been fixed:
>Free Nids:     |===-===|
>               ^   ^   ^
>Alloc Range:   |===|===|
>               hot_start   hot_end(cold_start)   cold_end
>
>Run xfstests with generic/*:
>
>before:
>node_write:169660
>cp_count:  60118
>node/cp:   2.82
>
>after:
>node_write:159145
>cp_count:  84501
>node/cp:   2.64
>
>Signed-off-by: Chao Yu 
>---
> fs/f2fs/checkpoint.c |   4 -
> fs/f2fs/debug.c  |   6 +-
> fs/f2fs/f2fs.h   |  19 +++-
> fs/f2fs/inode.c  |   2 +-
> fs/f2fs/namei.c  |   2 +-
> fs/f2fs/node.c   | 302 ---
> fs/f2fs/node.h   |  17 +--
> fs/f2fs/segment.c|   8 +-
> fs/f2fs/shrinker.c   |   3 +-
> fs/f2fs/xattr.c  |  10 +-
> 10 files changed, 221 insertions(+), 152 deletions(-)
>
>diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>index 96785ffc6181..c17feec72c74 100644
>--- a/fs/f2fs/checkpoint.c
>+++ b/fs/f2fs/checkpoint.c
>@@ -1029,14 +1029,10 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
> static void __prepare_cp_block(struct f2fs_sb_info *sbi)
> {
>   struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
>-  struct f2fs_nm_info *nm_i = NM_I(sbi);
>-  nid_t last_nid = nm_i->next_scan_nid;
>
>-  next_free_nid(sbi, &last_nid);
>   ckpt->valid_block_count = cpu_to_le64(valid_user_blocks(sbi));
>   ckpt->valid_node_count = cpu_to_le32(valid_node_count(sbi));
>   ckpt->valid_inode_count = cpu_to_le32(valid_inode_count(sbi));
>-  ckpt->next_free_nid = cpu_to_le32(last_nid);
> }
>
> /*
>diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>index 7bb036a3bb81..b13c1d4f110f 100644
>--- a/fs/f2fs/debug.c
>+++ b/fs/f2fs/debug.c
>@@ -100,7 +100,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
>   si->dirty_nats = NM_I(sbi)->dirty_nat_cnt;
>   si->sits = MAIN_SEGS(sbi);
>   si->dirty_sits = SIT

Re: BUG: corrupted list in __dentry_kill

2018-04-19 Thread Eric Biggers

On Sat, Mar 31, 2018 at 04:01:02PM -0700, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on bpf-next commit
> 7828f20e3779e4e85e55371e0e43f5006a15fb41 (Sat Mar 31 00:17:57 2018 +)
> Merge branch 'bpf-cgroup-bind-connect'
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=f3bd89a5ab3266b10540
> 
> So far this crash happened 22 times on bpf-next, upstream.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=6290970458980352
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=6577156880596992
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=5107570603720704
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=5909223872832634926
> compiler: gcc (GCC) 7.1.1 20170620
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+f3bd89a5ab3266b10...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> RBP: 7ffd1bbb3ae0 R08: 2200 R09: 0003
> R10:  R11: 0246 R12: 
> R13: 0003 R14: 1380 R15: 7ffd1bbb3378
> list_del corruption. prev->next should be a8104008, but was
> 081c6144
> [ cut here ]
> kernel BUG at lib/list_debug.c:53!
> invalid opcode:  [#1] SMP KASAN
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 4448 Comm: syzkaller853443 Not tainted 4.16.0-rc6+ #43
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:__list_del_entry_valid+0xef/0x150 lib/list_debug.c:51
> RSP: 0018:8801b8f977a0 EFLAGS: 00010282
> RAX: 0054 RBX: 8801b0c1cf60 RCX: 
> RDX: 0054 RSI: 1100371f2ea9 RDI: ed00371f2ee8
> RBP: 8801b8f977b8 R08: 1100371f2e40 R09: 
> R10: 8801b8f97778 R11:  R12: 8801b0c1cde0
> R13: 1100371f2efd R14: 8801b0c1cc70 R15: dc00
> FS:  023a5880() GS:8801db00() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 004b6fbc CR3: 0001c85a6004 CR4: 001606f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  __list_del_entry include/linux/list.h:117 [inline]
>  dentry_unlist fs/dcache.c:518 [inline]
>  __dentry_kill+0x260/0x700 fs/dcache.c:571
>  dentry_kill fs/dcache.c:616 [inline]
>  dput.part.20+0x5a0/0x830 fs/dcache.c:831
>  dput+0x1f/0x30 fs/dcache.c:795
>  rpc_gssd_dummy_depopulate net/sunrpc/rpc_pipe.c:1381 [inline]
>  rpc_fill_super+0x628/0xae0 net/sunrpc/rpc_pipe.c:1426
>  mount_ns+0xc4/0x190 fs/super.c:1036
>  rpc_mount+0x9e/0xd0 net/sunrpc/rpc_pipe.c:1451
>  mount_fs+0x66/0x2d0 fs/super.c:1222
>  vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
>  vfs_kern_mount fs/namespace.c:2509 [inline]
>  do_new_mount fs/namespace.c:2512 [inline]
>  do_mount+0xea4/0x2bb0 fs/namespace.c:2842
>  SYSC_mount fs/namespace.c:3058 [inline]
>  SyS_mount+0xab/0x120 fs/namespace.c:3035
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x442759
> RSP: 002b:7ffd1bbb3238 EFLAGS: 0246 ORIG_RAX: 00a5
> RAX: ffda RBX:  RCX: 00442759
> RDX: 22c0 RSI: 2140 RDI: 2300
> RBP: 7ffd1bbb3ae0 R08: 2200 R09: 0003
> R10:  R11: 0246 R12: 
> R13: 0003 R14: 1380 R15: 7ffd1bbb3378
> Code: 4c 89 e2 48 c7 c7 c0 bf 75 87 e8 35 c1 46 fe 0f 0b 48 c7 c7 20 c0 75
> 87 e8 27 c1 46 fe 0f 0b 48 c7 c7 80 c0 75 87 e8 19 c1 46 fe <0f> 0b 48 c7 c7
> e0 c0 75 87 e8 0b c1 46 fe 0f 0b 48 89 df 48 89
> RIP: __list_del_entry_valid+0xef/0x150 lib/list_debug.c:51 RSP:
> 8801b8f977a0
> ---[ end trace e1b9954cded9aca7 ]---
> 
> 
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
> 
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is
> merged into any tree, please reply to this email with:
> #syz fix: exact-commit-title

#syz fix: rpc_pipefs: fix double-dput()

Al, it would be helpful if for syzbot-reported bugs you used the Reported-by
line suggested in the bug report.  That allows the bug to be automatically
closed, and it's also useful for people who are looking for syzbot bug fixes.

Thanks!

Eric
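
For readers unfamiliar with the "list_del corruption" line in the report above: it comes from a sanity check on the neighbours of the entry being unlinked. What follows is a simplified userspace sketch of that kind of check, not the kernel's lib/list_debug.c code. The type and function names are made up for illustration, and clearing the pointers to NULL stands in for the kernel's poisoning. It shows why unlinking the same entry twice (as a double dput() would) trips the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative doubly linked circular list node (not the kernel's list_head). */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	assert(head != NULL);
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add_head(struct list_node *entry, struct list_node *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Unlink entry, but only if both neighbours still point back at it.
 * Returns false instead of corrupting the list when they do not.
 */
static bool list_del_checked(struct list_node *entry)
{
	struct list_node *prev = entry->prev;
	struct list_node *next = entry->next;

	if (!prev || !next || prev->next != entry || next->prev != entry)
		return false;

	next->prev = prev;
	prev->next = next;
	/* Mark the entry as unlinked so a second removal is detected. */
	entry->next = NULL;
	entry->prev = NULL;
	return true;
}
```

In this sketch the first list_del_checked() on an entry succeeds and a second call on the same entry returns false, which is the userspace analogue of the BUG in the trace.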

Re: [PATCH v5 4/4] zram: introduce zram memory tracking

2018-04-19 Thread Sergey Senozhatsky

On (04/20/18 11:09), Minchan Kim wrote:
[..]
> > hm, OK, can we get this info into the changelog?  
> 
> No problem. I will add as follows,
> 
> "I used the feature a few years ago to find memory hoggers in userspace
> and to show them what memory they had wasted by leaving it untouched for a
> long time. With it, they could reduce unnecessary memory usage. At that time
> I hacked up zram for the feature, but now that I need the feature again
> I decided it would be better to upstream it rather than keep it private.
> I hope to submit the userspace tool that uses the feature soon"

Shall we then just wait until you resubmit the "complete" patch set: zram
tracking + the user space tool which would parse the tracking output?

-ss

Re: [PATCH] iommu/vt-d: fix usage of force parameter in intel_ir_reconfigure_irte()

2018-04-19 Thread Jag Raman



> On Apr 4, 2018, at 2:06 PM, Jag Raman  wrote:
> 
> 
> 
> On 3/6/2018 5:39 PM, Jagannathan Raman wrote:
>> It was noticed that the IRTE configured for guest OS kernel
>> was over-written while the guest was running. As a result,
>> vt-d Posted Interrupts configured for the guest are not being
>> delivered directly, and instead bounces off the host. Every
>> interrupt delivery takes a VM Exit.
>> It was noticed that the following stack is doing the over-write:
>> [  147.463177]  modify_irte+0x171/0x1f0
>> [  147.463405]  intel_ir_set_affinity+0x5c/0x80
>> [  147.463641]  msi_domain_set_affinity+0x32/0x90
>> [  147.463881]  irq_do_set_affinity+0x37/0xd0
>> [  147.464125]  irq_set_affinity_locked+0x9d/0xb0
>> [  147.464374]  __irq_set_affinity+0x42/0x70
>> [  147.464627]  write_irq_affinity.isra.5+0xe1/0x110
>> [  147.464895]  proc_reg_write+0x38/0x70
>> [  147.465150]  __vfs_write+0x36/0x180
>> [  147.465408]  ? handle_mm_fault+0xdf/0x200
>> [  147.465671]  ? _cond_resched+0x15/0x30
>> [  147.465936]  vfs_write+0xad/0x1a0
>> [  147.466204]  SyS_write+0x52/0xc0
>> [  147.466472]  do_syscall_64+0x74/0x1a0
>> [  147.466744]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>> reversing the sense of force check in intel_ir_reconfigure_irte()
>> restores proper posted interrupt functionality
>> Signed-off-by: Jagannathan Raman 
>> ---
>>  Hi Thomas,
>>  I noticed that you added intel_ir_reconfigure_irte() with the
>>  following commit:
>>  d491bdff888e ("iommu/vt-d: Reevaluate vector configuration on
>>  activate()")
>>  Could you please confirm the usage of "force" parameter in
>>  intel_ir_reconfigure_irte()?
>>  drivers/iommu/intel_irq_remapping.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/drivers/iommu/intel_irq_remapping.c 
>> b/drivers/iommu/intel_irq_remapping.c
>> index 66f69af..3062a15 100644
>> --- a/drivers/iommu/intel_irq_remapping.c
>> +++ b/drivers/iommu/intel_irq_remapping.c
>> @@ -1136,7 +1136,7 @@ static void intel_ir_reconfigure_irte(struct irq_data 
>> *irqd, bool force)
>>  irte->dest_id = IRTE_DEST(cfg->dest_apicid);
>>  /* Update the hardware only if the interrupt is in remapped mode. */
>> -if (!force || ir_data->irq_2_iommu.mode == IRQ_REMAPPING)
>> +if (force || ir_data->irq_2_iommu.mode == IRQ_REMAPPING)
>>  modify_irte(&ir_data->irq_2_iommu, irte);
>>  }
>>  
> 
> *ping*

*ping*

> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
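
The one-line change quoted above is easiest to read as a truth table. Below is a minimal sketch comparing the condition before and after the patch. The enum and function names are stand-ins for illustration only and are not the kernel's types; only the two boolean expressions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the two IRTE modes; names are illustrative only. */
enum irq_mode { MODE_REMAPPING, MODE_POSTING };

/* Condition before the patch: the hardware is updated whenever force is false. */
static bool should_update_old(bool force, enum irq_mode mode)
{
	return !force || mode == MODE_REMAPPING;
}

/* Condition after the patch: a posted entry is rewritten only when forced. */
static bool should_update_new(bool force, enum irq_mode mode)
{
	return force || mode == MODE_REMAPPING;
}

/* Self-check of the case described in the changelog. */
static void check_predicates(void)
{
	/* Affinity change (force == false) while the entry is in posted mode. */
	assert(should_update_old(false, MODE_POSTING));   /* overwrites the IRTE */
	assert(!should_update_new(false, MODE_POSTING));  /* leaves it alone */
}
```

With force == false and the entry in posted mode, the old condition evaluates to true and the new one to false, which matches the overwrite described in the changelog.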

Re: [PATCH] printk: Ratelimit messages printed by console drivers

2018-04-19 Thread Sergey Senozhatsky

On (04/19/18 14:53), Petr Mladek wrote:
> > > > 
> > > > Besides 100 lines is absolutely not enough for any real lockdep splat.
> > > > My call would be - up to 1000 lines in a 1 minute interval.
> 
> But this would break the intention of this patch.

You picked an arbitrary value and now you are saying that any other
value will not work?

> Come on guys! The first reaction how to fix the infinite loop was
> to fix the console drivers and remove the recursive messages. We are
> talking about messages that should not be there or they should
> get replaced by WARN_ONCE(), print_once() or so. This patch only
> give us a chance to see the problem and do not blow up immediately.
> 
> I am fine with increasing the number of lines. But we need to keep
> the timeout long. In fact, 1 hour is still rather short from my POV.

Disagree.

I saw 3 or 4 lockdep reports coming from console drivers. "100 lines"
is way too restrictive. I want to have a complete report; not the first
50 lines, not the first 103 lines, which would "hint" me that "hey, there
is something wrong there, but you are on your own to figure out the rest".

> > > Well, if we want to basically turn printk_safe() into 
> > > printk_safe_ratelimited().
> > > I'm not so sure about it.
> 
> No, it is not about printk_safe(). The ratelimit is active when
> console_owner == current. It triggers when printk() is called
> inside

"console_owner == current" is exactly the point when we call console
drivers and add scheduler, networking, timekeeping, etc. locks to the
picture. And so far all of the lockdep reports that we had were from
call_console_drivers(). So it very much is about printk_safe().

> > > Besides the patch also rate limits printk_nmi->logbuf - the logbuf
> > > PRINTK_NMI_DEFERRED_CONTEXT_MASK bypass, which is way too important
> > > to rate limit it - for no reason.
> 
> Again. It has the effect only when console_owner == current. It means
> that it affects "only" NMIs that interrupt console_unlock() when calling
> console drivers.

What is your objection here? NMIs can come anytime.

> > One more thing,
> > I'd really prefer to rate limit the function which flushes per-CPU
> > printk_safe buffers; not the function that appends new messages to
> > the per-CPU printk_safe buffers.
> 
> I wonder if this opinion is still valid after explaining the
> dependency on printk_safe(). In each case, it sounds weird
> to block printk_safe buffers with some "unwanted" messages.
> Or maybe I miss something.

I'm not following.

The fact that some consoles under some circumstances can add unwanted
messages to the buffer does not look like a good enough reason to start
rate limiting _all_ messages and to potentially discard the _important_
ones.

-ss

Re: [PATCH net-next] net-next/hinic: add arm64 support

2018-04-19 Thread Zhao Chen

On 2018/4/20 1:34, David Miller wrote:
> From: Zhao Chen 
> Date: Wed, 18 Apr 2018 06:07:39 -0400
> 
>> This patch enables arm64 platform support for the HINIC driver.
>>
>> Signed-off-by: Zhao Chen 
> 
> Applied, thank you.
> 
> .
> 
Thanks, David.

Re: [PATCH v5 4/4] zram: introduce zram memory tracking

2018-04-19 Thread Minchan Kim

On Wed, Apr 18, 2018 at 02:07:15PM -0700, Andrew Morton wrote:
> On Wed, 18 Apr 2018 10:26:36 +0900 Minchan Kim  wrote:
> 
> > Hi Andrew,
> > 
> > On Tue, Apr 17, 2018 at 02:59:21PM -0700, Andrew Morton wrote:
> > > On Mon, 16 Apr 2018 18:09:46 +0900 Minchan Kim  wrote:
> > > 
> > > > zRam as swap is useful for small memory device. However, swap means
> > > > those pages on zram are mostly cold pages due to VM's LRU algorithm.
> > > > Especially, once init data for application are touched for launching,
> > > > they tend to be not accessed any more and finally swapped out.
> > > > zRAM can store such cold pages as compressed form but it's pointless
> > > > to keep in memory. Better idea is app developers free them directly
> > > > rather than remaining them on heap.
> > > > 
> > > > This patch tell us last access time of each block of zram via
> > > > "cat /sys/kernel/debug/zram/zram0/block_state".
> > > > 
> > > > The output is as follows,
> > > >   30075.033841 .wh
> > > >   30163.806904 s..
> > > >   30263.806919 ..h
> > > > 
> > > > First column is zram's block index and 3rh one represents symbol
> > > > (s: same page w: written page to backing store h: huge page) of the
> > > > block state. Second column represents usec time unit of the block
> > > > was last accessed. So above example means the 300th block is accessed
> > > > at 75.033851 second and it was huge so it was written to the backing
> > > > store.
> > > > 
> > > > Admin can leverage this information to catch cold|incompressible pages
> > > > of process with *pagemap* once part of heaps are swapped out.
> > > 
> > > A few things..
> > > 
> > > - Terms like "Admin can" and "Admin could" are worrisome.  How do we
> > >   know that admins *will* use this?  How do we know that we aren't
> > >   adding a bunch of stuff which nobody will find to be (sufficiently)
> > >   useful?  For example, is there some userspace tool to which you are
> > >   contributing which will be updated to use this feature?
> > 
> > Actually, I used this feature two years ago to find memory hogger
> > although the feature was very fast prototyping. It was very useful
> > to reduce memory cost in embedded space.
> > 
> > The reason I am trying to upstream the feature is I need the feature
> > again. :)
> > 
> > Yub, I have a userspace tool to use the feature although it was
> > not compatible with this new version. It should be updated with
> > new format. I will find a time to submit the tool.
> 
> hm, OK, can we get this info into the changelog?  

No problem. I will add as follows,

"I used the feature a few years ago to find memory hoggers in userspace
and to show them what memory they had wasted by leaving it untouched for a
long time. With it, they could reduce unnecessary memory usage. At that time
I hacked up zram for the feature, but now that I need the feature again
I decided it would be better to upstream it rather than keep it private.
I hope to submit the userspace tool that uses the feature soon"

> 
> > > 
> > > - block_state's second column is in microseconds since some
> > >   undocumented time.  But how is userspace to know how much time has
> > >   elapsed since the access?  ie, "current time".
> > 
> > It's a sched_clock so it should be elapsed time since the system boot.
> > I should have written it explictly.
> > I will fix it.
> > 
> > > 
> > > - Is the sched_clock() return value suitable for exporting to
> > >   userspace?  Is it monotonic?  Is it consistent across CPUs, across
> > >   CPU hotadd/remove, across suspend/resume, etc?  Does it run all the
> > >   way up to 2^64 on all CPU types, or will some processors wrap it at
> > >   (say) 32 bits?  etcetera.  Documentation/timers/timekeeping.txt
> > >   points out that suspend/resume can mess it up and that the counter
> > >   can drift between cpus.
> > 
> > Good point!
> > 
> > I just referenced it from ftrace because I thought the goal is similiar
> > "no need to be exact unless the drift is frequent but wanted to be fast"
> > 
> > AFAIK, ftrace/printk is active user of the function so if the problem
> > happens frequently, it might be serious. :)
> 
> It could be that ktime_get() is a better fit here - especially if
> sched_clock() goes nuts after resume.  Unfortunately ktime_get()
> appears to be totally undocumented :(
> 

I will use ktime_get_boottime(). With it, zram is not damaged by
suspend/resume and the code would be simpler and clearer. For users, it
would be more straightforward to parse the time.

Thanks for good suggestion, Andrew!
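
As a rough idea of what the userspace side could look like, here is a minimal sketch that parses one block_state line. It assumes the whitespace-separated layout described in the changelog (block index, then a seconds.microseconds timestamp with exactly six fractional digits, then a three-character state field). The struct and function names are hypothetical and are not part of zram or of any existing tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of block_state output. */
struct block_state {
	unsigned long index;      /* zram block index */
	unsigned long long usec;  /* last access time, in microseconds */
	bool same;                /* 's': same-filled page */
	bool written;             /* 'w': written to the backing store */
	bool huge;                /* 'h': huge page */
};

/*
 * Parse "<index> <sec>.<usec> <flags>", e.g. "300 75.033841 .wh".
 * Assumes the fractional part has exactly six digits.
 * Returns true on success, false if the line does not match.
 */
static bool parse_block_state(const char *line, struct block_state *out)
{
	unsigned long sec, frac;
	char flags[4];

	assert(line != NULL && out != NULL);

	if (sscanf(line, "%lu %lu.%6lu %3s", &out->index, &sec, &frac, flags) != 4)
		return false;
	if (strlen(flags) != 3)
		return false;

	out->usec = (unsigned long long)sec * 1000000ULL + frac;
	out->same = (flags[0] == 's');
	out->written = (flags[1] == 'w');
	out->huge = (flags[2] == 'h');
	return true;
}
```

A tool could then subtract the parsed timestamp from the current boot-time clock reading to see how long each block has been idle, provided both use the same clock.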

RE: [PATCH v2] usb: chipidea: Hook into mux framework to toggle usb switch

2018-04-19 Thread Peter Chen

 
 
> --- a/drivers/usb/chipidea/Kconfig
> +++ b/drivers/usb/chipidea/Kconfig
> @@ -3,6 +3,8 @@ config USB_CHIPIDEA
>   depends on ((USB_EHCI_HCD && USB_GADGET) || (USB_EHCI_HCD
> && !USB_GADGET) || (!USB_EHCI_HCD && USB_GADGET)) && HAS_DMA
>   select EXTCON
>   select RESET_CONTROLLER
> + select MULTIPLEXER
> + select MUX_GPIO

The above two configurations are only used on your specific platforms; please
add them to either your platform defconfig or the related hardware driver's
Kconfig.

>   help
> Say Y here if your system has a dual role high speed USB
> controller based on ChipIdea silicon IP. It supports:
> diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c index
> 33ae87f..8fa0991 100644
> --- a/drivers/usb/chipidea/core.c
> +++ b/drivers/usb/chipidea/core.c
> @@ -61,6 +61,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "ci.h"
>  #include "udc.h"
> @@ -687,6 +688,10 @@ static int ci_get_platdata(struct device *dev,
>   if (of_find_property(dev->of_node, "non-zero-ttctrl-ttha", NULL))
>   platdata->flags |= CI_HDRC_SET_NON_ZERO_TTHA;
> 
> + platdata->usb_switch = devm_mux_control_get_optional(dev, "usb_switch");
> + if (IS_ERR(platdata->usb_switch))
> + return PTR_ERR(platdata->usb_switch);
> +
>   ext_id = ERR_PTR(-ENODEV);
>   ext_vbus = ERR_PTR(-ENODEV);
>   if (of_property_read_bool(dev->of_node, "extcon")) { diff --git
> a/drivers/usb/chipidea/host.c b/drivers/usb/chipidea/host.c index 
> af45aa32..d9d2d00
> 100644
> --- a/drivers/usb/chipidea/host.c
> +++ b/drivers/usb/chipidea/host.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "../host/ehci.h"
> 
> @@ -161,6 +162,10 @@ static int host_start(struct ci_hdrc *ci)
>   if (ci_otg_is_fsm_mode(ci)) {
>   otg->host = &hcd->self;
>   hcd->self.otg_port = 1;
> + } else {
> + ret = mux_control_select(ci->platdata->usb_switch, 1);
> + if (ret)
> + goto disable_reg;

What will happen if ci->platdata->usb_switch  is NULL?

>   }
>   }
> 
> @@ -181,6 +186,8 @@ static void host_stop(struct ci_hdrc *ci)
>   struct usb_hcd *hcd = ci->hcd;
> 
>   if (hcd) {
> + if (!ci_otg_is_fsm_mode(ci))
> + mux_control_deselect(ci->platdata->usb_switch);

Ditto.

>   if (ci->platdata->notify_event)
>   ci->platdata->notify_event(ci,
>   CI_HDRC_CONTROLLER_STOPPED_EVENT);
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c index
> 9852ec5..209d3f6 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "ci.h"
>  #include "udc.h"
> @@ -1965,16 +1966,26 @@ void ci_hdrc_gadget_destroy(struct ci_hdrc *ci)
> 
>  static int udc_id_switch_for_device(struct ci_hdrc *ci)  {
> + int ret = 0;
> +
>   if (ci->is_otg)
>   /* Clear and enable BSV irq */
>   hw_write_otgsc(ci, OTGSC_BSVIS | OTGSC_BSVIE,
>   OTGSC_BSVIS | OTGSC_BSVIE);
> 
> - return 0;
> + if (!ci_otg_is_fsm_mode(ci))
> + ret = mux_control_select(ci->platdata->usb_switch, 0);
> +

Ditto

> + if (ci->is_otg && ret)
> + hw_write_otgsc(ci, OTGSC_BSVIE | OTGSC_BSVIS,
> OTGSC_BSVIS);
> +
> + return ret;
>  }
> 
>  static void udc_id_switch_for_host(struct ci_hdrc *ci)  {
> + mux_control_deselect(ci->platdata->usb_switch);
> +
>   /*
>* host doesn't care B_SESSION_VALID event
>* so clear and disbale BSV irq
> diff --git a/include/linux/usb/chipidea.h b/include/linux/usb/chipidea.h index
> 07f9936..9ea55a1 100644
> --- a/include/linux/usb/chipidea.h
> +++ b/include/linux/usb/chipidea.h
> @@ -10,6 +10,7 @@
>  #include 
> 
>  struct ci_hdrc;
> +struct mux_control;
> 
>  /**
>   * struct ci_hdrc_cable - structure for external connector cable state 
> tracking @@ -
> 76,6 +77,7 @@ struct ci_hdrc_platform_data {
>   /* VBUS and ID signal state tracking, using extcon framework */
>   struct ci_hdrc_cablevbus_extcon;
>   struct ci_hdrc_cableid_extcon;
> + struct mux_control  *usb_switch;
>   u32 phy_clkgate_delay_us;
 
If CONFIG_USB_CHIPIDEA_HOST is not defined, it may cause a build error.

Peter

[PATCH] f2fs: sepearte hot/cold in free nid

2018-04-19 Thread Chao Yu

Most indirect, dindirect, and xattr nodes are not updated after they are
created, while inode nodes and other direct nodes change more frequently.
Storing their nat entries mixed together in the whole nat table therefore
causes:
- the nat table to fragment quickly due to the different update rates
- more nat block updates due to the fragmented nat table

To solve the above issue, we separate the whole nat table into
two parts:
a. Hot free nid area:
 - range: [nid #0, nid #x)
 - store node block address for
   * inode node
   * other direct node
b. Cold free nid area:
 - range: [nid #x, max nid)
 - store node block address for
   * indirect node
   * dindirect node
   * xattr node
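
The split above can be sketched roughly as follows, assuming a single boundary nid x between the two areas. The enum and helper names are hypothetical, chosen for illustration, and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int nid_t;

/* Illustrative node kinds, following the list above. */
enum node_kind {
	NODE_INODE,
	NODE_DIRECT,
	NODE_INDIRECT,
	NODE_DINDIRECT,
	NODE_XATTR,
};

enum nid_area { AREA_HOT, AREA_COLD };

/* Frequently updated nodes go to the hot area, the rest to the cold area. */
static enum nid_area area_for_node(enum node_kind kind)
{
	switch (kind) {
	case NODE_INODE:
	case NODE_DIRECT:
		return AREA_HOT;
	default:
		return AREA_COLD;
	}
}

/* True if nid lies in the range reserved for area: [0, x) hot, [x, max) cold. */
static bool nid_in_area(nid_t nid, enum nid_area area, nid_t boundary,
			nid_t max_nid)
{
	assert(boundary <= max_nid);

	if (area == AREA_HOT)
		return nid < boundary;
	return nid >= boundary && nid < max_nid;
}
```

An allocator built on this would pick a free nid for which nid_in_area(nid, area_for_node(kind), x, max_nid) holds.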

Allocation strategy example:

Free nid: '-'
Used nid: '='

1. Initial status:
Free Nids:      |---|
                ^   ^   ^   ^
Alloc Range:    |---|   |---|
                hot_start   hot_end   cold_start   cold_end

2. Free nids have run out:
Free Nids:      |===-===|
                ^   ^   ^   ^
Alloc Range:    |===|   |===|
                hot_start   hot_end   cold_start   cold_end

3. Expand hot/cold area range:
Free Nids:      |===-===|
                ^   ^   ^   ^
Alloc Range:    |===|   |===|
                hot_start   hot_end   cold_start   cold_end

4. Hot free nids have run out:
Free Nids:      |===-===|
                ^   ^   ^   ^
Alloc Range:    |===|   |===|
                hot_start   hot_end   cold_start   cold_end

5. Expand hot area range, hot/cold area boundary has been fixed:
Free Nids:      |===-===|
                ^   ^   ^
Alloc Range:    |===|===|
                hot_start   hot_end(cold_start)   cold_end

Run xfstests with generic/*:

before:
node_write:     169660
cp_count:       60118
node/cp:        2.82

after:
node_write:     159145
cp_count:       84501
node/cp:        2.64

Signed-off-by: Chao Yu 
---
 fs/f2fs/checkpoint.c |   4 -
 fs/f2fs/debug.c  |   6 +-
 fs/f2fs/f2fs.h   |  19 +++-
 fs/f2fs/inode.c  |   2 +-
 fs/f2fs/namei.c  |   2 +-
 fs/f2fs/node.c   | 302 ---
 fs/f2fs/node.h   |  17 +--
 fs/f2fs/segment.c|   8 +-
 fs/f2fs/shrinker.c   |   3 +-
 fs/f2fs/xattr.c  |  10 +-
 10 files changed, 221 insertions(+), 152 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 96785ffc6181..c17feec72c74 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1029,14 +1029,10 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
 static void __prepare_cp_block(struct f2fs_sb_info *sbi)
 {
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
-   struct f2fs_nm_info *nm_i = NM_I(sbi);
-   nid_t last_nid = nm_i->next_scan_nid;
 
-   next_free_nid(sbi, &last_nid);
ckpt->valid_block_count = cpu_to_le64(valid_user_blocks(sbi));
ckpt->valid_node_count = cpu_to_le32(valid_node_count(sbi));
ckpt->valid_inode_count = cpu_to_le32(valid_inode_count(sbi));
-   ckpt->next_free_nid = cpu_to_le32(last_nid);
 }
 
 /*
diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 7bb036a3bb81..b13c1d4f110f 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -100,7 +100,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->dirty_nats = NM_I(sbi)->dirty_nat_cnt;
si->sits = MAIN_SEGS(sbi);
si->dirty_sits = SIT_I(sbi)->dirty_sentries;
-   si->free_nids = NM_I(sbi)->nid_cnt[FREE_NID];
+   si->free_nids = NM_I(sbi)->nid_cnt[FREE_HOT_NID] +
+   NM_I(sbi)->nid_cnt[FREE_COLD_NID];
si->avail_nids = NM_I(sbi)->available_nids;
si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
si->bg_gc = sbi->bg_gc;
@@ -235,7 +236,8 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
}
 
/* free nids */
-   si->cache_mem += (NM_I(sbi)->nid_cnt[FREE_NID] +
+   si->cac

Re: [PATCHv2] printk: wake up klogd in vprintk_emit

2018-04-19 Thread Sergey Senozhatsky

On (04/19/18 12:02), Petr Mladek wrote:
> On Thu 2018-04-19 10:42:50, Sergey Senozhatsky wrote:
> > We wake up klogd very late - only when current console_sem owner
> > is done pushing pending kernel messages to the serial/net consoles.
> > In some cases this results in lost syslog messages, because kernel
> > log buffer is a circular buffer and if we don't wakeup syslog long
> > enough there are chances that logbuf simply will wrap around.
> > 
> > The patch moves the klog wake up call to vprintk_emit(), which is
> > the only legit way for a kernel message to appear in the logbuf,
> > right before we attempt to grab the console_sem (possibly spinning
> > on it waiting for the hand off) and call console drivers.
> 
> The last two lines need an update. What about?

Ah. Indeed!

> "right after the attempt to handle consoles. As a result, klog
> will get waken either after flushing the new message to consoles
> or immediately when consoles are still busy with older messages."

Looks good. Do you want me to resend the patch?

> Otherwise, it looks nice:
> 
> Reviewed-by: Petr Mladek 

Thanks.

-ss

Re: [PATCH] time: tick-sched: use bool for tick_stopped

2018-04-19 Thread yuankuiz

On 2018-04-11 07:20 AM, yuank...@codeaurora.org wrote:

++
On 2018-04-11 07:09 AM, yuank...@codeaurora.org wrote:

++

On 2018-04-10 10:49 PM, yuank...@codeaurora.org wrote:

Typo...

On 2018-04-10 10:08 PM, yuank...@codeaurora.org wrote:

On 2018-04-10 07:06 PM, Thomas Gleixner wrote:

On Tue, 10 Apr 2018, yuank...@codeaurora.org wrote:

On 2018-04-10 05:10 PM, Thomas Gleixner wrote:
> On Tue, 10 Apr 2018, yuank...@codeaurora.org wrote:
> > On 2018-04-10 04:00 PM, Rafael J. Wysocki wrote:
> > > On Tue, Apr 10, 2018 at 9:33 AM,   wrote:
> > > > From: John Zhao 
> > > >
> > > > Variable tick_stopped returned by tick_nohz_tick_stopped
> > > > can have only true / false values. Since the return type
> > > > of the tick_nohz_tick_stopped is also bool, variable
> > > > tick_stopped nice to have data type as bool in place of unsigned int.
> > > > Moreover, the executed instructions cost could be minimal
> > > > without potiential data type conversion.
> > > >
> > > > Signed-off-by: John Zhao 
> > > > ---
> > > >  kernel/time/tick-sched.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
> > > > index 6de959a..4d34309 100644
> > > > --- a/kernel/time/tick-sched.h
> > > > +++ b/kernel/time/tick-sched.h
> > > > @@ -48,8 +48,8 @@ struct tick_sched {
> > > > unsigned long   check_clocks;
> > > > enum tick_nohz_mode nohz_mode;
> > > >
> > > > +   booltick_stopped: 1;
> > > > unsigned intinidle  : 1;
> > > > -   unsigned inttick_stopped: 1;
> > > > unsigned intidle_active : 1;
> > > > unsigned intdo_timer_last   : 1;
> > > > unsigned intgot_idle_tick   : 1;
> > >
> > > I don't think this is a good idea at all.
> > >
> > > Please see https://lkml.org/lkml/2017/11/21/384 for example.
> > [ZJ] Thanks for this sharing. Looks like, this patch fall into the case of
> > "Maybe".
>
> This patch falls into the case 'pointless' because it adds extra storage
[ZJ] 1 bit vs 1 bit. no more.

Groan. No. Care to look at the data structure? You create a new storage,

[ZJ] Say, {unsigned int, unsigned int, unsigned int, unsigned int,
unsigned int} becomes
  {bool, unsigned int, unsigned int, unsigned int, unsigned int}

As specified by rule No. 10 in section 6.7.2.1 of C99 TC2:
"If enough space remains, a bit-field that immediately follows another
bit-field in a structure shall be packed into adjacent bits of the same
unit." What is the new storage so far?
[ZJ] Further prototyping was done with gcc for both x86_64 and armv8-a:
 unsigned int and bool share the same 4 bytes without any additional storage.
 I am open to hearing about any different behaviour that can be captured.

which is incidentally merged into the other bitfield by the compiler at a
different bit position, but there is no guarantee that a compiler does
that. It's free to use distinct storage for that bool based bit.

[ZJ] Per rule No. 10 in section 6.7.2.1 of C99 TC2:
"If insufficient space remains, whether a bit-field that does not fit is
put into the next unit or overlaps adjacent units is implementation-defined."

So the implementation does not depend on which type is stored, if any.

>> > for no benefit at all.
[ZJ] tick_stopped is returned by tick_nohz_tick_stopped(), which is bool.
The benefit is that no potential type conversion needs to be considered.

A bit stays a bit. 'bool foo : 1;' or 'unsigned int foo : 1' has to be
evaluated as a bit. So there is a type conversion from BIT to bool required
because BIT != bool.

[ZJ] Per rule No. 9 in section 6.7.2.1 of C99 TC2:
"If the value 0 or 1 is stored into a nonzero-width bit-field of type
_Bool, the value of the bit-field shall compare equal to the value stored."

Obviously, it is actually unrelated to type conversion.
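
To make the value semantics being debated concrete, here is a small standalone sketch. It is an illustration, not the evaluation program used in this thread: the struct layouts mirror the two-field example given further down, and the helper names are made up. It checks only what the C standard guarantees about stored values. Storage size and bit placement are implementation-defined and deliberately not asserted.

```c
#include <assert.h>
#include <stdbool.h>

/* Both flags as unsigned int bit-fields. */
struct flags_uint {
	unsigned int inidle       : 1;
	unsigned int tick_stopped : 1;
};

/* Same layout, but tick_stopped declared as a bool bit-field. */
struct flags_bool {
	unsigned int inidle       : 1;
	bool         tick_stopped : 1;
};

/* Store v into the unsigned int bit-field and read it back. */
static bool read_back_uint(unsigned int v)
{
	struct flags_uint f = { 0 };

	f.tick_stopped = v;  /* only the low bit of v is kept */
	return f.tick_stopped;
}

/* Store v into the bool bit-field and read it back. */
static bool read_back_bool(unsigned int v)
{
	struct flags_bool f = { 0 };

	f.tick_stopped = v;  /* v is converted to _Bool first: nonzero becomes 1 */
	return f.tick_stopped;
}

/* Self-check of the two behaviours. */
static void check_bitfield_semantics(void)
{
	assert(read_back_uint(1) && read_back_bool(1));
	assert(!read_back_uint(2));  /* 2 is truncated to 0 */
	assert(read_back_bool(2));   /* 2 is normalised to 1 */
}
```

Storing 0 or 1 behaves identically for both declarations. The two differ only for other values, which is why the unsigned int variant needs an explicit !! at assignment sites that may pass larger values.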

By chance the evaluation can be done by evaluating the byte in which the
bit is placed just because the compiler knows that the remaining bits are
not used. There is no guarantee that this is done, it happens to be true
for a particular compiler.
[ZJ] Actually, compilers such as GCC provide that kind of guarantee through
the ABI.

But that does not make it any more interesting. It just makes the code
harder to read and eventually leads to bigger storage.

[ZJ] To profile the benefit, it is given as:
number of instructions of function tick_nohz_tick_stopped():
[ZJ] Here, what I used is not tick_nohz_tick_stopped(), but an
evaluation() as:

#include <stdbool.h>
#include 

struct tick_sched {
unsigned int inidle : 1;
unsigned int tick_stopped   : 1;
};

bool get_status()
{
struct tick_sched *ts;
ts->tick_stopped = 1;

Re: [PATCH v5 11/23] ASoC: qdsp6: q6adm: Add q6adm driver

2018-04-19 Thread kbuild test robot

Hi Srinivas,

I love your patch! Yet something to improve:

[auto build test ERROR on asoc/for-next]
[also build test ERROR on v4.17-rc1 next-20180419]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/srinivas-kandagatla-linaro-org/ASoC-qcom-Add-support-to-QDSP-based-Audio/20180419-212438
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
config: arm-allyesconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

Note: the linux-review/srinivas-kandagatla-linaro-org/ASoC-qcom-Add-support-to-QDSP-based-Audio/20180419-212438 HEAD b636a6bdb47d4dfaa6bc43f2ef431f5426b51581 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

>> sound/soc/qcom/qdsp6/q6adm.c:21:10: fatal error: q6core.h: No such file or directory
    #include "q6core.h"
             ^~~~~~~~~~
   compilation terminated.
vim +21 sound/soc/qcom/qdsp6/q6adm.c

 4  
 5  #include 
 6  #include 
 7  #include 
 8  #include 
 9  #include 
10  #include 
11  #include 
12  #include 
13  #include 
14  #include 
15  #include 
16  #include 
17  #include 
18  #include 
19  #include "q6adm.h"
20  #include "q6afe.h"
  > 21  #include "q6core.h"
22  #include "q6dsp-errno.h"
23  #include "q6dsp-common.h"
24  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation



Re: [PATCH v2 08/10] dt-bindings: media: Document bindings for the Sunxi-Cedrus VPU driver

2018-04-19 Thread Tomasz Figa

Hi Paul, Philipp,

On Fri, Apr 20, 2018 at 1:04 AM Philipp Zabel wrote:

> Hi Paul,

> On Thu, 2018-04-19 at 17:45 +0200, Paul Kocialkowski wrote:
> > This adds a device-tree binding document that specifies the properties
> > used by the Sunxi-Cedurs VPU driver, as well as examples.
> >
> > Signed-off-by: Paul Kocialkowski 
> > ---
> >  .../devicetree/bindings/media/sunxi-cedrus.txt | 50 ++
> >  1 file changed, 50 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/media/sunxi-cedrus.txt
> >
> > diff --git a/Documentation/devicetree/bindings/media/sunxi-cedrus.txt b/Documentation/devicetree/bindings/media/sunxi-cedrus.txt
> > new file mode 100644
> > index ..71ad3f9c3352
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/media/sunxi-cedrus.txt
> > @@ -0,0 +1,50 @@
> > +Device-tree bindings for the VPU found in Allwinner SoCs, referred to as the
> > +Video Engine (VE) in Allwinner literature.
> > +
> > +The VPU can only access the first 256 MiB of DRAM, that are DMA-mapped starting
> > +from the DRAM base. This requires specific memory allocation and handling.

And no IOMMU? Brings back memories.

> > +
> > +Required properties:
> > +- compatible : "allwinner,sun4i-a10-video-engine";
> > +- memory-region : DMA pool for buffers allocation;
> > +- clocks : list of clock specifiers, corresponding to entries in
> > +  the clock-names property;
> > +- clock-names: should contain "ahb", "mod" and "ram" entries;
> > +- assigned-clocks   : list of clocks assigned to the VE;
> > +- assigned-clocks-rates : list of clock rates for the clocks assigned to the VE;
> > +- resets : phandle for reset;
> > +- interrupts : should contain VE interrupt number;
> > +- reg: should contain register base and length of VE.
> > +
> > +Example:
> > +
> > +reserved-memory {
> > + #address-cells = <1>;
> > + #size-cells = <1>;
> > + ranges;
> > +
> > + /* Address must be kept in the lower 256 MiBs of DRAM for VE. */
> > + ve_memory: cma@4a00 {
> > + compatible = "shared-dma-pool";
> > + reg = <0x4a00 0x600>;
> > + no-map;
> > + linux,cma-default;
> > + };
> > +};
> > +
> > +video-engine@1c0e000 {

> This is not really required by any specification, and not as common as
> gpu@..., but could this reasonably be called "vpu@1c0e000" to follow
> somewhat-common practice?

AFAIR the name is supposed to be somewhat readable for someone that doesn't
know the hardware. To me, "video-engine" sounds more obvious than "vpu",
but we actually use "codec" already, in case of MFC and JPEG codec on
Exynos. If encode/decode is the only functionality of this block, I'd
personally go with "codec". If it can do other things, e.g.
scaling/rotation without encode/decode, I'd probably call it
"video-processor".

Best regards,
Tomasz

RE: [PATCH resend] usb: chipidea: Don't select EXTCON

2018-04-19 Thread Peter Chen

 
>  drivers/usb/chipidea/Kconfig | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/usb/chipidea/Kconfig b/drivers/usb/chipidea/Kconfig index
> 785f0ed037f7..97509172d536 100644
> --- a/drivers/usb/chipidea/Kconfig
> +++ b/drivers/usb/chipidea/Kconfig
> @@ -1,7 +1,6 @@
>  config USB_CHIPIDEA
>   tristate "ChipIdea Highspeed Dual Role Controller"
>   depends on ((USB_EHCI_HCD && USB_GADGET) || (USB_EHCI_HCD
> && !USB_GADGET) || (!USB_EHCI_HCD && USB_GADGET)) && HAS_DMA
> - select EXTCON
>   select RESET_CONTROLLER
>   help
> Say Y here if your system has a dual role high speed USB
> --
> 2.17.0

Hi Jisheng,

Sorry for the late reply. Do you really care about 2KB of code size? Since many
users use EXTCON to handle vbus and id, it is hard to just delete it. I could
accept a patch for your specific platforms, like:

+   select EXTCON if !ARCH_

But please note, even if your board uses the SoC id/vbus pin to detect the related
external signal, other boards that use the same SoC may use external gpios to do it.

Peter

Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-19 Thread yuankuiz


On 2018-04-19 06:42 PM, yuank...@codeaurora.org wrote:

On 2018-04-19 02:48 PM, yuank...@codeaurora.org wrote:

On 2018-04-19 01:16 PM, Julia Lawall wrote:

On Wed, 18 Apr 2018, Joe Perches wrote:


On Thu, 2018-04-19 at 06:40 +0200, Julia Lawall wrote:
>
> On Wed, 18 Apr 2018, Joe Perches wrote:
>
> > On Tue, 2018-04-17 at 17:07 +0800, yuank...@codeaurora.org wrote:
> > > Hi julia,
> > >
> > > On 2018-04-15 05:19 AM, Julia Lawall wrote:
> > > > On Wed, 11 Apr 2018, Joe Perches wrote:
> > > >
> > > > > On Thu, 2018-04-12 at 08:22 +0200, Julia Lawall wrote:
> > > > > > On Wed, 11 Apr 2018, Joe Perches wrote:
> > > > > > > On Wed, 2018-04-11 at 09:29 -0700, Andrew Morton wrote:
> > > > > > > > We already have some 500 bools-in-structs
> > > > > > >
> > > > > > > I got at least triple that only in include/
> > > > > > > > so I expect there are probably an order
> > > > > > > of magnitude more than 500 in the kernel.
> > > > > > >
> > > > > > > I suppose some cocci script could count the
> > > > > > > actual number of instances.  A regex can not.
> > > > > >
> > > > > > I got 12667.
> > > > >
> > > > > Could you please post the cocci script?
> > > > >
> > > > > > > I'm not sure to understand the issue.  Will using a bitfield help if there
> > > > > > are no other bitfields in the structure?
> > > > >
> > > > > IMO, not really.
> > > > >
> > > > > The primary issue is described by Linus here:
> > > > > https://lkml.org/lkml/2017/11/21/384
> > > > >
> > > > > I personally do not find a significant issue with
> > > > > uncontrolled sizes of bool in kernel structs as
> > > > > all of the kernel structs are transitory and not
> > > > > written out to storage.
> > > > >
> > > > > I suppose bool bitfields are also OK, but for the
> > > > > RMW required.
> > > > >
> > > > > Using unsigned int :1 bitfield instead of bool :1
> > > > > has the negative of truncation so that the uint
> > > > > has to be set with !! instead of a simple assign.
> > > >
> > > > At least with gcc 5.4.0, a number of structures become larger with
> > > > unsigned int :1. bool:1 seems to mostly solve this problem.  The structure
> > > > ichx_desc, defined in drivers/gpio/gpio-ich.c seems to become larger with
> > > > both approaches.
> > >
> > > [ZJ] Hopefully, this could make it better in your environment.
> > >   IMHO, this is just for double check.
> >
> > I doubt this is actually better or smaller code.
> >
> > Check the actual object code using objdump and the
> > struct alignment using pahole.
>
> I didn't have a chance to try it, but it looks quite likely to result in a
> smaller data structure based on the other examples that I looked at.

I _really_ doubt there is any difference in size between the
below in any architecture

struct foo {
int bar;
bool baz:1;
int qux;
};

and

struct foo {
int bar;
bool baz;
int qux;
};

Where there would be a difference in size is

struct foo {
int bar;
bool baz1:1;
bool baz2:1;
int qux;
};

and

struct foo {
int bar;
bool baz1;
bool baz2;

int qux;
};
[ZJ] Even though the two bool:1 members are grouped in #3, it still ends up
 padded to a 4-byte boundary because int is the largest member type.
 At least on x86 and arm with gcc, #3 and #4 are the same size (12 bytes).
[ZJ] However, #3 could differ from #4 on a target where sizeof(_Bool)
 is larger (4 bytes may be possible on Alpha EV45, for example).
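
The four layouts discussed above can be compared directly. This is a minimal
sketch, assuming a typical gcc or clang ABI; bitfield layout is
implementation-defined, so other compilers may differ. The five_* structs are
a hypothetical extension that is not from this thread; they show a case where
the plain bools no longer fit in the padding in front of qux.

```c
#include <stdbool.h>
#include <stdio.h>

/* #1 and #2: a single bool between two ints. */
struct one_bitfield { int bar; bool baz:1; int qux; };
struct one_plain    { int bar; bool baz;   int qux; };

/* #3 and #4: two adjacent bools between two ints. */
struct two_bitfields { int bar; bool baz1:1; bool baz2:1; int qux; };
struct two_plain     { int bar; bool baz1;   bool baz2;   int qux; };

/* Hypothetical: five bools, more than the alignment padding can absorb. */
struct five_bitfields {
	int bar;
	bool b1:1, b2:1, b3:1, b4:1, b5:1;
	int qux;
};
struct five_plain {
	int bar;
	bool b1, b2, b3, b4, b5;
	int qux;
};

/* Print every size so the layouts can be compared on the local target. */
void print_struct_sizes(void)
{
	printf("one:  bitfield=%zu plain=%zu\n",
	       sizeof(struct one_bitfield), sizeof(struct one_plain));
	printf("two:  bitfield=%zu plain=%zu\n",
	       sizeof(struct two_bitfields), sizeof(struct two_plain));
	printf("five: bitfield=%zu plain=%zu\n",
	       sizeof(struct five_bitfields), sizeof(struct five_plain));
}
```

pahole on the real object files remains the authoritative check; this only
shows where a difference can start to appear.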


In the situation of the example there are two bools together in the middle
of the structure and one at the end.  Somehow, even converting to bool:1
increases the size.  But it seems plausible that putting all three bools
together and converting them all to :1 would reduce the size.  I don't
know.  The size increase (more than 8 bytes) seems out of proportion for 3
bools.

[ZJ] Typically, the additional saving is due to different padding.


I was able to check around 3000 structures that were not declared with any
attributes, that don't declare named types internally, and that are
compiled for x86.  Around 10% become smaller when using bool:1, typically
by at most 8 bytes.

[ZJ] As in my example, int (*)() takes 8 bytes on x86, so a saving of 8
 bytes is consistent with that, while it takes 4 bytes on arm. Typically, my
 previous example struct comes to 32 bytes on x86 (compared to 40 bytes for
 the original version) and, similarly, 20 bytes on arm (compared to 24 bytes
 for the original version).


julia

Re: [RFC PATCH ghak32 V2 06/13] audit: add support for non-syscall auxiliary records

2018-04-19 Thread Richard Guy Briggs

On 2018-04-18 20:39, Paul Moore wrote:
> On Fri, Mar 16, 2018 at 5:00 AM, Richard Guy Briggs  wrote:
> > Standalone audit records have the timestamp and serial number generated
> > on the fly and as such are unique, making them standalone.  This new
> > function audit_alloc_local() generates a local audit context that will
> > be used only for a standalone record and its auxiliary record(s).  The
> > context is discarded immediately after the local associated records are
> > produced.
> >
> > Signed-off-by: Richard Guy Briggs 
> > ---
> >  include/linux/audit.h |  8 
> >  kernel/auditsc.c  | 20 +++-
> >  2 files changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index ed16bb6..c0b83cb 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -227,7 +227,9 @@ static inline int audit_log_container_info(struct audit_context *context,
> >  /* These are defined in auditsc.c */
> > /* Public API */
> >  extern int  audit_alloc(struct task_struct *task);
> > +extern struct audit_context *audit_alloc_local(void);
> >  extern void __audit_free(struct task_struct *task);
> > +extern void audit_free_context(struct audit_context *context);
> >  extern void __audit_syscall_entry(int major, unsigned long a0, unsigned long a1,
> >   unsigned long a2, unsigned long a3);
> >  extern void __audit_syscall_exit(int ret_success, long ret_value);
> > @@ -472,6 +474,12 @@ static inline int audit_alloc(struct task_struct *task)
> >  {
> > return 0;
> >  }
> > +static inline struct audit_context *audit_alloc_local(void)
> > +{
> > +   return NULL;
> > +}
> > +static inline void audit_free_context(struct audit_context *context)
> > +{ }
> >  static inline void audit_free(struct task_struct *task)
> >  { }
> >  static inline void audit_syscall_entry(int major, unsigned long a0,
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 2932ef1..7103d23 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -959,8 +959,26 @@ int audit_alloc(struct task_struct *tsk)
> > return 0;
> >  }
> >
> > -static inline void audit_free_context(struct audit_context *context)
> > +struct audit_context *audit_alloc_local(void)
> >  {
> > +   struct audit_context *context;
> > +
> > +   if (!audit_ever_enabled)
> > +   return NULL; /* Return if not auditing. */
> > +
> > +   context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
> > +   if (!context)
> > +   return NULL;
> > +   context->serial = audit_serial();
> > +   context->ctime = current_kernel_time64();
> > +   context->in_syscall = 1;
> > +   return context;
> > +}
> > +
> > +inline void audit_free_context(struct audit_context *context)
> > +{
> > +   if (!context)
> > +   return;
> > audit_free_names(context);
> > unroll_tree_refs(context, NULL, 0);
> > free_tree_refs(context);
> 
> I'm reserving the option to comment on this idea further as I make my
> way through the patchset, but audit_free_context() definitely
> shouldn't be declared as an inline function.

Ok, I think I follow.  When it wasn't exported, inline was fine, but now
that it has been exported, it should no longer be inlined, or should use
an intermediate function name to export so that local uses of it can
remain inline.
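
The second option could look roughly like the following userspace sketch. All
names here (demo_context, demo_free_context_inline, demo_free_context,
demo_alloc_local, demo_free_count) are hypothetical stand-ins, not the audit
code; the point is only that the file-local helper stays static inline while
a plain wrapper carries the external linkage.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a context object; not struct audit_context. */
struct demo_context {
	int *names;
};

/* Counts completed frees so the behaviour can be observed from outside. */
int demo_free_count;

/* File-local fast path: callers in this file can still have it inlined. */
static inline void demo_free_context_inline(struct demo_context *ctx)
{
	if (!ctx)
		return;		/* tolerate NULL, like the patched function */
	free(ctx->names);
	free(ctx);
	demo_free_count++;
}

/* Externally visible entry point: a normal, non-inline function. */
void demo_free_context(struct demo_context *ctx)
{
	demo_free_context_inline(ctx);
}

/* Allocate a zeroed context; returns NULL on allocation failure. */
struct demo_context *demo_alloc_local(void)
{
	return calloc(1, sizeof(struct demo_context));
}
```

Simply dropping the inline keyword is the smaller change; the wrapper only
pays off if local callers are hot enough to care.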

> paul moore

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

[PATCH] arm64: dts: msm8996: Add modem remoteproc

2018-04-19 Thread Bjorn Andersson

Add the modem remoteproc node and the child smd-edge in order to be able
to boot the modem Hexagon found in MSM8996 based devices.

Also extend the tcsr mutex node size, to cover the registers at the end
of the block, used for halting the modem subsystem.

Signed-off-by: Bjorn Andersson 
---

This patch depends on the recent patch adding a few additional clocks to
gcc-msm8996.

 arch/arm64/boot/dts/qcom/msm8996.dtsi | 64 +--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 4010e2f1a177..70009b15ffd1 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -620,7 +620,6 @@
pm8994_l31: l31 {};
pm8994_l32: l32 {};
};
-
};
};
 
@@ -637,7 +636,7 @@
 
tcsr_mutex_regs: syscon@74 {
compatible = "syscon";
-   reg = <0x74 0x2>;
+   reg = <0x74 0x4>;
};
 
tcsr: syscon@7a {
@@ -1850,6 +1849,67 @@
power-domains = <&mmcc VENUS_CORE1_GDSC>;
};
};
+
+   remoteproc@208 {
+   compatible = "qcom,msm8996-mss-pil";
+
+   reg = <0x208 0x100>,
+ <0x218 0x040>;
+   reg-names = "qdsp6", "rmb";
+
+   interrupts-extended = <&intc 0 448 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+   interrupt-names = "wdog", "fatal", "ready",
+ "handover", "stop-ack";
+
+   clocks = <&xo_board>,
+<&gcc GCC_MSS_CFG_AHB_CLK>,
+<&rpmcc RPM_SMD_PCNOC_CLK>,
+<&gcc GCC_MSS_Q6_BIMC_AXI_CLK>,
+<&gcc GCC_BOOT_ROM_AHB_CLK>,
+<&gcc GCC_MSS_GPLL0_DIV_CLK>,
+<&gcc GCC_MSS_SNOC_AXI_CLK>,
+<&gcc GCC_MSS_MNOC_BIMC_AXI_CLK>,
+<&rpmcc RPM_SMD_QDSS_CLK>;
+
+   clock-names = "xo", "iface", "pnoc", "bus",
+ "mem", "gpll0_mss_clk", "snoc_axi_clk",
+ "mnoc_axi_clk", "qdss";
+
+   mx-supply = <&pm8994_s2>;
+   cx-supply = <&pm8994_s1>;
+   pll-supply = <&pm8994_l12>;
+
+   resets = <&gcc GCC_MSS_RESTART>;
+   reset-names = "mss_restart";
+
+   qcom,halt-regs = <&tcsr_mutex_regs 0x23000 0x25000 0x24000>;
+
+   qcom,smem-states = <&modem_smp2p_out 0>;
+   qcom,smem-state-names = "stop";
+
+   status = "disabled";
+
+   mba {
+   memory-region = <&mba_region>;
+   };
+
+   mpss {
+   memory-region = <&mpss_region>;
+   };
+
+   smd-edge {
+   interrupts = <0 449 IRQ_TYPE_EDGE_RISING>;
+
+   label = "modem";
+   mboxes = <&apcs_glb 12>;
+   qcom,smd-edge = <0>;
+   qcom,remote-pid = <1>;
+   };
+   };
};
 
sound: sound {
-- 
2.16.2

[PATCH v3] rpmsg: qcom_smd: Access APCS through mailbox framework

2018-04-19 Thread Bjorn Andersson

Attempt to acquire the APCS IPC through the mailbox framework and fall
back to the old syscon based approach, to allow us to move away from
using the syscon.

Reviewed-by: Arun Kumar Neelakantam 
Signed-off-by: Bjorn Andersson 
---

Changes since v2:
- Added comment about mbox_send_message() return value.

 .../devicetree/bindings/soc/qcom/qcom,smd.txt  |  8 ++-
 drivers/rpmsg/Kconfig  |  1 +
 drivers/rpmsg/qcom_smd.c   | 67 --
 3 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,smd.txt 
b/Documentation/devicetree/bindings/soc/qcom/qcom,smd.txt
index ea1dc75ec9ea..234ae2256501 100644
--- a/Documentation/devicetree/bindings/soc/qcom/qcom,smd.txt
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,smd.txt
@@ -22,9 +22,15 @@ The edge is described by the following properties:
Definition: should specify the IRQ used by the remote processor to
signal this processor about communication related updates
 
-- qcom,ipc:
+- mboxes:
Usage: required
Value type: 
+   Definition: reference to the associated doorbell in APCS, as described
+   in mailbox/mailbox.txt
+
+- qcom,ipc:
+   Usage: required, unless mboxes is specified
+   Value type: 
Definition: three entries specifying the outgoing ipc bit used for
signaling the remote processor:
- phandle to a syscon node representing the apcs registers
diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
index 0fe6eac46512..2e4fb4ffd562 100644
--- a/drivers/rpmsg/Kconfig
+++ b/drivers/rpmsg/Kconfig
@@ -39,6 +39,7 @@ config RPMSG_QCOM_GLINK_SMEM
 
 config RPMSG_QCOM_SMD
tristate "Qualcomm Shared Memory Driver (SMD)"
+   depends on MAILBOX
depends on QCOM_SMEM
select RPMSG
help
diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
index bc0b30657230..3ff271a44bef 100644
--- a/drivers/rpmsg/qcom_smd.c
+++ b/drivers/rpmsg/qcom_smd.c
@@ -14,6 +14,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -107,6 +108,8 @@ static const struct {
  * @ipc_regmap:regmap handle holding the outgoing ipc register
  * @ipc_offset:offset within @ipc_regmap of the register for ipc
  * @ipc_bit:   bit in the register at @ipc_offset of @ipc_regmap
+ * @mbox_client:   mailbox client handle
+ * @mbox_chan: apcs ipc mailbox channel handle
  * @channels:  list of all channels detected on this edge
  * @channels_lock: guard for modifications of @channels
  * @allocated: array of bitmaps representing already allocated channels
@@ -129,6 +132,9 @@ struct qcom_smd_edge {
int ipc_offset;
int ipc_bit;
 
+   struct mbox_client mbox_client;
+   struct mbox_chan *mbox_chan;
+
struct list_head channels;
spinlock_t channels_lock;
 
@@ -366,7 +372,17 @@ static void qcom_smd_signal_channel(struct qcom_smd_channel *channel)
 {
struct qcom_smd_edge *edge = channel->edge;
 
-   regmap_write(edge->ipc_regmap, edge->ipc_offset, BIT(edge->ipc_bit));
+   if (edge->mbox_chan) {
+   /*
+* We can ignore a failing mbox_send_message() as the only
+* possible cause is that the FIFO in the framework is full of
+* other writes to the same bit.
+*/
+   mbox_send_message(edge->mbox_chan, NULL);
+   mbox_client_txdone(edge->mbox_chan, 0);
+   } else {
+   regmap_write(edge->ipc_regmap, edge->ipc_offset, BIT(edge->ipc_bit));
+   }
 }
 
 /*
@@ -1326,27 +1342,37 @@ static int qcom_smd_parse_edge(struct device *dev,
key = "qcom,remote-pid";
of_property_read_u32(node, key, &edge->remote_pid);
 
-   syscon_np = of_parse_phandle(node, "qcom,ipc", 0);
-   if (!syscon_np) {
-   dev_err(dev, "no qcom,ipc node\n");
-   return -ENODEV;
-   }
+   edge->mbox_client.dev = dev;
+   edge->mbox_client.knows_txdone = true;
+   edge->mbox_chan = mbox_request_channel(&edge->mbox_client, 0);
+   if (IS_ERR(edge->mbox_chan)) {
+   if (PTR_ERR(edge->mbox_chan) != -ENODEV)
+   return PTR_ERR(edge->mbox_chan);
 
-   edge->ipc_regmap = syscon_node_to_regmap(syscon_np);
-   if (IS_ERR(edge->ipc_regmap))
-   return PTR_ERR(edge->ipc_regmap);
+   edge->mbox_chan = NULL;
 
-   key = "qcom,ipc";
-   ret = of_property_read_u32_index(node, key, 1, &edge->ipc_offset);
-   if (ret < 0) {
-   dev_err(dev, "no offset in %s\n", key);
-   return -EINVAL;
-   }
+   syscon_np = of_parse_phandle(node, "qcom,ipc", 0);
+   if (!syscon_np) {
+   dev_err(dev, "no qcom,ipc node\n");
+

[GIT] Networking

2018-04-19 Thread David Miller


1) Unbalanced refcounting in TIPC, from Jon Maloy.

2) Only allow TCP_MD5SIG to be set on sockets in close or
   listen state.  Once the connection is established it makes
   no sense to change this.  From Eric Dumazet.

3) Missing attribute validation in neigh_dump_table(), also from Eric
   Dumazet.

4) Fix address comparisons in SCTP, from Xin Long.

5) Neigh proxy table clearing can deadlock, from Wolfgang
   Bumiller.

6) Fix tunnel refcounting in l2tp, from Guillaume Nault.

7) Fix double list insert in team driver, from Paolo Abeni.

8) af_vsock.ko module was accidentally made unremovable, from
   Stefan Hajnoczi.

9) Fix reference to freed llc_sap object in llc stack, from
   Cong Wang.

10) Don't assume netdevice struct is DMA'able memory in virtio_net
driver, from Michael S. Tsirkin.

Please pull, thanks a lot!

The following changes since commit 5d1365940a68dd57b031b6e3c07d7d451cd69daf:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-04-12 11:09:05 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to 1255fcb2a655f05e02f3a74675a6d6525f187afd:

  net/smc: fix shutdown in state SMC_LISTEN (2018-04-19 16:38:39 -0400)


Anders Roxell (1):
  selftests: net: add in_netns.sh to TEST_PROGS

Bert Kenward (1):
  sfc: check RSS is active for filter insert

Bjørn Mork (1):
  tun: fix vlan packet truncation

Colin Ian King (2):
  net: caif: fix spelling mistake "UKNOWN" -> "UNKNOWN"
  atm: iphase: fix spelling mistake: "Tansmit" -> "Transmit"

Cong Wang (1):
  llc: hold llc_sap before release_sock()

Dan Carpenter (1):
  Revert "macsec: missing dev_put() on error in macsec_newlink()"

David S. Miller (6):
  Merge branch 'ibmvnic-Fix-parameter-change-request-handling'
  Merge branch 
'nfp-improve-signal-handing-on-FW-waits-and-flower-control-message-Jakub 
Kicinski says:
  Merge branch 'l2tp-remove-unsafe-calls-to-l2tp_tunnel_find_nth'
  Merge branch 'sfc-ARFS-fixes'
  Merge branch 'tipc-Better-check-user-provided-attributes'
  Merge branch 'virtio-ctrl-buffer-fixes'

Doron Roberts-Kedes (1):
  strparser: Fix incorrect strp->need_bytes value.

Edward Cree (3):
  sfc: insert ARFS filters with replace_equal=true
  sfc: pass the correctly bogus filter_id to rps_may_expire_flow()
  sfc: limit ARFS workitems in flight per channel

Eric Biggers (1):
  KEYS: DNS: limit the length of option strings

Eric Dumazet (5):
  tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets
  net: validate attribute sizes in neigh_dump_table()
  net: af_packet: fix race in PACKET_{R|T}X_RING
  tipc: add policy for TIPC_NLA_NET_ADDR
  tipc: fix possible crash in __tipc_nl_net_set()

Gao Feng (1):
  net: Fix one possible memleak in ip_setup_cork

Guillaume Nault (3):
  l2tp: hold reference on tunnels in netlink dumps
  l2tp: hold reference on tunnels printed in pppol2tp proc file
  l2tp: hold reference on tunnels printed in l2tp/tunnels debugfs file

Jakub Kicinski (2):
  nfp: ignore signals when communicating with management FW
  nfp: print a message when mutex wait is interrupted

Jason Wang (1):
  virtio-net: add missing virtqueue kick when flushing packets

Jon Maloy (3):
  tipc: fix unbalanced reference counter
  tipc: fix missing initializer in tipc_sendmsg()
  tipc: fix use-after-free in tipc_nametbl_stop

Jonathan Corbet (1):
  MAINTAINERS: Direct networking documentation changes to netdev

Jose Abreu (1):
  net: stmmac: Disable ACS Feature for GMAC >= 4

Kees Cook (2):
  ibmvnic: Define vnic_login_client_data name field as unsized array
  net/tls: Remove VLA usage

Laura Abbott (1):
  mISDN: Remove VLAs

Maxime Chevallier (2):
  net: mvpp2: Fix TCAM filter reserved range
  net: mvpp2: Fix DMA address mask size

Michael S. Tsirkin (3):
  virtio_net: split out ctrl buffer
  virtio_net: fix adding vids on big-endian
  virtio_net: sparse annotation fix

Nathan Fontenot (2):
  ibmvnic: Handle all login error conditions
  ibmvnic: Do not notify peers on parameter change resets

Nicolas Dechesne (1):
  net: qrtr: add MODULE_ALIAS_NETPROTO macro

Olivier Gayot (1):
  docs: ip-sysctl.txt: fix name of some ipv6 variables

Paolo Abeni (1):
  team: avoid adding twice the same option to the event list

Pawel Dembicki (1):
  net: qmi_wwan: add Wistron Neweb D19Q1

Pieter Jansen van Vuuren (2):
  nfp: flower: move route ack control messages out of the workqueue
  nfp: flower: split and limit cmsg skb lists

Raghuram Chary J (1):
  lan78xx: PHY DSP registers initialization to address EEE link drop issues with long cables

Randy Dunlap (1):
  textsearch: fix kernel-doc warnings and add kernel-api section

Richard Cochran (1):
  net: dsa: mv

[PATCH] serial: imx: enable IMX21_UCR3_RXDMUXSEL for non-dte-mode

2018-04-19 Thread Chris Ruehl

Fix a problem introduced with
commit e61c38d85b73 ("serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off")
which results in a non-dte-mode imx-uart failing to receive data.
Adding back IMX21_UCR3_RXDMUXSEL makes the serial port work as expected.

Signed-off-by: Chris Ruehl 
---
 drivers/tty/serial/imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 91f3a1a..3d09933 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -1391,7 +1391,7 @@ static int imx_uart_startup(struct uart_port *port)
 
ucr3 = imx_uart_readl(sport, UCR3);
 
-   ucr3 |= UCR3_DTRDEN | UCR3_RI | UCR3_DCD;
+   ucr3 |= IMX21_UCR3_RXDMUXSEL | UCR3_DTRDEN | UCR3_RI | UCR3_DCD;
 
if (sport->dte_mode)
/* disable broken interrupts */
-- 
2.1.4

[REVIEW][PATCH 16/17] signal/alpha: Replace TRAP_FIXME with TRAP_UNK

2018-04-19 Thread Eric W. Biederman

Using an si_code of 0 that aliases with SI_USER is clearly the wrong
thing to do, and causes problems in interesting ways.

It is not really clear to me whether using TRAP_UNK for bugcheck or
for the default case of gentrap is the best way to handle
things.  There is certainly enough information that a more
specific si_code could potentially be used.  That said TRAP_UNK
is definitely an improvement over 0 as it removes the ambiguity
of what si_code of 0 with SIGTRAP means on alpha.

Recent history suggests no one actually cares about crazy corner cases of
the kernel behavior like this so I don't expect any regressions from
changing this.  However if something does happen this change is easy
to revert.

Cc: Helge Deller 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: linux-al...@vger.kernel.org
Fixes: 0a635c7a84cf ("Fill in siginfo_t.")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" 
---
 arch/alpha/include/uapi/asm/siginfo.h | 7 ---
 arch/alpha/kernel/traps.c | 4 ++--
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/siginfo.h 
b/arch/alpha/include/uapi/asm/siginfo.h
index 3ebbb1e17902..db3f0138536f 100644
--- a/arch/alpha/include/uapi/asm/siginfo.h
+++ b/arch/alpha/include/uapi/asm/siginfo.h
@@ -7,11 +7,4 @@
 
 #include 
 
-/*
- * SIGTRAP si_codes
- */
-#ifdef __KERNEL__
-#define TRAP_FIXME 0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 #endif
diff --git a/arch/alpha/kernel/traps.c b/arch/alpha/kernel/traps.c
index 422b676b28f2..242c83d86ace 100644
--- a/arch/alpha/kernel/traps.c
+++ b/arch/alpha/kernel/traps.c
@@ -288,7 +288,7 @@ do_entIF(unsigned long type, struct pt_regs *regs)
  case 1: /* bugcheck */
info.si_signo = SIGTRAP;
info.si_errno = 0;
-   info.si_code = TRAP_FIXME;
+   info.si_code = TRAP_UNK;
info.si_addr = (void __user *) regs->pc;
info.si_trapno = 0;
send_sig_info(SIGTRAP, &info, current);
@@ -350,7 +350,7 @@ do_entIF(unsigned long type, struct pt_regs *regs)
case GEN_SUBRNG7:
default:
signo = SIGTRAP;
-   code = TRAP_FIXME;
+   code = TRAP_UNK;
break;
}
 
-- 
2.14.1

[REVIEW][PATCH 14/17] signal/unicore32: Use FPE_FLTUNK instead of 0 in ucf64_raise_sigfpe

2018-04-19 Thread Eric W. Biederman

The si_code of 0 (aka SI_USER) has fields si_pid and si_uid, not
si_addr, so only by luck would the appropriate fields be copied
to userspace by copy_siginfo_to_user.

This is just broken and wrong.

Make it obvious what is happening by moving the si_code from a
parameter of the one call to ucf64_raise_sigfpe to a constant value
that info.si_code gets set to.

Explicitly set the si_code to FPE_FLTUNK, the newly reserved floating
point si_code for an unknown floating point exception.

It looks like there is a fair chance that this is a code path that has
never been used in real life on unicore32.  The bad si_code and the
print statement that calls it an unhandled exception both suggest this.
So I really don't expect anyone will mind if this just gets fixed.

In similar situations on more popular architectures the conclusion was
just fix it.

Cc: Guan Xuetao 
Cc: Arnd Bergmann 
Fixes: d9bc15794d12 ("unicore32 additional architecture files: float point handling")
Signed-off-by: "Eric W. Biederman" 
---
 arch/unicore32/kernel/fpu-ucf64.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/unicore32/kernel/fpu-ucf64.c 
b/arch/unicore32/kernel/fpu-ucf64.c
index d785955e1c29..8594b168f25e 100644
--- a/arch/unicore32/kernel/fpu-ucf64.c
+++ b/arch/unicore32/kernel/fpu-ucf64.c
@@ -52,14 +52,14 @@
  * Raise a SIGFPE for the current process.
  * sicode describes the signal being raised.
  */
-void ucf64_raise_sigfpe(unsigned int sicode, struct pt_regs *regs)
+void ucf64_raise_sigfpe(struct pt_regs *regs)
 {
siginfo_t info;
 
clear_siginfo(&info);
 
info.si_signo = SIGFPE;
-   info.si_code = sicode;
+   info.si_code = FPE_FLTUNK;
info.si_addr = (void __user *)(instruction_pointer(regs) - 4);
 
/*
@@ -94,7 +94,7 @@ void ucf64_exchandler(u32 inst, u32 fpexc, struct pt_regs *regs)
pr_debug("UniCore-F64 FPSCR 0x%08x INST 0x%08x\n",
cff(FPSCR), inst);
 
-   ucf64_raise_sigfpe(0, regs);
+   ucf64_raise_sigfpe(regs);
return;
}
 
-- 
2.14.1

[REVIEW][PATCH 17/17] signal/powerpc: Replace TRAP_FIXME with TRAP_UNK

2018-04-19 Thread Eric W. Biederman

Using an si_code of 0 that aliases with SI_USER is clearly the wrong
thing to do, and causes problems in interesting ways.

For use in unknown_exception the recently defined TRAP_UNK
semantically is a perfect fit.  For use in RunModeException it looks
like something more specific than TRAP_UNK could be used.  No one has
bothered to find a better fit than the broken si_code of 0 in all of
these years and I don't see an obvious better fit, so switching
RunModeException to return TRAP_UNK is clearly an improvement.

Recent history suggests no one actually cares about crazy corner
cases of the kernel behavior like this so I don't expect any
regressions from changing this.  However if something does
happen this change is easy to revert.

Though I wonder if SIGKILL might not be a better fit.

Cc: Paul Mackerras 
Cc: Kumar Gala 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: linuxppc-...@lists.ozlabs.org
Fixes: 9bad068c24d7 ("[PATCH] ppc32: support for e500 and 85xx")
Fixes: 0ed70f6105ef ("PPC32: Provide proper siginfo information on various exceptions.")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" 
---
 arch/powerpc/include/uapi/asm/siginfo.h | 8 
 arch/powerpc/kernel/traps.c | 4 ++--
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/siginfo.h 
b/arch/powerpc/include/uapi/asm/siginfo.h
index 0437afc9ef3c..1d51d9b88221 100644
--- a/arch/powerpc/include/uapi/asm/siginfo.h
+++ b/arch/powerpc/include/uapi/asm/siginfo.h
@@ -15,12 +15,4 @@
 
 #include 
 
-/*
- * SIGTRAP si_codes
- */
-#ifdef __KERNEL__
-#define TRAP_FIXME 0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
-
 #endif /* _ASM_POWERPC_SIGINFO_H */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index fdf9400beec8..0e17dcb48720 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -969,7 +969,7 @@ void unknown_exception(struct pt_regs *regs)
printk("Bad trap at PC: %lx, SR: %lx, vector=%lx\n",
   regs->nip, regs->msr, regs->trap);
 
-   _exception(SIGTRAP, regs, TRAP_FIXME, 0);
+   _exception(SIGTRAP, regs, TRAP_UNK, 0);
 
exception_exit(prev_state);
 }
@@ -991,7 +991,7 @@ void instruction_breakpoint_exception(struct pt_regs *regs)
 
 void RunModeException(struct pt_regs *regs)
 {
-   _exception(SIGTRAP, regs, TRAP_FIXME, 0);
+   _exception(SIGTRAP, regs, TRAP_UNK, 0);
 }
 
 void single_step_exception(struct pt_regs *regs)
-- 
2.14.1

[REVIEW][PATCH 13/17] signal/powerpc: Replace FPE_FIXME with FPE_FLTUNK

2018-04-19 Thread Eric W. Biederman

Using an si_code of 0 that aliases with SI_USER is clearly the
wrong thing to do, and causes problems in interesting ways.

The newly defined FPE_FLTUNK semantically appears to fit the
bill so use it instead.

Cc: Paul Mackerras 
Cc: Kumar Gala 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc:  linuxppc-...@lists.ozlabs.org
Fixes: 9bad068c24d7 ("[PATCH] ppc32: support for e500 and 85xx")
Fixes: 0ed70f6105ef ("PPC32: Provide proper siginfo information on various exceptions.")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" 
---
 arch/powerpc/include/uapi/asm/siginfo.h | 7 ---
 arch/powerpc/kernel/traps.c | 6 +++---
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/siginfo.h 
b/arch/powerpc/include/uapi/asm/siginfo.h
index 9f142451a01f..0437afc9ef3c 100644
--- a/arch/powerpc/include/uapi/asm/siginfo.h
+++ b/arch/powerpc/include/uapi/asm/siginfo.h
@@ -15,13 +15,6 @@
 
 #include 
 
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 /*
  * SIGTRAP si_codes
  */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 087855caf6a9..fdf9400beec8 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1031,7 +1031,7 @@ static void emulate_single_step(struct pt_regs *regs)
 
 static inline int __parse_fpscr(unsigned long fpscr)
 {
-   int ret = FPE_FIXME;
+   int ret = FPE_FLTUNK;
 
/* Invalid operation */
if ((fpscr & FPSCR_VE) && (fpscr & FPSCR_VX))
@@ -1972,7 +1972,7 @@ void SPEFloatingPointException(struct pt_regs *regs)
extern int do_spe_mathemu(struct pt_regs *regs);
unsigned long spefscr;
int fpexc_mode;
-   int code = FPE_FIXME;
+   int code = FPE_FLTUNK;
int err;
 
flush_spe_to_thread(current);
@@ -2041,7 +2041,7 @@ void SPEFloatingPointRoundException(struct pt_regs *regs)
printk(KERN_ERR "unrecognized spe instruction "
   "in %s at %lx\n", current->comm, regs->nip);
} else {
-   _exception(SIGFPE, regs, FPE_FIXME, regs->nip);
+   _exception(SIGFPE, regs, FPE_FLTUNK, regs->nip);
return;
}
 }
-- 
2.14.1

[REVIEW][PATCH 11/17] signal/alpha: Replace FPE_FIXME with FPE_FLTUNK

2018-04-19 Thread Eric W. Biederman

Using an si_code of 0 that aliases with SI_USER is clearly the wrong
thing to do, and causes problems in interesting ways.

The newly defined FPE_FLTUNK semantically appears to fit the bill so
use it instead.

Given recent experience in this area odds are it will not break
anything.  Fixing it removes a hazard to kernel maintenance.

Cc: Helge Deller 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: linux-al...@vger.kernel.org
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: 0a635c7a84cf ("Fill in siginfo_t.")
Signed-off-by: "Eric W. Biederman" 
---
 arch/alpha/include/uapi/asm/siginfo.h | 7 ---
 arch/alpha/kernel/osf_sys.c   | 2 +-
 arch/alpha/kernel/traps.c | 2 +-
 3 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/siginfo.h 
b/arch/alpha/include/uapi/asm/siginfo.h
index 0cf3b527b274..3ebbb1e17902 100644
--- a/arch/alpha/include/uapi/asm/siginfo.h
+++ b/arch/alpha/include/uapi/asm/siginfo.h
@@ -7,13 +7,6 @@
 
 #include 
 
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 /*
  * SIGTRAP si_codes
  */
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index f5f154942aab..bb3619118926 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -872,7 +872,7 @@ SYSCALL_DEFINE5(osf_setsysinfo, unsigned long, op, void __user *, buffer,
fex = (exc >> IEEE_STATUS_TO_EXCSUM_SHIFT) & swcr;
if (fex) {
siginfo_t info;
-   int si_code = FPE_FIXME;
+   int si_code = FPE_FLTUNK;
 
if (fex & IEEE_TRAP_ENABLE_DNO) si_code = FPE_FLTUND;
if (fex & IEEE_TRAP_ENABLE_INE) si_code = FPE_FLTRES;
diff --git a/arch/alpha/kernel/traps.c b/arch/alpha/kernel/traps.c
index 91636765dd6d..422b676b28f2 100644
--- a/arch/alpha/kernel/traps.c
+++ b/arch/alpha/kernel/traps.c
@@ -328,7 +328,7 @@ do_entIF(unsigned long type, struct pt_regs *regs)
break;
case GEN_ROPRAND:
signo = SIGFPE;
-   code = FPE_FIXME;
+   code = FPE_FLTUNK;
break;
 
case GEN_DECOVF:
-- 
2.14.1

[REVIEW][PATCH 15/17] signal: Add TRAP_UNK si_code for undiagnosed trap exceptions

2018-04-19 Thread Eric W. Biederman

Both powerpc and alpha have cases where they wrongly set si_code to 0
in combination with SIGTRAP and don't mean SI_USER.

About half the time this is because the architecture cannot report
accurately what kind of trap exception was triggered.
The other half of the time it looks like no one has bothered to
figure out an appropriate si_code.

For the cases where the architecture does not have enough information,
or is too lazy to figure out exactly what kind of trap exception
it is, define TRAP_UNK.
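
As a rough illustration of the numbering this patch ends up with, the sketch
below mirrors the SIGTRAP si_code table using local stand-in constants. The
SKETCH_ prefix and the helper sketch_is_sigtrap_code are hypothetical, and
TRAP_BRKPT being 1 is assumed from the existing header rather than shown in
the hunk.

```c
#include <assert.h>

/* Local stand-ins for the uapi values; not the real header. */
enum {
    SKETCH_SI_USER     = 0,
    SKETCH_TRAP_BRKPT  = 1,   /* assumed; not visible in the hunk */
    SKETCH_TRAP_TRACE  = 2,
    SKETCH_TRAP_BRANCH = 3,
    SKETCH_TRAP_HWBKPT = 4,
    SKETCH_TRAP_UNK    = 5,   /* undiagnosed trap, added by this patch */
    SKETCH_NSIGTRAP    = 5    /* bumped from 4 to cover TRAP_UNK */
};

/* NSIGTRAP tracks the highest code, and the new code must not be 0. */
static_assert(SKETCH_TRAP_UNK == SKETCH_NSIGTRAP, "NSIGTRAP covers TRAP_UNK");
static_assert(SKETCH_TRAP_UNK != SKETCH_SI_USER, "TRAP_UNK must not be 0");

/* Returns 1 if si_code is one of the generic SIGTRAP codes, else 0. */
int sketch_is_sigtrap_code(int si_code)
{
    return si_code >= SKETCH_TRAP_BRKPT && si_code <= SKETCH_NSIGTRAP;
}
```

An si_code of 0 falls outside this range, which is why it cannot be
distinguished from a signal sent by another process.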

Cc: linux-...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-al...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: "Eric W. Biederman" 
---
 arch/x86/kernel/signal_compat.c| 2 +-
 include/uapi/asm-generic/siginfo.h | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
index 14c057f29979..9ccbf0576cd0 100644
--- a/arch/x86/kernel/signal_compat.c
+++ b/arch/x86/kernel/signal_compat.c
@@ -29,7 +29,7 @@ static inline void signal_compat_build_tests(void)
BUILD_BUG_ON(NSIGFPE  != 15);
BUILD_BUG_ON(NSIGSEGV != 7);
BUILD_BUG_ON(NSIGBUS  != 5);
-   BUILD_BUG_ON(NSIGTRAP != 4);
+   BUILD_BUG_ON(NSIGTRAP != 5);
BUILD_BUG_ON(NSIGCHLD != 6);
BUILD_BUG_ON(NSIGSYS  != 1);
 
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index 558b902f18d4..80e2a7227205 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -249,7 +249,8 @@ typedef struct siginfo {
 #define TRAP_TRACE 2   /* process trace trap */
 #define TRAP_BRANCH 3  /* process taken branch trap */
 #define TRAP_HWBKPT 4  /* hardware breakpoint/watchpoint */
-#define NSIGTRAP   4
+#define TRAP_UNK   5   /* undiagnosed trap */
+#define NSIGTRAP   5
 
 /*
  * There is an additional set of SIGTRAP si_codes used by ptrace
-- 
2.14.1

[REVIEW][PATCH 12/17] signal/ia64: Replace FPE_FIXME with FPE_FLTUNK

2018-04-19 Thread Eric W. Biederman

Using an si_code of 0 that aliases with SI_USER is clearly the wrong
thing to do, and causes problems in interesting ways.

The newly defined FPE_FLTUNK semantically appears to fit the bill so
use it instead.

Given recent experience in this area odds are it will not
break anything.  Fixing it removes a hazard to kernel maintenance.
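
The code being touched follows a "default, then refine" pattern: start from a
catch-all si_code and overwrite it when a status bit identifies the cause. A
minimal standalone sketch of that pattern, assuming hypothetical status bits
(the TOY_FP_* masks and toy_fpe_si_code are illustrative only, and 14 is an
assumed fallback value for FPE_FLTUNK on older headers):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Assumption: fallback for headers that do not yet provide FPE_FLTUNK. */
#ifndef FPE_FLTUNK
#define FPE_FLTUNK 14
#endif

/* The default must never be 0, the value the old FPE_FIXME used. */
static_assert(FPE_FLTUNK != 0, "default si_code must not be 0");

/* Hypothetical status bits; real hardware uses architecture-specific masks. */
enum {
    TOY_FP_INVALID   = 1 << 0,
    TOY_FP_OVERFLOW  = 1 << 1,
    TOY_FP_UNDERFLOW = 1 << 2
};

/* Pick an si_code for SIGFPE; later checks take priority over earlier ones. */
int toy_fpe_si_code(unsigned int status)
{
    int si_code = FPE_FLTUNK;   /* default code */

    if (status & TOY_FP_UNDERFLOW)
        si_code = FPE_FLTUND;
    if (status & TOY_FP_OVERFLOW)
        si_code = FPE_FLTOVF;
    if (status & TOY_FP_INVALID)
        si_code = FPE_FLTINV;
    return si_code;
}
```

When no bit matches, the caller still gets a nonzero fault code instead of a
value that looks like SI_USER.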

Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: linux-i...@vger.kernel.org
Fixes: 987159266c45 ("Linux version 2.3.48")
Signed-off-by: "Eric W. Biederman" 
---
 arch/ia64/include/uapi/asm/siginfo.h | 7 ---
 arch/ia64/kernel/traps.c | 4 ++--
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/ia64/include/uapi/asm/siginfo.h b/arch/ia64/include/uapi/asm/siginfo.h
index 5aa454ed89db..52b5af424511 100644
--- a/arch/ia64/include/uapi/asm/siginfo.h
+++ b/arch/ia64/include/uapi/asm/siginfo.h
@@ -27,11 +27,4 @@
 #define __ISR_VALID_BIT 0
 #define __ISR_VALID (1 << __ISR_VALID_BIT)
 
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 #endif /* _UAPI_ASM_IA64_SIGINFO_H */
diff --git a/arch/ia64/kernel/traps.c b/arch/ia64/kernel/traps.c
index 972873ed1ae5..c6f4932073a1 100644
--- a/arch/ia64/kernel/traps.c
+++ b/arch/ia64/kernel/traps.c
@@ -353,7 +353,7 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr)
clear_siginfo(&siginfo);
siginfo.si_signo = SIGFPE;
siginfo.si_errno = 0;
-   siginfo.si_code = FPE_FIXME;/* default code */
+   siginfo.si_code = FPE_FLTUNK;   /* default code */
siginfo.si_addr = (void __user *) (regs->cr_iip + ia64_psr(regs)->ri);
if (isr & 0x11) {
siginfo.si_code = FPE_FLTINV;
@@ -380,7 +380,7 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr)
clear_siginfo(&siginfo);
siginfo.si_signo = SIGFPE;
siginfo.si_errno = 0;
-   siginfo.si_code = FPE_FIXME;/* default code */
+   siginfo.si_code = FPE_FLTUNK;   /* default code */
siginfo.si_addr = (void __user *) (regs->cr_iip + ia64_psr(regs)->ri);
if (isr & 0x880) {
siginfo.si_code = FPE_FLTOVF;
-- 
2.14.1

Re: [PATCH v2 net 0/3] virtio: ctrl buffer fixes

2018-04-19 Thread David Miller

From: "Michael S. Tsirkin" 
Date: Fri, 20 Apr 2018 03:49:19 +0300

> On Thu, Apr 19, 2018 at 04:34:22PM -0400, David Miller wrote:
>> From: "Michael S. Tsirkin" 
>> Date: Thu, 19 Apr 2018 08:30:47 +0300
>> 
>> > Here are a couple of fixes related to the virtio control buffer.
>> > Lightly tested on x86 only.
>> 
>> Thanks for taking care of the control buffer DMA'ability issue.
>> 
>> Want any of these queued up for -stable?
> 
> Good point. Patches 1-2 for sure. 

Ok, queued up 1 and 2.

[REVIEW][PATCH 02/17] sparc: fix compat siginfo ABI regression

2018-04-19 Thread Eric W. Biederman

From: "Dmitry V. Levin" 

Starting with commit v4.14-rc1~60^2^2~1, a SIGFPE signal sent via kill
results in wrong values in the si_pid and si_uid fields of the compat siginfo_t.

This happens due to FPE_FIXME being defined to 0 for sparc, and at the
same time siginfo_layout() introduced by the same commit returns
SIL_FAULT for SIGFPE if si_code == SI_USER and FPE_FIXME is defined to 0.

Fix this regression by removing FPE_FIXME macro and changing all its users
to assign FPE_FLTUNK to si_code instead of FPE_FIXME.
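
The regression can be modelled with a deliberately simplified stand-in for
the layout decision. This is a toy sketch, not the kernel's siginfo_layout():
the TOY_ names, the fixme_alias parameter, and the value 14 for FPE_FLTUNK
are assumptions made for illustration, and only the two cases discussed
above are modelled.

```c
#include <assert.h>
#include <signal.h>   /* SIGFPE */

/* Toy layouts: kill-style carries si_pid/si_uid, fault-style carries si_addr. */
enum toy_layout { TOY_SIL_KILL, TOY_SIL_FAULT };

enum {
    TOY_SI_USER    = 0,
    TOY_FPE_FIXME  = 0,    /* the broken duplicate of SI_USER */
    TOY_FPE_FLTUNK = 14    /* assumed value of the replacement code */
};

static_assert(TOY_FPE_FIXME == TOY_SI_USER, "models the aliasing bug");

/*
 * fixme_alias != 0 models a kernel where FPE_FIXME is still defined to 0;
 * fixme_alias == 0 models the kernel after this patch.
 */
enum toy_layout toy_siginfo_layout(int sig, int si_code, int fixme_alias)
{
    if (fixme_alias && sig == SIGFPE && si_code == TOY_FPE_FIXME)
        return TOY_SIL_FAULT;   /* also swallows kill(pid, SIGFPE) */
    if (sig == SIGFPE && si_code == TOY_FPE_FLTUNK)
        return TOY_SIL_FAULT;
    return TOY_SIL_KILL;        /* all other cases are out of scope here */
}
```

In the first configuration a SIGFPE sent with kill is copied out using the
fault layout, so si_pid and si_uid never reach the compat siginfo_t; in the
second it gets the kill layout again.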

Note that FPE_FLTUNK is a new macro introduced by commit
266da65e9156d93e1126e185259a4aae68188d0e.

Tested with commit v4.16-11958-g16e205cf42da.

This bug was found by strace test suite.

In the discussion about FPE_FLTUNK on sparc David Miller said:
> Eric, feel free to do something similar on Sparc.

Link: https://github.com/strace/strace/issues/21
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Fixes: 2.3.41
Cc: David Miller 
Cc: sparcli...@vger.kernel.org
Conceptually-Acked-By: David Miller 
Thanks-to: Anatoly Pugachev 
Signed-off-by: Dmitry V. Levin 
Signed-off-by: Eric W. Biederman 
---
 arch/sparc/include/uapi/asm/siginfo.h | 7 ---
 arch/sparc/kernel/traps_32.c  | 2 +-
 arch/sparc/kernel/traps_64.c  | 2 +-
 3 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/sparc/include/uapi/asm/siginfo.h b/arch/sparc/include/uapi/asm/siginfo.h
index 896ce447d16a..e7049550ac82 100644
--- a/arch/sparc/include/uapi/asm/siginfo.h
+++ b/arch/sparc/include/uapi/asm/siginfo.h
@@ -17,13 +17,6 @@
 
 #define SI_NOINFO  32767   /* no information in siginfo_t */
 
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
 /*
  * SIGEMT si_codes
  */
diff --git a/arch/sparc/kernel/traps_32.c b/arch/sparc/kernel/traps_32.c
index b1ed763e4787..33cd35bf3dc8 100644
--- a/arch/sparc/kernel/traps_32.c
+++ b/arch/sparc/kernel/traps_32.c
@@ -307,7 +307,7 @@ void do_fpe_trap(struct pt_regs *regs, unsigned long pc, unsigned long npc,
info.si_errno = 0;
info.si_addr = (void __user *)pc;
info.si_trapno = 0;
-   info.si_code = FPE_FIXME;
+   info.si_code = FPE_FLTUNK;
if ((fsr & 0x1c000) == (1 << 14)) {
if (fsr & 0x10)
info.si_code = FPE_FLTINV;
diff --git a/arch/sparc/kernel/traps_64.c b/arch/sparc/kernel/traps_64.c
index 462a21abd105..e81072ac52c3 100644
--- a/arch/sparc/kernel/traps_64.c
+++ b/arch/sparc/kernel/traps_64.c
@@ -2372,7 +2372,7 @@ static void do_fpe_common(struct pt_regs *regs)
info.si_errno = 0;
info.si_addr = (void __user *)regs->tpc;
info.si_trapno = 0;
-   info.si_code = FPE_FIXME;
+   info.si_code = FPE_FLTUNK;
if ((fsr & 0x1c000) == (1 << 14)) {
if (fsr & 0x10)
info.si_code = FPE_FLTINV;
-- 
2.14.1

[REVIEW][PATCH 05/17] signal/nds32: Use force_sig(SIGILL) in do_revinsn

2018-04-19 Thread Eric W. Biederman

As originally committed, do_revinsn would deliver a siginfo for SIGILL
with an si_code composed of random stack contents.  That makes no
sense and is not something userspace can depend on.  So simplify
the code and just use "force_sig(SIGILL, current)" instead.

Fixes: 2923f5ea7738 ("nds32: Exception handling")
Cc: Vincent Chen 
Cc: Greentime Hu 
Cc: Arnd Bergmann 
Signed-off-by: "Eric W. Biederman" 
---
 arch/nds32/kernel/traps.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/nds32/kernel/traps.c b/arch/nds32/kernel/traps.c
index 65961bf91d64..8e9a5b1f6234 100644
--- a/arch/nds32/kernel/traps.c
+++ b/arch/nds32/kernel/traps.c
@@ -356,14 +356,11 @@ void do_dispatch_tlb_misc(unsigned long entry, unsigned long addr,
 
 void do_revinsn(struct pt_regs *regs)
 {
-   siginfo_t si;
pr_emerg("Reserved Instruction\n");
show_regs(regs);
if (!user_mode(regs))
do_exit(SIGILL);
-   si.si_signo = SIGILL;
-   si.si_errno = 0;
-   force_sig_info(SIGILL, &si, current);
+   force_sig(SIGILL, current);
 }
 
 #ifdef CONFIG_ALIGNMENT_TRAP
-- 
2.14.1
