Re: [PATCH 1/6] Generic radix trees

2018-05-25 Thread Kent Overstreet
On Sat, May 26, 2018 at 11:16:42AM +0800, Liu Bo wrote:
> > +/*
> > + * Returns pointer to the specified byte @offset within @radix, allocating it if
> > + * necessary - newly allocated slots are always zeroed out:
> > + */
> > +void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset,
> > +  gfp_t gfp_mask)
> > +{
> > +   struct genradix_node **n;
> 
> Any reason that " struct genradix_node ** " is used here instead of "
> struct genradix_node * "?
> 
> Looks like this function only manipulates *n, am I missing something?

It stores to *n when it has to allocate a node (including the root).
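
To illustrate the point: a minimal sketch, not the genradix code from the
patch, of why such a walk keeps a pointer to the slot rather than to the node.
`struct node`, `lookup_alloc()` and the two-way fan-out are hypothetical names
chosen for illustration only. Because `n` holds the address of the slot, a
newly allocated node can be stored through `*n` whether that slot is the
caller's root pointer or a child pointer inside a parent node.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical binary tree node; NOT the genradix node layout. */
struct node {
	struct node *children[2];
	int value;
};

/*
 * Follow the low `depth` bits of `key` (most significant first) down from
 * *root, allocating zeroed nodes for any empty slot on the way, including
 * the root itself. Returns the node reached, or NULL if allocation fails.
 */
struct node *lookup_alloc(struct node **root, unsigned int key,
			  unsigned int depth)
{
	struct node **n = root;	/* address of the slot we may have to fill */

	assert(root != NULL);

	for (;;) {
		if (!*n) {
			/* Store through the slot pointer: works for root and children alike. */
			*n = calloc(1, sizeof(**n));
			if (!*n)
				return NULL;
		}

		if (!depth)
			return *n;

		depth--;
		n = &(*n)->children[(key >> depth) & 1];
	}
}
```

Calling `lookup_alloc(&root, 2, 2)` with `root == NULL` allocates the root and
the two nodes below it on that path. With a plain `struct node *` local, the
function would only change its own copy and the caller's root would stay NULL.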


Re: [PATCH] arm64: dts: hikey: Fix eMMC corruption regression

2018-05-25 Thread Leo Yan
On Fri, May 25, 2018 at 08:10:47PM -0700, John Stultz wrote:
> This patch is a partial revert of commit
> abd7d0972a19 ("arm64: dts: hikey: Enable HS200 mode on eMMC")
> 
> which has been causing eMMC corruption on my HiKey board.
> 
> Symptoms usually looked like:
> 
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> ...
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc0: new HS200 MMC card at address 0001
> ...
> dwmmc_k3 f723d000.dwmmc0: Unexpected command timeout, state 3
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> print_req_error: I/O error, dev mmcblk0, sector 8810504
> Aborting journal on device mmcblk0p10-8.
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
> mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
> EXT4-fs error (device mmcblk0p10): ext4_journal_check_start:61: Detected aborted journal
> EXT4-fs (mmcblk0p10): Remounting filesystem read-only
> 
> And quite often this would result in a disk that wouldn't properly
> boot even with older kernels.

I tested this patch on kernel 4.17.0-rc5 and don't see any boot issues
with it.

Tested-by: Leo Yan 

> It seems the max-frequency property added by the above patch is
> causing the problem, so remove it.

Should this patch also be Cc'd to the stable kernel mailing list as a fix?

> Cc: Ryan Grachek 
> Cc: Wei Xu 
> Cc: Arnd Bergmann 
> Cc: Ulf Hansson 
> Cc: YongQin Liu 
> Cc: Leo Yan 
> Signed-off-by: John Stultz 
> ---
>  arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> index 724a0d3..edb4ee0 100644
> --- a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> +++ b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
> @@ -299,7 +299,6 @@
>   /* GPIO blocks 16 thru 19 do not appear to be routed to pins */
>  
>   dwmmc_0: dwmmc0@f723d000 {
> - max-frequency = <15000>;
>   cap-mmc-highspeed;
>   mmc-hs200-1_8v;
>   non-removable;
> -- 
> 2.7.4
> 


Re: linux-next: manual merge of the staging tree with the v4l-dvb tree

2018-05-25 Thread Stephen Rothwell
Hi Mauro,

On Thu, 17 May 2018 07:06:57 -0300 Mauro Carvalho Chehab wrote:
>
> What do you use in order to check it? Maybe we could have some git
> hook running such check, in order to prevent merging patches without
> the right SOBs.

I run the script below on the range of new commits each time I fetch a
tree ...

-- 
Cheers,
Stephen Rothwell


#!/bin/bash
# Report commits in the given range that lack a Signed-off-by from their
# author or committer, then open the range in gitk.

if [ "$#" -lt 1 ]; then
	printf "Usage: %s \n" "$0" 1>&2
	exit 1
fi

commits=$(git rev-list --no-merges "$@")
if [ -z "$commits" ]; then
	printf "No commits\n"
	exit 0
fi

for c in $commits; do
	# Raw (%ae, %an, ...) and mailmap-mapped (%aE, %aN, ...) identities
	ae=$(git log -1 --format='%ae' "$c")
	aE=$(git log -1 --format='%aE' "$c")
	an=$(git log -1 --format='%an' "$c")
	aN=$(git log -1 --format='%aN' "$c")
	ce=$(git log -1 --format='%ce' "$c")
	cE=$(git log -1 --format='%cE' "$c")
	cn=$(git log -1 --format='%cn' "$c")
	cN=$(git log -1 --format='%cN' "$c")
	sob=$(git log -1 --format='%b' "$c" | grep -i '^[[:space:]]*Signed-off-by:')

	# am/cm become true when no SOB line matches the author/committer
	am=false
	cm=false
	grep -i -q "<$ae>" <<<"$sob" ||
		grep -i -q "<$aE>" <<<"$sob" ||
		grep -i -q ":[[:space:]]*$an[[:space:]]*<" <<<"$sob" ||
		grep -i -q ":[[:space:]]*$aN[[:space:]]*<" <<<"$sob" ||
		am=true
	grep -i -q "<$ce>" <<<"$sob" ||
		grep -i -q "<$cE>" <<<"$sob" ||
		grep -i -q ":[[:space:]]*$cn[[:space:]]*<" <<<"$sob" ||
		grep -i -q ":[[:space:]]*$cN[[:space:]]*<" <<<"$sob" ||
		cm=true

	if "$am" || "$cm"; then
		printf "Commit %s\n" "$c"
		"$am" && printf "\tauthor SOB missing\n"
		"$cm" && printf "\tcommitter SOB missing\n"
		printf "%s %s\n%s\n" "$ae" "$ce" "$sob"
	fi
done

exec gitk "$@"



Re: [PATCH 0/3] Dts nodes for Keystone2 hw_rng driver

2018-05-25 Thread santosh.shilim...@oracle.com

On 5/24/18 6:12 AM, Vitaly Andrianov wrote:

> This series adds dts nodes for Keystone2 hw_rng driver
>
> Vitaly Andrianov (3):
>    ARM: dts: k2hk: add dts node for k2hk hw_rng driver
>    ARM: dts: k2l: add dts node for k2l hw_rng driver
>    ARM: dts: k2e: add dts node for k2e hw_rng driver


Looks good. Will queue this up for 4.19. Thanks !!

Regards,
Santosh


arch/powerpc/kernel/head_32.S:1106: Error: missing operand

2018-05-25 Thread Paul Menzel

Dear Linux folks,


Building the configuration created with `make tinyconfig` on the Power 8 
system IBM S822LC with Ubuntu 18.04 fails with the error below.


```
$ git describe --dirty
v4.17-rc6-296-gbc2dbc5420e8
$ git log --oneline -1
bc2dbc5420e8 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'akpm' (patches from Andrew)

$ make tinyconfig
$ make -j120
[…]
  AS  arch/powerpc/kernel/head_32.o
arch/powerpc/kernel/head_32.S: Assembler messages:
arch/powerpc/kernel/head_32.S:1106: Error: missing operand
scripts/Makefile.build:413: recipe for target 'arch/powerpc/kernel/head_32.o' failed

make[1]: *** [arch/powerpc/kernel/head_32.o] Error 1
[…]
```

Is this expected? *ppc64_defconfig* and *ppc64e_defconfig* build fine.


Kind regards,

Paul


Re: linux-next: Signed-off-by missing for commit in the arm-soc tree

2018-05-25 Thread Stephen Rothwell
Hi Alexandre,

On Tue, 15 May 2018 09:15:23 +0200 Alexandre Torgue wrote:
>
> On 05/14/2018 11:22 PM, Stephen Rothwell wrote:
> > 
> > Commit
> > 
> >   949a0c0dec85 ("ARM: dts: stm32: add USB Host (USBH) support to stm32mp157c")
> > 
> > is missing a Signed-off-by from its committer.
> 
> My fault, I forgot it when I applied patch on my branch. Do we need an 
> update or it is just a reminder?

Some people rebase their tree to fix these up, some just take it as a
learning experience :-)

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] qcom: cmd-db: enforce CONFIG_OF_RESERVED_MEM dependency

2018-05-25 Thread Bjorn Andersson
On Fri 25 May 09:08 PDT 2018, Arnd Bergmann wrote:

> Without CONFIG_OF_RESERVED_MEM, gcc sees that the global cmd_db_header
> variable is never initialized, and through code optimization concludes
> that a lot of other code cannot possibly work after that:
> 
> drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_addr':
> drivers/soc/qcom/cmd-db.c:197:21: error: 'ent.addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   return ret < 0 ? 0 : le32_to_cpu(ent.addr);
> drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_aux_data':
> drivers/soc/qcom/cmd-db.c:224:10: error: 'ent.len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   ent_len = le16_to_cpu(ent.len);
> drivers/soc/qcom/cmd-db.c:115:6: error: 'rsc_hdr.data_offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   u16 offset = le16_to_cpu(hdr->data_offset);
>   ^~
> drivers/soc/qcom/cmd-db.c:116:6: error: 'ent.offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   u16 loffset = le16_to_cpu(ent->offset);
>   ^~~
> drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_aux_data_len':
> drivers/soc/qcom/cmd-db.c:250:38: error: 'ent.len' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   return ret < 0 ? 0 : le16_to_cpu(ent.len);
>   ^
> drivers/soc/qcom/cmd-db.c: In function 'cmd_db_read_slave_id':
> drivers/soc/qcom/cmd-db.c:272:7: error: 'ent.addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
> 
> Using a hard CONFIG_OF_RESERVED_MEM dependency avoids this warning,
> and we can remove the CONFIG_OF dependency.
> 
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/soc/qcom/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index 9dc02f390ba3..5856e792d09c 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -5,7 +5,8 @@ menu "Qualcomm SoC drivers"
>  
>  config QCOM_COMMAND_DB
>   bool "Qualcomm Command DB"
> - depends on (ARCH_QCOM && OF) || COMPILE_TEST
> + depends on ARCH_QCOM || COMPILE_TEST
> + depends on OF_RESERVED_MEM
>   help
> Command DB queries shared memory by key string for shared system
> resources. Platform drivers that require to set state of a shared
> -- 
> 2.9.0
> 


Re: linux-next: Signed-off-by missing for commits in the block tree

2018-05-25 Thread Jens Axboe
On 5/25/18 9:50 PM, Stephen Rothwell wrote:
> Hi Jens,
> 
> Commits
> 
>   287a63ebbe7c ("nvme: mark the result argument to nvme_complete_async_event volatile")
>   750dde4472e4 ("nvme-pci: simplify nvme_cqe_valid")
>   d1f06f4ae041 ("nvme-pci: move ->cq_vector == -1 check outside of ->q_lock")
>   1ab0cd6966fc ("nvme-pci: split the nvme queue lock into submission and completion locks")
>   1eae349d18fc ("nvme-pci: drop IRQ disabling on submission queue lock")
> 
> are missing a Signed-off-by from their committer.

Thanks for the heads up. Not a huge problem for these particular commits,

-- 
Jens Axboe



linux-next: Signed-off-by missing for commits in the block tree

2018-05-25 Thread Stephen Rothwell
Hi Jens,

Commits

  287a63ebbe7c ("nvme: mark the result argument to nvme_complete_async_event volatile")
  750dde4472e4 ("nvme-pci: simplify nvme_cqe_valid")
  d1f06f4ae041 ("nvme-pci: move ->cq_vector == -1 check outside of ->q_lock")
  1ab0cd6966fc ("nvme-pci: split the nvme queue lock into submission and completion locks")
  1eae349d18fc ("nvme-pci: drop IRQ disabling on submission queue lock")

are missing a Signed-off-by from their committer.

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] mm-kasan-dont-vfree-nonexistent-vm_area-fix

2018-05-25 Thread Andrew Morton
On Sat, 26 May 2018 11:31:35 +0800 kbuild test robot  wrote:

> Hi Andrey,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on mmotm/master]
> [cannot apply to v4.17-rc6]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Andrey-Ryabinin/mm-kasan-dont-vfree-nonexistent-vm_area-fix/20180526-093255
> base:   git://git.cmpxchg.org/linux-mmotm.git master
> config: sparc-allyesconfig (attached as .config)
> compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=sparc 
> 
> All errors (new ones prefixed by >>):
> 
>    fs/autofs/inode.o: In function `autofs_new_ino':
>    inode.c:(.text+0x220): multiple definition of `autofs_new_ino'
>    fs/autofs/inode.o:inode.c:(.text+0x220): first defined here
>    fs/autofs/inode.o: In function `autofs_clean_ino':
>    inode.c:(.text+0x280): multiple definition of `autofs_clean_ino'
>    fs/autofs/inode.o:inode.c:(.text+0x280): first defined here

There's bot breakage here - clearly that patch didn't cause this error.

Ian, this autofs glitch may still not be fixed.


linux-next: Signed-off-by missing for commits in the rdma tree

2018-05-25 Thread Stephen Rothwell
Hi all,

Commits

  116aeb887371 ("iw_cxgb4: provide detailed provider-specific CM_ID information")
to
  0252f73334f9 ("IB/qib: Fix DMA api warning with debug kernel")

are missing a Signed-off-by from their committer.  I presume they have
been rebased?

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] mm-kasan-dont-vfree-nonexistent-vm_area-fix

2018-05-25 Thread kbuild test robot
Hi Andrey,

I love your patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[cannot apply to v4.17-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Andrey-Ryabinin/mm-kasan-dont-vfree-nonexistent-vm_area-fix/20180526-093255
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: sparc-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc 

All errors (new ones prefixed by >>):

   fs/autofs/inode.o: In function `autofs_new_ino':
   inode.c:(.text+0x220): multiple definition of `autofs_new_ino'
   fs/autofs/inode.o:inode.c:(.text+0x220): first defined here
   fs/autofs/inode.o: In function `autofs_clean_ino':
   inode.c:(.text+0x280): multiple definition of `autofs_clean_ino'
   fs/autofs/inode.o:inode.c:(.text+0x280): first defined here
   fs/autofs/inode.o: In function `autofs_free_ino':
   inode.c:(.text+0x2c0): multiple definition of `autofs_free_ino'
   fs/autofs/inode.o:inode.c:(.text+0x2c0): first defined here
   fs/autofs/inode.o: In function `autofs_kill_sb':
   inode.c:(.text+0x2e0): multiple definition of `autofs_kill_sb'
   fs/autofs/inode.o:inode.c:(.text+0x2e0): first defined here
   fs/autofs/inode.o: In function `autofs_get_inode':
   inode.c:(.text+0x360): multiple definition of `autofs_get_inode'
   fs/autofs/inode.o:inode.c:(.text+0x360): first defined here
   fs/autofs/inode.o: In function `autofs_fill_super':
   inode.c:(.text+0x440): multiple definition of `autofs_fill_super'
   fs/autofs/inode.o:inode.c:(.text+0x440): first defined here
   fs/autofs/root.o: In function `is_autofs_dentry':
   root.c:(.text+0x1860): multiple definition of `is_autofs_dentry'
   fs/autofs/root.o:root.c:(.text+0x1860): first defined here
   fs/autofs/root.o:(.rodata+0x100): multiple definition of `autofs_dentry_operations'
   fs/autofs/root.o:(.rodata+0x100): first defined here
   fs/autofs/root.o:(.rodata+0x180): multiple definition of `autofs_dir_inode_operations'
   fs/autofs/root.o:(.rodata+0x180): first defined here
   fs/autofs/root.o:(.rodata+0x240): multiple definition of `autofs_dir_operations'
   fs/autofs/root.o:(.rodata+0x240): first defined here
   fs/autofs/root.o:(.rodata+0x338): multiple definition of `autofs_root_operations'
   fs/autofs/root.o:(.rodata+0x338): first defined here
   fs/autofs/symlink.o:(.rodata+0x0): multiple definition of `autofs_symlink_inode_operations'
   fs/autofs/symlink.o:(.rodata+0x0): first defined here
   fs/autofs/waitq.o: In function `autofs_catatonic_mode':
>> waitq.c:(.text+0x80): multiple definition of `autofs_catatonic_mode'
   fs/autofs/waitq.o:waitq.c:(.text+0x80): first defined here
   fs/autofs/waitq.o: In function `autofs_wait_release':
>> waitq.c:(.text+0x180): multiple definition of `autofs_wait_release'
   fs/autofs/waitq.o:waitq.c:(.text+0x180): first defined here
   fs/autofs/waitq.o: In function `autofs_wait':
>> waitq.c:(.text+0x520): multiple definition of `autofs_wait'
   fs/autofs/waitq.o:waitq.c:(.text+0x520): first defined here
   fs/autofs/expire.o: In function `autofs_expire_direct':
   expire.c:(.text+0x7a0): multiple definition of `autofs_expire_direct'
   fs/autofs/expire.o:expire.c:(.text+0x7a0): first defined here
   fs/autofs/expire.o: In function `autofs_expire_indirect':
   expire.c:(.text+0x8e0): multiple definition of `autofs_expire_indirect'
   fs/autofs/expire.o:expire.c:(.text+0x8e0): first defined here
   fs/autofs/expire.o: In function `autofs_expire_wait':
   expire.c:(.text+0xba0): multiple definition of `autofs_expire_wait'
   fs/autofs/expire.o:expire.c:(.text+0xba0): first defined here
   fs/autofs/expire.o: In function `autofs_expire_run':
   expire.c:(.text+0xce0): multiple definition of `autofs_expire_run'
   fs/autofs/expire.o:expire.c:(.text+0xce0): first defined here
   fs/autofs/expire.o: In function `autofs_do_expire_multi':
   expire.c:(.text+0xde0): multiple definition of `autofs_do_expire_multi'
   fs/autofs/expire.o:expire.c:(.text+0xde0): first defined here
   fs/autofs/expire.o: In function `autofs_expire_multi':
   expire.c:(.text+0xea0): multiple definition of `autofs_expire_multi'
   fs/autofs/expire.o:expire.c:(.text+0xea0): first defined here
   fs/autofs/dev-ioctl.o: In function `autofs_dev_ioctl_init':
   dev-ioctl.c:(.init.text+0x0): multiple definition of `autofs_dev_ioctl_init'
   fs/autofs/dev-ioctl.o:dev-ioctl.c:(.init.text+0x0): first defined here
   fs/autofs/dev-ioctl.o: In function `autofs_dev_ioctl_exit':
   dev-ioctl.c:(.text+0xbc0): multiple definition of `autofs_dev_ioctl_exit'
   fs/autofs/dev-ioctl.o:dev-ioctl.c:(.text+0xbc0): first defined here

---
0-DAY kernel test 

Re: [PATCH 1/6] Generic radix trees

2018-05-25 Thread Liu Bo
Hi Kent,

(Add all ML to cc this time.)

On Wed, May 23, 2018 at 9:18 AM, Kent Overstreet wrote:
> Very simple radix tree implementation that supports storing arbitrary
> size entries, up to PAGE_SIZE - upcoming patches will convert existing
> flex_array users to genradixes. The new genradix code has a much simpler
> API and implementation, and doesn't have a hard limit on the number of
> elements like flex_array does.
>
> Signed-off-by: Kent Overstreet 
> ---
>  include/linux/generic-radix-tree.h | 222 +
>  lib/Makefile   |   3 +-
>  lib/generic-radix-tree.c   | 180 +++
>  3 files changed, 404 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/generic-radix-tree.h
>  create mode 100644 lib/generic-radix-tree.c
>
> diff --git a/include/linux/generic-radix-tree.h b/include/linux/generic-radix-tree.h
> new file mode 100644
> index 00..3328813322
> --- /dev/null
> +++ b/include/linux/generic-radix-tree.h
> @@ -0,0 +1,222 @@
> +#ifndef _LINUX_GENERIC_RADIX_TREE_H
> +#define _LINUX_GENERIC_RADIX_TREE_H
> +
> +/*
> + * Generic radix trees/sparse arrays:
> + *
> + * Very simple and minimalistic, supporting arbitrary size entries up to
> + * PAGE_SIZE.
> + *
> + * A genradix is defined with the type it will store, like so:
> + * static GENRADIX(struct foo) foo_genradix;
> + *
> + * The main operations are:
> + * - genradix_init(radix) - initialize an empty genradix
> + *
> + * - genradix_free(radix) - free all memory owned by the genradix and
> + *   reinitialize it
> + *
> + * - genradix_ptr(radix, idx) - gets a pointer to the entry at idx, returning
> + *   NULL if that entry does not exist
> + *
> + * - genradix_ptr_alloc(radix, idx, gfp) - gets a pointer to an entry,
> + *   allocating it if necessary
> + *
> + * - genradix_for_each(radix, iter, p) - iterate over each entry in a genradix
> + *
> + * The radix tree allocates one page of entries at a time, so entries may exist
> + * that were never explicitly allocated - they will be initialized to all
> + * zeroes.
> + *
> + * Internally, a genradix is just a radix tree of pages, and indexing works in
> + * terms of byte offsets. The wrappers in this header file use sizeof on the
> + * type the radix contains to calculate a byte offset from the index - see
> + * __idx_to_offset.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct genradix_node;
> +
> +struct __genradix {
> +   struct genradix_node    *root;
> +   size_t                  depth;
> +};
> +
> +#define __GENRADIX_INITIALIZER \
> +   {   \
> +   .tree = {   \
> +   .root = NULL,   \
> +   .depth = 0, \
> +   }   \
> +   }
> +
> +/*
> + * We use a 0 size array to stash the type we're storing without taking any
> + * space at runtime - then the various accessor macros can use typeof() to get
> + * to it for casts/sizeof - we also force the alignment so that storing a type
> + * with a ridiculous alignment doesn't blow up the alignment or size of the
> + * genradix.
> + */
> +
> +#define GENRADIX(_type)\
> +struct {   \
> +   struct __genradix   tree;   \
> +   _type   type[0] __aligned(1);   \
> +}
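
For anyone following along, the zero-size-array trick described in the comment
above can be tried outside the kernel. What follows is only a userspace sketch,
not code from the patch: TYPED_BOX, box_entry_size, box_cast, struct box_core
and struct big are made-up names, it relies on GNU C extensions (zero-length
arrays and __typeof__), so it assumes gcc or clang, and it leaves out the
forced-alignment part of the macro.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct __genradix: the untyped part every wrapper shares. */
struct box_core {
	void	*root;
	size_t	depth;
};

/*
 * Hypothetical TYPED_BOX(), modelled on GENRADIX(): the zero-length array
 * records _type so that sizeof and __typeof__ can recover it later, but it
 * occupies no storage in the wrapper.
 */
#define TYPED_BOX(_type)		\
struct {				\
	struct box_core	core;		\
	_type		type[0];	\
}

/* Size of the stored type, recovered from the wrapper alone. */
#define box_entry_size(_box)	sizeof((_box)->type[0])

/* Cast an untyped pointer to a pointer to the stored type. */
#define box_cast(_box, _ptr)	((__typeof__((_box)->type[0]) *)(_ptr))

/* Example entry type (hypothetical). */
struct big {
	char data[100];
};

static TYPED_BOX(struct big) big_box;
static TYPED_BOX(char) char_box;
```

With these definitions box_entry_size(&big_box) is sizeof(struct big), and the
wrapper around char is no larger than struct box_core on its own, which is the
"no space at runtime" property the comment describes.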
> +
> +#define DEFINE_GENRADIX(_name, _type)  \
> +   GENRADIX(_type) _name = __GENRADIX_INITIALIZER
> +
> +/**
> + * genradix_init - initialize a genradix
> + * @_radix:genradix to initialize
> + *
> + * Does not fail
> + */
> +#define genradix_init(_radix)  \
> +do {   \
> +   *(_radix) = (typeof(*_radix)) __GENRADIX_INITIALIZER;   \
> +} while (0)
> +
> +void __genradix_free(struct __genradix *);
> +
> +/**
> + * genradix_free: free all memory owned by a genradix
> + *
> + * After freeing, @_radix will be reinitialized and empty
> + */
> +#define genradix_free(_radix)  __genradix_free(&(_radix)->tree)
> +
> +static inline size_t __idx_to_offset(size_t idx, size_t obj_size)
> +{
> +   if (__builtin_constant_p(obj_size))
> +   BUILD_BUG_ON(obj_size > PAGE_SIZE);
> +   else
> +   BUG_ON(obj_size > PAGE_SIZE);
> +
> +   if (!is_power_of_2(obj_size)) {
> +   size_t objs_per_page = PAGE_SIZE / obj_size;
> +
> +   return (idx / objs_per_page) * PAGE_SIZE +
> +   (idx % objs_per_page) * obj_size;
> +   } else {
> +   return idx * obj_size;
> +   }
> +}
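
To make the two branches above concrete, here is a small userspace model of
the index-to-offset calculation. It is a sketch under assumptions, not the
patch itself: the model_ names are made up, and MODEL_PAGE_SIZE is fixed at
4096 where the kernel would use the architecture's PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define MODEL_PAGE_SIZE ((size_t)4096)

static int model_is_power_of_2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Userspace model of __idx_to_offset() from the quoted patch. */
static size_t model_idx_to_offset(size_t idx, size_t obj_size)
{
	assert(obj_size != 0 && obj_size <= MODEL_PAGE_SIZE);

	if (!model_is_power_of_2(obj_size)) {
		/* Pack whole objects into each page and skip the leftover tail. */
		size_t objs_per_page = MODEL_PAGE_SIZE / obj_size;

		return (idx / objs_per_page) * MODEL_PAGE_SIZE +
			(idx % objs_per_page) * obj_size;
	}

	/* Power-of-two sizes divide the page evenly, so no padding is needed. */
	return idx * obj_size;
}

/* Nonzero if an object of obj_size bytes at offset crosses a page boundary. */
static int model_straddles_page(size_t offset, size_t obj_size)
{
	return offset / MODEL_PAGE_SIZE !=
		(offset + obj_size - 1) / MODEL_PAGE_SIZE;
}
```

With 24-byte entries, 170 fit in a 4096-byte page and 16 bytes are left
unused, so index 170 maps to byte 4096 rather than 4080. A plain
idx * obj_size would have put that entry across the page boundary.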
> +
> +#define __genradix_cast(_radix) 

[PATCH] arm64: dts: hikey: Fix eMMC corruption regression

2018-05-25 Thread John Stultz
This patch is a partial revert of commit
abd7d0972a19 ("arm64: dts: hikey: Enable HS200 mode on eMMC")

which has been causing eMMC corruption on my HiKey board.

Symptoms usually looked like:

mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
...
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc0: new HS200 MMC card at address 0001
...
dwmmc_k3 f723d000.dwmmc0: Unexpected command timeout, state 3
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
print_req_error: I/O error, dev mmcblk0, sector 8810504
Aborting journal on device mmcblk0p10-8.
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
mmc_host mmc0: Bus speed (slot 0) = 2480Hz (slot req 40Hz, actual 40HZ div = 31)
mmc_host mmc0: Bus speed (slot 0) = 14880Hz (slot req 15000Hz, actual 14880HZ div = 0)
EXT4-fs error (device mmcblk0p10): ext4_journal_check_start:61: Detected aborted journal
EXT4-fs (mmcblk0p10): Remounting filesystem read-only

And quite often this would result in a disk that wouldn't properly
boot even with older kernels.

It seems the max-frequency property added by the above patch is
causing the problem, so remove it.

Cc: Ryan Grachek 
Cc: Wei Xu 
Cc: Arnd Bergmann 
Cc: Ulf Hansson 
Cc: YongQin Liu 
Cc: Leo Yan 
Signed-off-by: John Stultz 
---
 arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
index 724a0d3..edb4ee0 100644
--- a/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
+++ b/arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts
@@ -299,7 +299,6 @@
/* GPIO blocks 16 thru 19 do not appear to be routed to pins */
 
dwmmc_0: dwmmc0@f723d000 {
-   max-frequency = <15000>;
cap-mmc-highspeed;
mmc-hs200-1_8v;
non-removable;
-- 
2.7.4



Re: [PATCH v2] gpu: drm: gma500: Change return type to vm_fault_t

2018-05-25 Thread kbuild test robot
Hi Souptick,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on v4.17-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Souptick-Joarder/gpu-drm-gma500-Change-return-type-to-vm_fault_t/20180526-084629
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: x86_64-randconfig-x013-201820 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/gpu//drm/gma500/framebuffer.c: In function 'psbfb_vm_fault':
>> drivers/gpu//drm/gma500/framebuffer.c:143:9: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
 return ret;
^~~

vim +/ret +143 drivers/gpu//drm/gma500/framebuffer.c

   113  
   114  static vm_fault_t psbfb_vm_fault(struct vm_fault *vmf)
   115  {
   116  struct vm_area_struct *vma = vmf->vma;
   117  struct psb_framebuffer *psbfb = vma->vm_private_data;
   118  struct drm_device *dev = psbfb->base.dev;
   119  struct drm_psb_private *dev_priv = dev->dev_private;
   120  int page_num;
   121  int i;
   122  unsigned long address;
   123  vm_fault_t ret;
   124  unsigned long pfn;
   125  unsigned long phys_addr = (unsigned long)dev_priv->stolen_base +
   126psbfb->gtt->offset;
   127  
   128  page_num = vma_pages(vma);
   129  address = vmf->address - (vmf->pgoff << PAGE_SHIFT);
   130  
   131  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
   132  
   133  for (i = 0; i < page_num; i++) {
   134  pfn = (phys_addr >> PAGE_SHIFT);
   135  
   136  ret = vmf_insert_mixed(vma, address,
   137  __pfn_to_pfn_t(pfn, PFN_DEV));
   138  if (unlikely(ret & VM_FAULT_ERROR))
   139  break;
   140  address += PAGE_SIZE;
   141  phys_addr += PAGE_SIZE;
   142  }
 > 143  return ret;
   144  }
   145  
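
The warning points at the case where page_num is 0: the loop body never runs,
so ret is returned without ever being assigned. One common way to address this
is to give ret a defined starting value. The sketch below is a userspace model
only: the model_ names and the two fault codes are made up, and treating the
"no pages" case as a bus error is an assumption, not something the report or
the patch states.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* return codes. */
enum model_fault {
	MODEL_FAULT_NOPAGE = 0,	/* page inserted successfully */
	MODEL_FAULT_SIGBUS = 1,	/* insertion failed */
};

/* Stand-in for vmf_insert_mixed(): fails at page index fail_at (-1 = never). */
static enum model_fault model_insert(int i, int fail_at)
{
	return i == fail_at ? MODEL_FAULT_SIGBUS : MODEL_FAULT_NOPAGE;
}

/* Same loop shape as psbfb_vm_fault(), with ret given a defined start value. */
static enum model_fault model_fault_handler(int page_num, int fail_at)
{
	/* Assumed default: report an error when there is nothing to map. */
	enum model_fault ret = MODEL_FAULT_SIGBUS;
	int i;

	assert(page_num >= 0);

	for (i = 0; i < page_num; i++) {
		ret = model_insert(i, fail_at);
		if (ret != MODEL_FAULT_NOPAGE)
			break;
	}
	return ret;
}
```

With the initializer in place the zero-page case returns a defined value, and
the other paths behave as before: the first failing insertion ends the loop,
otherwise the result of the last insertion is returned.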

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 12/14] documentation: fpga: move fpga-region.txt to driver-api

2018-05-25 Thread Randy Dunlap
On 05/16/2018 04:50 PM, Alan Tull wrote:
> Move Documentation/fpga/fpga-region.txt to
> driver-api/fpga/fpga-region.rst.  Including:
>  - Add it to driver-api/fpga/index.rst
>  - Formatting changes to build cleanly as ReST documentation
>  - Some rewrites for better flow as a ReST doc such as moving
>API reference to the end of the doc
>  - Rewrite API reference section to refer to kernel-doc
>documentation in fpga-region.c driver code
> 
> Signed-off-by: Alan Tull 
> ---
>  Documentation/driver-api/fpga/fpga-region.rst | 102 
> ++
>  Documentation/driver-api/fpga/index.rst   |   1 +
>  Documentation/fpga/fpga-region.txt|  94 
>  3 files changed, 103 insertions(+), 94 deletions(-)
>  create mode 100644 Documentation/driver-api/fpga/fpga-region.rst
>  delete mode 100644 Documentation/fpga/fpga-region.txt
> 
> diff --git a/Documentation/driver-api/fpga/fpga-region.rst b/Documentation/driver-api/fpga/fpga-region.rst
> new file mode 100644
> index 000..f89e4a3
> --- /dev/null
> +++ b/Documentation/driver-api/fpga/fpga-region.rst
> @@ -0,0 +1,102 @@
> +FPGA Region
> +===
> +
> +Overview
> +
> +
> +This document is meant to be an brief overview of the FPGA region API usage.  A

a brief overview

> +more conceptual look at regions can be found in the Device Tree binding
> +document [#f1]_.
> +
> +For the purposes of this API document, let's just say that a region associates
> +an FPGA Manager and a bridge (or bridges) with a reprogrammable region of an
> +FPGA or the whole FPGA.  The API provides a way to register a region and to
> +program a region.
> +
> +Currently the only layer above fpga-region.c in the kernel is the Device Tree
> +support (of-fpga-region.c) described in [#f1]_.  The DT support layer uses regions
> +to program the FPGA and then DT to handle enumeration.  The common region code
> +is intended to be used by other schemes that have other ways of accomplishing
> +enumeration after programming.
> +
> +An fpga-region can be set up to know the following things:
> +
> + * which FPGA manager to use to do the programming
> +
> + * which bridges to disable before programming and enable afterwards.
> +
> +Additional info needed to program the FPGA image is passed in the struct
> +fpga_image_info including:
> +
> + * pointers to the image as either a scatter-gather buffer, a contiguous
> +   buffer, or the name of firmware file
> +
> + * flags indicating specifics such as whether the image if for partial

   is for

> +   reconfiguration.
> +
> +How to program a FPGA using a region

  an FPGA

> +
> +
> +First, allocate the info struct::
> +
> + info = fpga_image_info_alloc(dev);
> + if (!info)
> + return -ENOMEM;
> +
> +Set flags as needed, i.e.::
> +
> + info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
> +
> +Point to your FPGA image, such as::
> +
> + info->sgt = 
> +
> +Add info to region and do the programming::
> +
> + region->info = info;
> + ret = fpga_region_program_fpga(region);
> +
> +:c:func:`fpga_region_program_fpga()` operates on info passed in the
> +fpga_image_info (region->info).  This function will attempt to:
> +
> + * lock the region's mutex
> + * lock the region's FPGA manager
> + * build a list of FPGA bridges if a method has been specified to do so
> + * disable the bridges
> + * program the FPGA
> + * re-enable the bridges
> + * release the locks
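
The ordering in the list above can be modelled in a few lines of userspace C.
This is only an illustration of the sequence as the doc describes it, not the
driver's code: struct model_region and model_program_region are made-up names,
each step is reduced to a flag, and error handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a region; not the kernel's struct fpga_region. */
struct model_region {
	bool	region_locked;
	bool	mgr_locked;
	bool	have_bridges;
	bool	bridges_enabled;
	bool	programmed;
	int	steps;		/* number of steps completed so far */
};

/* Walk through the documented steps in order (success path only). */
static int model_program_region(struct model_region *r)
{
	r->steps = 0;

	r->region_locked = true;	/* lock the region's mutex */
	r->steps++;
	r->mgr_locked = true;		/* lock the region's FPGA manager */
	r->steps++;
	r->have_bridges = true;		/* build the list of bridges */
	r->steps++;
	r->bridges_enabled = false;	/* disable the bridges */
	r->steps++;

	/* Both locks are held and the bridges are off while the image is written. */
	assert(r->region_locked && r->mgr_locked && !r->bridges_enabled);
	r->programmed = true;		/* program the FPGA */
	r->steps++;

	r->bridges_enabled = true;	/* re-enable the bridges */
	r->steps++;
	r->mgr_locked = false;		/* release the locks */
	r->region_locked = false;
	r->steps++;

	return 0;
}
```

The point the model makes is that programming happens with both locks held and
the bridges disabled, and that the bridges come back before the locks are
dropped.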
> +
> +Then you will want to enumerate whatever hardware has appeared in the FPGA.
> +
> +How to add a new FPGA region
> +
> +
> +An example of usage can be seen in the probe function of [#f2]_.
> +
> +.. [#f1] ../devicetree/bindings/fpga/fpga-region.txt
> +.. [#f2] ../../drivers/fpga/of-fpga-region.c
> +
> +API to program a FGPA
> +-
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_program_fpga
> +
> +API to add a new FPGA region
> +
> +
> +.. kernel-doc:: include/linux/fpga/fpga-region.h
> +   :functions: fpga_region
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_create
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_free
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_register
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_unregister


-- 
~Randy


Re: [PATCH 12/14] documentation: fpga: move fpga-region.txt to driver-api

2018-05-25 Thread Randy Dunlap
On 05/16/2018 04:50 PM, Alan Tull wrote:
> Move Documentation/fpga/fpga-region.txt to
> driver-api/fpga/fpga-region.rst.  Including:
>  - Add it to driver-api/fpga/index.rst
>  - Formatting changes to build cleanly as ReST documentation
>  - Some rewrites for better flow as a ReST doc such as moving
>API reference to the end of the doc
>  - Rewrite API reference section to refer to kernel-doc
>documentation in fpga-region.c driver code
> 
> Signed-off-by: Alan Tull 
> ---
>  Documentation/driver-api/fpga/fpga-region.rst | 102 
> ++
>  Documentation/driver-api/fpga/index.rst   |   1 +
>  Documentation/fpga/fpga-region.txt|  94 
>  3 files changed, 103 insertions(+), 94 deletions(-)
>  create mode 100644 Documentation/driver-api/fpga/fpga-region.rst
>  delete mode 100644 Documentation/fpga/fpga-region.txt
> 
> diff --git a/Documentation/driver-api/fpga/fpga-region.rst 
> b/Documentation/driver-api/fpga/fpga-region.rst
> new file mode 100644
> index 000..f89e4a3
> --- /dev/null
> +++ b/Documentation/driver-api/fpga/fpga-region.rst
> @@ -0,0 +1,102 @@
> +FPGA Region
> +===
> +
> +Overview
> +
> +
> +This document is meant to be an brief overview of the FPGA region API usage. 
>  A

a brief overview

> +more conceptual look at regions can be found in the Device Tree binding
> +document [#f1]_.
> +
> +For the purposes of this API document, let's just say that a region 
> associates
> +an FPGA Manager and a bridge (or bridges) with a reprogrammable region of an
> +FPGA or the whole FPGA.  The API provides a way to register a region and to
> +program a region.
> +
> +Currently the only layer above fpga-region.c in the kernel is the Device Tree
> +support (of-fpga-region.c) described in [#f1]_.  The DT support layer uses 
> regions
> +to program the FPGA and then DT to handle enumeration.  The common region 
> code
> +is intended to be used by other schemes that have other ways of accomplishing
> +enumeration after programming.
> +
> +An fpga-region can be set up to know the following things:
> +
> + * which FPGA manager to use to do the programming
> +
> + * which bridges to disable before programming and enable afterwards.
> +
> +Additional info needed to program the FPGA image is passed in the struct
> +fpga_image_info including:
> +
> + * pointers to the image as either a scatter-gather buffer, a contiguous
> +   buffer, or the name of firmware file
> +
> + * flags indicating specifics such as whether the image if for partial

   is for

> +   reconfiguration.
> +
> +How to program a FPGA using a region

  an FPGA

> +
> +
> +First, allocate the info struct::
> +
> + info = fpga_image_info_alloc(dev);
> + if (!info)
> + return -ENOMEM;
> +
> +Set flags as needed, i.e.::
> +
> + info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
> +
> +Point to your FPGA image, such as::
> +
> + info->sgt = 
> +
> +Add info to region and do the programming::
> +
> + region->info = info;
> + ret = fpga_region_program_fpga(region);
> +
> +:c:func:`fpga_region_program_fpga()` operates on info passed in the
> +fpga_image_info (region->info).  This function will attempt to:
> +
> + * lock the region's mutex
> + * lock the region's FPGA manager
> + * build a list of FPGA bridges if a method has been specified to do so
> + * disable the bridges
> + * program the FPGA
> + * re-enable the bridges
> + * release the locks
> +
> +Then you will want to enumerate whatever hardware has appeared in the FPGA.
> +
> +How to add a new FPGA region
> +
> +
> +An example of usage can be seen in the probe function of [#f2]_.
> +
> +.. [#f1] ../devicetree/bindings/fpga/fpga-region.txt
> +.. [#f2] ../../drivers/fpga/of-fpga-region.c
> +
> +API to program a FGPA
> +-
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_program_fpga
> +
> +API to add a new FPGA region
> +
> +
> +.. kernel-doc:: include/linux/fpga/fpga-region.h
> +   :functions: fpga_region
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_create
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_free
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_register
> +
> +.. kernel-doc:: drivers/fpga/fpga-region.c
> +   :functions: fpga_region_unregister
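
As a reading aid for the step list quoted above, here is a minimal userspace sketch of the order in which fpga_region_program_fpga() is described as working. Everything in it (struct mock_region, mock_region_program(), the step names written to the log) is hypothetical and only records the sequence; it is not the kernel implementation. Skipping the bridge re-enable when programming fails is my assumption, not something the quoted text states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mock, not the kernel's struct fpga_region. */
struct mock_region {
	int has_get_bridges;	/* a method to build the bridge list was given */
	int program_fails;	/* force the "program" step to fail */
	char log[128];		/* comma-separated order of the steps that ran */
};

/* Append one step name to the region's log. */
static void note(struct mock_region *r, const char *step)
{
	size_t used = strlen(r->log);

	snprintf(r->log + used, sizeof(r->log) - used, "%s%s",
		 used ? "," : "", step);
}

/* Walk through the documented sequence; returns 0 or a negative value. */
int mock_region_program(struct mock_region *r)
{
	int ret = 0;

	assert(r != NULL);
	r->log[0] = '\0';

	note(r, "lock-region");
	note(r, "lock-mgr");
	if (r->has_get_bridges)
		note(r, "get-bridges");
	note(r, "disable-bridges");

	note(r, "program");
	if (r->program_fails)
		ret = -1;	/* assumption: bridges stay disabled on failure */
	else
		note(r, "enable-bridges");

	note(r, "unlock");	/* locks are released on both paths */
	return ret;
}
```

A caller can compare the log against the documented order to check that no step ran out of sequence.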


-- 
~Randy


Re: [PATCH 10/14] documentation: fpga: move fpga-mgr.txt to driver-api

2018-05-25 Thread Randy Dunlap
On 05/16/2018 04:50 PM, Alan Tull wrote:
> Move Documentation/fpga/fpga-mgr.txt to driver-api/fpga/fpga-mgr.rst
> and:
>  - Add to driver-api/fpga/index.rst
>  - Format changes so documentation builds cleanly.
>  - Minor rewrites that make the doc flow better as ReST documentation.
>- Such as moving API reference to end of doc
>  - Change API reference section to refer to kernel-doc documentation in
>fpga-mgr.c driver code rather than statically defining each function.
> 
> Signed-off-by: Alan Tull 
> ---
>  Documentation/driver-api/fpga/fpga-mgr.rst | 220 
> +
>  Documentation/driver-api/fpga/index.rst|   1 +
>  Documentation/fpga/fpga-mgr.txt| 218 
>  3 files changed, 221 insertions(+), 218 deletions(-)
>  create mode 100644 Documentation/driver-api/fpga/fpga-mgr.rst
>  delete mode 100644 Documentation/fpga/fpga-mgr.txt
> 
> diff --git a/Documentation/driver-api/fpga/fpga-mgr.rst b/Documentation/driver-api/fpga/fpga-mgr.rst
> new file mode 100644
> index 000..bcf2dd2
> --- /dev/null
> +++ b/Documentation/driver-api/fpga/fpga-mgr.rst
> @@ -0,0 +1,220 @@
> +FPGA Manager
> +
> +
> +Overview
> +
> +
> +The FPGA manager core exports a set of functions for programming an FPGA with
> +an image.  The API is manufacturer agnostic.  All manufacturer specifics are
> +hidden away in a low level driver which registers a set of ops with the core.
> +The FPGA image data itself is very manufacturer specific, but for our purposes
> +it's just binary data.  The FPGA manager core won't parse it.
> +
> +The FPGA image to be programmed can be in a scatter gather list, a single
> +contiguous buffer, or a firmware file.  Because allocating contiguous kernel
> +memory for the buffer should be avoided, users are encouraged to use a scatter
> +gather list instead if possible.
> +
> +The particulars for programming the image are presented in a structure (struct
> +fpga_image_info).  This struct contains parameters such as pointers to the
> +FPGA image as well as image-specific particulars such as whether the image was
> +built for full or partial reconfiguration.
> +
> +How to support a new FPGA device
> +
> +
> +To add another FPGA manager, write a driver that implements a set of ops.  The
> +probe function calls fpga_mgr_register(), such as::
> +
> +	static const struct fpga_manager_ops socfpga_fpga_ops = {
> +		.write_init = socfpga_fpga_ops_configure_init,
> +		.write = socfpga_fpga_ops_configure_write,
> +		.write_complete = socfpga_fpga_ops_configure_complete,
> +		.state = socfpga_fpga_ops_state,
> +	};
> +
> +	static int socfpga_fpga_probe(struct platform_device *pdev)
> +	{
> +		struct device *dev = &pdev->dev;
> +		struct socfpga_fpga_priv *priv;
> +		struct fpga_manager *mgr;
> +		int ret;
> +
> +		priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +		if (!priv)
> +			return -ENOMEM;
> +
> +		/*
> +		 * do ioremaps, get interrupts, etc. and save
> +		 * them in priv
> +		 */
> +
> +		mgr = fpga_mgr_create(dev, "Altera SOCFPGA FPGA Manager",
> +				      &socfpga_fpga_ops, priv);
> +		if (!mgr)
> +			return -ENOMEM;
> +
> +		platform_set_drvdata(pdev, mgr);
> +
> +		ret = fpga_mgr_register(mgr);
> +		if (ret)
> +			fpga_mgr_free(mgr);
> +
> +		return ret;
> +	}
> +
> +	static int socfpga_fpga_remove(struct platform_device *pdev)
> +	{
> +		struct fpga_manager *mgr = platform_get_drvdata(pdev);
> +
> +		fpga_mgr_unregister(mgr);
> +
> +		return 0;
> +	}
> +
> +
> +The ops will implement whatever device specific register writes are needed to
> +do the programming sequence for this particular FPGA.  These ops return 0 for
> +success or negative error codes otherwise.
> +
> +The programming sequence is::
> + 1. .write_init
> + 2. .write or .write_sg (may be called once or multiple times)
> + 3. .write_complete
> +
> +The .write_init function will prepare the FPGA to receive the image data.  The
> +buffer passed into .write_init will be atmost .initial_header_size bytes long,

                                          at most                           long;

> +if the whole bitstream is not immediately available then the core code will
> +buffer up at least this much before starting.
> +
> +The .write function writes a buffer to the FPGA. The buffer may be contain the
> +whole FPGA image or may be a smaller chunk of an FPGA image.  In the latter
> +case, this function is called multiple times for successive chunks. This interface
> +is suitable for drivers which use PIO.
> +
> +The .write_sg 
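
To make the quoted programming sequence concrete, here is a small userspace sketch of a core driving the three ops in order: .write_init sees at most .initial_header_size bytes, .write is called once per chunk, and .write_complete runs last. The names mock_mgr_ops, mock_mgr_load(), mock_dev and the dev_* callbacks are hypothetical stand-ins, not the fpga-mgr API, and passing the full image (header included) to .write is my assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the ops described above. */
struct mock_mgr_ops {
	size_t initial_header_size;
	int (*write_init)(void *priv, const char *buf, size_t count);
	int (*write)(void *priv, const char *buf, size_t count);
	int (*write_complete)(void *priv);
};

/* Drive the ops in the documented order, feeding .write in chunks. */
int mock_mgr_load(const struct mock_mgr_ops *ops, void *priv,
		  const char *image, size_t size, size_t chunk)
{
	size_t header = size < ops->initial_header_size ?
			size : ops->initial_header_size;
	int ret;

	assert(chunk > 0);

	/* 1. .write_init gets at most initial_header_size bytes */
	ret = ops->write_init(priv, image, header);
	if (ret)
		return ret;

	/* 2. .write is called once per chunk of the image */
	for (size_t off = 0; off < size; off += chunk) {
		size_t n = size - off < chunk ? size - off : chunk;

		ret = ops->write(priv, image + off, n);
		if (ret)
			return ret;
	}

	/* 3. .write_complete finishes the sequence */
	return ops->write_complete(priv);
}

/* A recording "device" so the call pattern can be inspected. */
struct mock_dev {
	size_t header_seen;
	size_t written;
	int writes;
	int done;
};

static int dev_write_init(void *priv, const char *buf, size_t count)
{
	struct mock_dev *d = priv;

	(void)buf;
	d->header_seen = count;
	return 0;
}

static int dev_write(void *priv, const char *buf, size_t count)
{
	struct mock_dev *d = priv;

	(void)buf;
	d->written += count;
	d->writes++;
	return 0;
}

static int dev_write_complete(void *priv)
{
	struct mock_dev *d = priv;

	d->done = 1;
	return 0;
}

const struct mock_mgr_ops mock_ops = {
	.initial_header_size = 4,
	.write_init = dev_write_init,
	.write = dev_write,
	.write_complete = dev_write_complete,
};
```

As in the quoted text, each callback returns 0 for success or a negative value otherwise, and the sketch stops at the first failure.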

Re: [PATCH net-next] bpfilter: fix a build err

2018-05-25 Thread YueHaibing
On 2018/5/26 0:19, Alexei Starovoitov wrote:
> On Fri, May 25, 2018 at 06:17:57PM +0800, YueHaibing wrote:
>> gcc-7.3.0 report following err:
>>
>>   HOSTCC  net/bpfilter/main.o
>> In file included from net/bpfilter/main.c:9:0:
>> ./include/uapi/linux/bpf.h:12:10: fatal error: linux/bpf_common.h: No such file or directory
>>  #include 
>>
>> remove it by adding a include path.
>> Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
>>
>> Signed-off-by: YueHaibing 
>> ---
>>  net/bpfilter/Makefile | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
>> index 2af752c..3f3cb87 100644
>> --- a/net/bpfilter/Makefile
>> +++ b/net/bpfilter/Makefile
>> @@ -5,7 +5,7 @@
>>  
>>  hostprogs-y := bpfilter_umh
>>  bpfilter_umh-objs := main.o
>> -HOSTCFLAGS += -I. -Itools/include/
>> +HOSTCFLAGS += -I. -Itools/include/ -Itools/include/uapi
> 
> Strangely I don't see this error with gcc 7.3
> I've tried this patch and it doesn't hurt,
> but before it gets applied could you please try
> the top two patches from this tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf.git/?h=ipt_bpf
> in your environment?
> These two patches add the actual meat of bpfilter and I'd like
> to make sure the build setup is good for everyone before
> we proceed too far.

After applying these two patches on net-next, the error is still there:
 bpfilter: rough bpfilter codegen example hack
 bpfilter: add iptable get/set parsing

  HOSTCC  net/bpfilter/main.o
In file included from net/bpfilter/main.c:13:0:
./include/uapi/linux/bpf.h:12:10: fatal error: linux/bpf_common.h: No such file or directory
 #include 
  ^~~~
compilation terminated.
make[2]: *** [net/bpfilter/main.o] Error 1
make[1]: *** [net/bpfilter] Error 2
make: *** [net] Error 2

I also compiled your tree; the error is the same.

My gcc version info is as follows:
[root@localhost net-next]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/yuehb/gcc-7.3.0-tools/libexec/gcc/x86_64-pc-linux-gnu/7.3.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-7.3.0/configure --enable-checking=release --enable-languages=c,c++ --disable-multilib --prefix=/home/yuehb/gcc-7.3.0-tools
Thread model: posix
gcc version 7.3.0 (GCC)

> 
> 
> .
> 



Re: [PATCH 09/14] Documentation: fpga: move fpga overview to driver-api

2018-05-25 Thread Randy Dunlap
On 05/16/2018 04:50 PM, Alan Tull wrote:
> Start of moving Documentation/fpga/*.txt to driver-api, including:
>  - Add new directory driver-api/fpga
>  - Add new file driver-api/fpga/index.rst
>  - Add driver-api/fpga to driver-api/index.rst
>  - Move Documentation/fpga/overview.txt to driver-api/fpga/intro.rst
>  - Formatting and rewrites so that intro.rst will build cleanly
>and form a good introduction to the rest of the docs to be added.
> 
> Signed-off-by: Alan Tull 
> ---
>  Documentation/driver-api/fpga/index.rst | 10 ++
>  Documentation/driver-api/fpga/intro.rst | 54 
> +
>  Documentation/driver-api/index.rst  |  1 +
>  Documentation/fpga/overview.txt | 23 --
>  4 files changed, 65 insertions(+), 23 deletions(-)
>  create mode 100644 Documentation/driver-api/fpga/index.rst
>  create mode 100644 Documentation/driver-api/fpga/intro.rst
>  delete mode 100644 Documentation/fpga/overview.txt

Hi,
Just a few comments for you.

> diff --git a/Documentation/driver-api/fpga/intro.rst b/Documentation/driver-api/fpga/intro.rst
> new file mode 100644
> index 000..51cd81d
> --- /dev/null
> +++ b/Documentation/driver-api/fpga/intro.rst
> @@ -0,0 +1,54 @@
> +Introduction
> +
> +
> +The FPGA subsystem supports reprogramming FPGAs dynamically under
> +Linux.  Some of the core intentions of the FPGA subsystems are:
> +
> +* The FPGA subsystem is vendor agnostic.
> +
> +* The FPGA subsystem separates upper layers (userspace interfaces and
> +  enumeration) from lower layers that know how to program a specific
> +  FPGA.
> +
> +* Code should not be shared between upper and lower layers.  This
> +  should go without saying.  If that seems necessary, there's probably
> +  framework functionality that that can be added that will benefit

 that  can be

> +  other users.  Write the linux-fpga mailing list and maintainers and
> +  seek out a solution that expands the framework for broad reuse.

  choose one spelling:v

> +* Generally, when adding code, think of the future.  Plan for re-use.
> +
> +The framework in the kernel is divided into:
> +
> +FPGA Manager
> +
> +
> +If you are adding a new FPGA or a new method of programming a FPGA,
> +this is the subsystem for you.  Low level FPGA manager drivers contain
> +the knowledge of how to program a specific device.  This subsystem
> +includes the framework in fpga-mgr.c and the low level drivers that
> +are registered with it.
> +
> +FPGA Bridge
> +---
> +
> +FPGA Bridges prevent spurious signals from going out of a FPGA or a

of an

> +region of a FPGA during programming.  They are disabled before

  of an 

> +programming begins and re-enabled afterwards.  An FPGA bridge may be
> +actual hard hardware that gates a bus to a cpu or a soft ("freeze")

  CPU

> +bridge in FPGA fabric that surrounds a partial reconfiguration region
> +of an FPGA.  This subsystem includes fpga-bridge.c and the low level
> +drivers that are registered with it.
> +
> +FPGA Region
> +---
> +
> +If you are adding a new interface to the FPGA framework, add it on top
> +of a FPGA region to allow the most reuse of your interface.

   of an

> +
> +The FPGA Region framework (fpga-region.c) associates managers and
> +bridges as reconfigurable regions.  A region may refer to the whole
> +FPGA in full reconfiguration or to a partial reconfiguration region.
> +
> +The Device Tree FPGA Region support (of-fpga-region.c) handles
> +reprogramming FPGAs when device tree overlays are applied.



-- 
~Randy


Re: [PATCH v4 19/31] Documentation: kconfig: document a new Kconfig macro language

2018-05-25 Thread Randy Dunlap
On 05/16/2018 11:16 PM, Masahiro Yamada wrote:
> Add a document for the macro language introduced to Kconfig.
> 
> The motivation of this work is to move the compiler option tests to
> Kconfig from Makefile.  A number of kernel features require the
> compiler support.  Enabling such features blindly in Kconfig ends up
> with a lot of nasty build-time testing in Makefiles.  If a chosen
> feature turns out unsupported by the compiler, what the build system
> can do is either to disable it (silently!) or to forcibly break the
> build, despite Kconfig has let the user to enable it.  By moving the
> compiler capability tests to Kconfig, features unsupported by the
> compiler will be hidden automatically.
> 
> This change was strongly prompted by Linus Torvalds.  You can find
> his suggestions [1] [2] in ML.  The original idea was to add a new
> attribute with 'option shell=...', but I found more generalized text
> expansion would make Kconfig more powerful and lovely.  The basic
> ideas are from Make, but there are some differences.
> 
> [1]: https://lkml.org/lkml/2016/12/9/577
> [2]: https://lkml.org/lkml/2018/2/7/527
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
> Changes in v4:
>  - Update according to the syntax change
> 
> Changes in v3:
>  - Newly added
> 
> Changes in v2: None
> 
>  Documentation/kbuild/kconfig-macro-language.txt | 252 
> 
>  MAINTAINERS |   2 +-
>  2 files changed, 253 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/kbuild/kconfig-macro-language.txt
> 
> diff --git a/Documentation/kbuild/kconfig-macro-language.txt b/Documentation/kbuild/kconfig-macro-language.txt
> new file mode 100644
> index 000..a8dc792
> --- /dev/null
> +++ b/Documentation/kbuild/kconfig-macro-language.txt
> @@ -0,0 +1,252 @@
> +Concept
> +---
> +
> +The basic idea was inspired by Make. When we look at Make, we notice sort of
> +two languages in one. One language describes dependency graphs consisting of
> +targets and prerequisites. The other is a macro language for performing textual
> +substitution.
> +
> +There is clear distinction between the two language stages. For example, you
> +can write a makefile like follows:
> +
> +APP := foo
> +SRC := foo.c
> +CC := gcc
> +
> +$(APP): $(SRC)
> +$(CC) -o $(APP) $(SRC)
> +
> +The macro language replaces the variable references with their expanded form,
> +and handles as if the source file were input like follows:
> +
> +foo: foo.c
> +gcc -o foo foo.c
> +
> +Then, Make analyzes the dependency graph and determines the targets to be
> +updated.
> +
> +The idea is quite similar in Kconfig - it is possible to describe a Kconfig
> +file like this:
> +
> +CC := gcc
> +
> +config CC_HAS_FOO
> +def_bool $(shell, $(srctree)/scripts/gcc-check-foo.sh $(CC))
> +
> +The macro language in Kconfig processes the source file into the following
> +intermediate:
> +
> +config CC_HAS_FOO
> +def_bool y
> +
> +Then, Kconfig moves onto the evaluation stage to resolve inter-symbol
> +dependency as explained in kconfig-language.txt.
> +
> +
> +Variables
> +-
> +
> +Like in Make, a variable in Kconfig works as a macro variable.  A macro
> +variable is expanded "in place" to yield a text string that may then be
> +expanded further. To get the value of a variable, enclose the variable name in
> +$( ). The parentheses are required even for single-letter variable names; $X is
> +a syntax error. The curly brace form as in ${CC} is not supported either.
> +
> +There are two types of variables: simply expanded variables and recursively
> +expanded variables.
> +
> +A simply expanded variable is defined using the := assignment operator. Its
> +righthand side is expanded immediately upon reading the line from the Kconfig
> +file.
> +
> +A recursively expanded variable is defined using the = assignment operator.
> +Its righthand side is simply stored as the value of the variable without
> +expanding it in any way. Instead, the expansion is performed when the variable
> +is used.
> +
> +There is another type of assignment operator; += is used to append text to a
> +variable. The righthand side of += is expanded immediately if the lefthand
> +side was originally defined as a simple variable. Otherwise, its evaluation is
> +deferred.
> +
> +The variable reference can take parameters, in the following form:
> +
> +  $(name,arg1,arg2,arg3)
> +
> +You can consider the parameterized reference as a function. (more precisely,
> +"user-defined function" in the contrast to "built-in function" listed below).

   in contrast to

> +
> +Useful functions must be expanded when they are used since the same function is
> +expanded differently if different parameters are passed. Hence, a user-defined
> +function is defined using the = assignment operator. The parameters are
> 
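
The difference between the two variable flavors quoted above is easy to lose in prose, so here is a toy userspace model of it: ":=" expands the right-hand side when the assignment is read, while "=" stores the text and expands it only on use. This is a sketch of the semantics as the quoted document describes them, not Kconfig's implementation; set_var(), expand() and the fixed-size table are hypothetical, and the toy handles neither nested $( ) inside a name nor parameters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model of the quoted semantics; not the Kconfig implementation. */
#define MAX_VARS 16
#define MAX_LEN  256

struct var {
	char name[32];
	char value[MAX_LEN];
};

static struct var vars[MAX_VARS];
static int nr_vars;

/* Return the stored text of a variable, or "" if it is undefined. */
static const char *lookup(const char *name, size_t len)
{
	for (int i = 0; i < nr_vars; i++)
		if (strlen(vars[i].name) == len &&
		    !strncmp(vars[i].name, name, len))
			return vars[i].value;
	return "";
}

/* Expand every $(name) in src into dst, re-expanding the stored text. */
void expand(const char *src, char *dst, size_t size)
{
	size_t out = 0;

	assert(size > 0);
	while (*src && out + 1 < size) {
		const char *end;

		if (src[0] == '$' && src[1] == '(' &&
		    (end = strchr(src, ')'))) {
			char tmp[MAX_LEN];
			int n;

			expand(lookup(src + 2, (size_t)(end - (src + 2))),
			       tmp, sizeof(tmp));
			n = snprintf(dst + out, size - out, "%s", tmp);
			out += (size_t)n < size - out ? (size_t)n
						      : size - out - 1;
			src = end + 1;
		} else {
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
}

/*
 * immediate != 0 models ":=" (expand now),
 * immediate == 0 models "="  (store as-is, expand on use).
 */
int set_var(const char *name, const char *rhs, int immediate)
{
	struct var *v = NULL;
	char buf[MAX_LEN];

	if (immediate)
		expand(rhs, buf, sizeof(buf));
	else
		snprintf(buf, sizeof(buf), "%s", rhs);

	for (int i = 0; i < nr_vars; i++)
		if (!strcmp(vars[i].name, name))
			v = &vars[i];
	if (!v) {
		if (nr_vars == MAX_VARS)
			return -1;	/* table full */
		v = &vars[nr_vars++];
		snprintf(v->name, sizeof(v->name), "%s", name);
	}
	snprintf(v->value, sizeof(v->value), "%s", buf);
	return 0;
}
```

Defining one variable with each flavor from the same right-hand side and then reassigning CC shows the split: the simply expanded one keeps the old compiler name, the recursively expanded one picks up the new one.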

[PATCH v2] ARM: dts: imx51-zii-rdu1: Make sure SD1_WP is low

2018-05-25 Thread Andrey Smirnov
Make sure that MX51_PAD_GPIO1_1 does not remain configured as
ALT0/SD1_WP (its state out of reset). This is needed because of an
external pull-up resistor attached to that pad that, when left
unchanged, will drive SD1_WP high, preventing eSDHC1/eMMC from working
correctly.

To fix that, add a pinmux configuration line configuring the pad to
function as a GPIO. While we are at it, add a corresponding
output-high GPIO hog in an effort to minimize current consumption.

Cc: Nikita Yushchenko 
Cc: Shawn Guo 
Cc: Fabio Estevam 
Cc: Lucas Stach 
Cc: Chris Healy 
Cc: Rob Herring 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrey Smirnov 
---

Changes since [v1]:

 - Switched GPIO hog to be output-high

 - Removed whitespace damage

[v1] lkml.kernel.org/r/20180525030153.15986-1-andrew.smir...@gmail.com

 arch/arm/boot/dts/imx51-zii-rdu1.dts | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/arm/boot/dts/imx51-zii-rdu1.dts b/arch/arm/boot/dts/imx51-zii-rdu1.dts
index df9eca94d812..1e343f35a9d9 100644
--- a/arch/arm/boot/dts/imx51-zii-rdu1.dts
+++ b/arch/arm/boot/dts/imx51-zii-rdu1.dts
@@ -476,6 +476,17 @@
status = "okay";
 };
 
+ {
+   unused-sd3-wp-gpio {
+   /*
+* See pinctrl_esdhc1 below for more details on this
+*/
+   gpio-hog;
+   gpios = <1 GPIO_ACTIVE_HIGH>;
+   output-high;
+   };
+};
+
  {
pinctrl-names = "default";
pinctrl-0 = <_i2c2>;
@@ -660,6 +671,23 @@
MX51_PAD_SD1_DATA1__SD1_DATA1   0x20d5
MX51_PAD_SD1_DATA2__SD1_DATA2   0x20d5
MX51_PAD_SD1_DATA3__SD1_DATA3   0x20d5
+   /*
+* GPIO1_1 is not directly used by eSDHC1 in
+* any capacity, but earlier versions of RDU1
+* used that pin as WP GPIO for eSDHC3 and
+* because of that that pad has an external
+* pull-up resistor. This is problematic
+* because out of reset the pad is configured
+* as ALT0 which serves as SD1_WP, which, when
+* pulled high by an external pull-up, will
+* inhibit execution of any write request to
+* attached eMMC device.
+*
+* To avoid this problem we configure the pad
+* to ALT1/GPIO and avoid driving SD1_WP
+* signal high.
+*/
+   MX51_PAD_GPIO1_1__GPIO1_1   0x
>;
};
 
-- 
2.17.0



[PATCH] rtc: test: remove obsolete .set_mmss

2018-05-25 Thread Alexandre Belloni
There is no point in testing .set_mmss versus .set_mmss64 as they both
take the exact same argument (truncated for set_mmss though).

Signed-off-by: Alexandre Belloni 
---
 drivers/rtc/rtc-test.c | 19 ++-
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/rtc/rtc-test.c b/drivers/rtc/rtc-test.c
index 390f928fd6fc..a0d1571c4af6 100644
--- a/drivers/rtc/rtc-test.c
+++ b/drivers/rtc/rtc-test.c
@@ -13,10 +13,6 @@
 #include 
 #include 
 
-static int test_mmss64;
-module_param(test_mmss64, int, 0644);
-MODULE_PARM_DESC(test_mmss64, "Test struct rtc_class_ops.set_mmss64().");
-
 static struct platform_device *test0 = NULL, *test1 = NULL;
 
 static int test_rtc_read_alarm(struct device *dev,
@@ -44,12 +40,6 @@ static int test_rtc_set_mmss64(struct device *dev, time64_t secs)
return 0;
 }
 
-static int test_rtc_set_mmss(struct device *dev, unsigned long secs)
-{
-   dev_info(dev, "%s, secs = %lu\n", __func__, secs);
-   return 0;
-}
-
 static int test_rtc_proc(struct device *dev, struct seq_file *seq)
 {
struct platform_device *plat_dev = to_platform_device(dev);
@@ -65,12 +55,12 @@ static int test_rtc_alarm_irq_enable(struct device *dev, unsigned int enable)
return 0;
 }
 
-static struct rtc_class_ops test_rtc_ops = {
+static const struct rtc_class_ops test_rtc_ops = {
.proc = test_rtc_proc,
.read_time = test_rtc_read_time,
.read_alarm = test_rtc_read_alarm,
.set_alarm = test_rtc_set_alarm,
-   .set_mmss = test_rtc_set_mmss,
+   .set_mmss64 = test_rtc_set_mmss64,
.alarm_irq_enable = test_rtc_alarm_irq_enable,
 };
 
@@ -110,11 +100,6 @@ static int test_probe(struct platform_device *plat_dev)
int err;
struct rtc_device *rtc;
 
-   if (test_mmss64) {
-   test_rtc_ops.set_mmss64 = test_rtc_set_mmss64;
-   test_rtc_ops.set_mmss = NULL;
-   }
-
rtc = devm_rtc_device_register(&plat_dev->dev, "test",
&test_rtc_ops, THIS_MODULE);
if (IS_ERR(rtc)) {
-- 
2.17.0
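
The truncation mentioned in the commit message can be illustrated outside the kernel. This is only a userspace sketch, assuming a platform where unsigned long is 32 bits wide (modelled here with uint32_t); the model_* helpers are hypothetical names and are not part of rtc-test.c:

```c
#include <stdint.h>

/* Stand-in for the kernel's time64_t (assumption: signed 64-bit). */
typedef int64_t time64_t;

/*
 * Models the legacy .set_mmss path on a platform with a 32-bit
 * unsigned long: the upper 32 bits of the value are dropped.
 */
static uint32_t model_set_mmss(time64_t secs)
{
	return (uint32_t)secs;
}

/* Models .set_mmss64: the full 64-bit value is preserved. */
static time64_t model_set_mmss64(time64_t secs)
{
	return secs;
}
```

Both helpers receive the same seconds value, and only the width of what they can hold differs, which is the commit message's reason for not testing the two callbacks separately.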



Re: [PATCH V2 rdma-next 3/4] RDMA/hns: Add reset process for RoCE in hip08

2018-05-25 Thread Wei Hu (Xavier)


On 2018/5/25 22:55, Jason Gunthorpe wrote:
> On Fri, May 25, 2018 at 01:54:31PM +0800, Wei Hu (Xavier) wrote:
>>
>> On 2018/5/25 5:31, Jason Gunthorpe wrote:
  static const struct hnae3_client_ops hns_roce_hw_v2_ops = {
.init_instance = hns_roce_hw_v2_init_instance,
.uninit_instance = hns_roce_hw_v2_uninit_instance,
 +  .reset_notify = hns_roce_hw_v2_reset_notify,
  };
  
  static struct hnae3_client hns_roce_hw_v2_client = {
 diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c 
 b/drivers/infiniband/hw/hns/hns_roce_main.c
 index 1b79a38..ac51372 100644
 +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
 @@ -332,6 +332,9 @@ static struct ib_ucontext 
 *hns_roce_alloc_ucontext(struct ib_device *ib_dev,
struct hns_roce_ib_alloc_ucontext_resp resp = {};
struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
  
 +  if (!hr_dev->active)
 +  return ERR_PTR(-EAGAIN);
>>> This still doesn't make sense, ib_unregister_device already makes sure
>>> that hns_roce_alloc_ucontext isn't running and won't be called before
>>> returning, don't need another flag to do that.
>>>
>>> Since this is the only place the active flag is tested it can just be deleted
>>> entirely.
>> Hi, Jason
>>
>> roce reset process:
>> 1. hr_dev->active = false;  //make sure no any process call
>> ibv_open_device.   
>> 2. call ib_dispatch_event() function to report IB_EVENT_DEVICE_FATAL
>> event.
>> 3. msleep(100);   // for some app to free resources
>> 4. call ib_unregister_device().   
>> 5. ...
>> 6. ...
>>
>> There are 2 steps as above before calling ib_unregister_device(), we
>> evaluate
>> hr_dev->active with false to avoid no any process call
>> ibv_open_device.
> If you think this is the right flow then it is core issue to block new
> opens, not an individual driver issue, send a core patch - eg add a
> 'ib_driver_fatal_error()' call that does the dispatch and call it from
> all the drivers using this IB_EVENT_DEVICE_FATAL
Hi, Jason

There seems to be no difference between calling ib_driver_fatal_error() and
calling ib_dispatch_event() directly in the vendor driver.

void ib_driver_fatal_error(struct ib_device *ib_dev, u8 port_num)
{
	struct ib_event event;

	event.event = IB_EVENT_DEVICE_FATAL;
	event.device = ib_dev;
	event.element.port_num = port_num;
	ib_dispatch_event(&event);
}

Regards
Wei Hu
> I'm not completely sure this makes sense though, it might be better
> for the core code to force stuff a IB_EVENT_DEVICE_FATAL to contexts
> that open after the fatal event..
>
> Jason
>
> .
>
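
One possible reading of the suggestion quoted above is that a core helper would record the fatal state in addition to dispatching the event, so that the core (not each driver) can refuse new opens. The following is only a userspace model of that idea; struct model_device and both functions are hypothetical and do not correspond to any existing ib_core API:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the core's per-device state. */
struct model_device {
	bool fatal;             /* set once a fatal error has been reported */
	int events_dispatched;  /* counts dispatched fatal events */
};

/* Models a core helper: remember the fatal state, then dispatch. */
static void model_driver_fatal_error(struct model_device *dev)
{
	dev->fatal = true;
	dev->events_dispatched++;
}

/* Models the core refusing new opens after a fatal error. */
static int model_open_context(const struct model_device *dev)
{
	return dev->fatal ? -EAGAIN : 0;
}
```

In this model the per-driver active flag becomes unnecessary, because the check lives next to the dispatch in common code. Whether that is the right design for the real core is exactly what the thread leaves open.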




Re: [RFC PATCH] f2fs: add fsync_mode=nobarrier for non-atomic files

2018-05-25 Thread Chao Yu
On 2018/5/26 9:04, Jaegeuk Kim wrote:
> For non-atomic files, this patch adds an option to give nobarrier which
> doesn't issue flush commands to the device.
> 
> Signed-off-by: Jaegeuk Kim 

Reviewed-by: Chao Yu 

Thanks,



[PATCH v3] block/loop: Serialize ioctl operations.

2018-05-25 Thread Tetsuo Handa
syzbot is reporting a NULL pointer dereference [1] which is caused by a
race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) and
ioctl(other_loop_fd, LOOP_SET_FD, loop_fd), due to traversing other
loop devices without holding the corresponding locks.

syzbot is also reporting a circular locking dependency between bdev->bd_mutex
and lo->lo_ctl_mutex [2] which is caused by calling blkdev_reread_part()
with the lock held.

Since ioctl() requests on loop devices are not frequent operations, we don't
need fine-grained locking. Let's use a global lock and simplify the locking
order.

The strategy is that the global lock is taken upon entry to an ioctl() request
and released before either starting operations which might deadlock or
leaving the ioctl() request. After the global lock is released, the current
thread no longer uses "struct loop_device" memory because it might be modified
by the next ioctl() request which was waiting for the current ioctl() request.

In order to enforce this strategy, this patch swaps the order of
loop_reread_partitions() and loop_unprepare_queue() in loop_clr_fd().
I don't know whether it breaks something, but I don't have test cases.

Since this patch serializes using a global lock, race bugs should no longer
exist. Thus, it will be easy to test whether this patch broke something.

[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

Signed-off-by: Tetsuo Handa 
Reported-by: syzbot 
Reported-by: syzbot 

Cc: Jens Axboe 
---
 drivers/block/loop.c | 231 ---
 drivers/block/loop.h |   1 -
 2 files changed, 128 insertions(+), 104 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 55cf554..feb9fa7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -81,11 +81,50 @@
 #include 
 
 static DEFINE_IDR(loop_index_idr);
-static DEFINE_MUTEX(loop_index_mutex);
+static DEFINE_MUTEX(loop_mutex);
+static void *loop_mutex_owner; /* == __mutex_owner(&loop_mutex) */
 
 static int max_part;
 static int part_shift;
 
+/*
+ * lock_loop - Lock loop_mutex.
+ */
+static void lock_loop(void)
+{
+   mutex_lock(&loop_mutex);
+   loop_mutex_owner = current;
+}
+
+/*
+ * lock_loop_killable - Lock loop_mutex unless killed.
+ */
+static int lock_loop_killable(void)
+{
+   int err = mutex_lock_killable(&loop_mutex);
+
+   if (err)
+   return err;
+   loop_mutex_owner = current;
+   return 0;
+}
+
+/*
+ * unlock_loop - Unlock loop_mutex as needed.
+ *
+ * Explicitly call this function before calling fput() or blkdev_reread_part()
+ * in order to avoid circular lock dependency. After this function is called,
+ * current thread is no longer allowed to access "struct loop_device" memory,
+ * for another thread would access that memory as soon as loop_mutex is held.
+ */
+static void unlock_loop(void)
+{
+   if (loop_mutex_owner == current) {
+   loop_mutex_owner = NULL;
+   mutex_unlock(&loop_mutex);
+   }
+}
+
 static int transfer_xor(struct loop_device *lo, int cmd,
struct page *raw_page, unsigned raw_off,
struct page *loop_page, unsigned loop_off,
@@ -626,7 +665,12 @@ static void loop_reread_partitions(struct loop_device *lo,
   struct block_device *bdev)
 {
int rc;
+   /* Save variables which might change after unlock_loop() is called. */
+   char filename[sizeof(lo->lo_file_name)];
+   const int num = lo->lo_number;
 
+   memmove(filename, lo->lo_file_name, sizeof(filename));
+   unlock_loop();
/*
 * bd_mutex has been held already in release path, so don't
 * acquire it if this function is called in such case.
@@ -641,7 +685,7 @@ static void loop_reread_partitions(struct loop_device *lo,
rc = blkdev_reread_part(bdev);
if (rc)
pr_warn("%s: partition scan of loop%d (%s) failed (rc=%d)\n",
-   __func__, lo->lo_number, lo->lo_file_name, rc);
+   __func__, num, filename, rc);
 }
 
 /*
@@ -659,6 +703,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
struct inode*inode;
int error;
 
+   lockdep_assert_held(&loop_mutex);
error = -ENXIO;
if (lo->lo_state != Lo_bound)
goto out;
@@ -695,12 +740,14 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
loop_update_dio(lo);
blk_mq_unfreeze_queue(lo->lo_queue);
 
-   fput(old_file);
if (lo->lo_flags & LO_FLAGS_PARTSCAN)
-   loop_reread_partitions(lo, bdev);
+   loop_reread_partitions(lo, bdev); /* calls unlock_loop() */
+   unlock_loop();
+   fput(old_file);
return 0;
 
  out_putf:
+   unlock_loop();
fput(file);
  out:
return error;
@@ -884,6 +931,7 @@ static int 
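
The "unlock as needed" pattern from lock_loop()/unlock_loop() can be tried in userspace. This is only a sketch under stated assumptions: it uses a pthread mutex in place of the kernel mutex, and a thread-local flag in place of the patch's loop_mutex_owner pointer (a deliberate substitution so the sketch has no unlocked shared read); returning bool from unlock_loop() is also an addition for illustration:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t loop_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread record of whether this thread currently holds loop_mutex. */
static _Thread_local bool holds_loop_mutex;

/* Take the global lock and remember that this thread owns it. */
static void lock_loop(void)
{
	pthread_mutex_lock(&loop_mutex);
	holds_loop_mutex = true;
}

/*
 * Release the global lock only if this thread still holds it.
 * Returns true if the lock was released by this call, false if the
 * thread had already released it (or never held it).
 */
static bool unlock_loop(void)
{
	if (!holds_loop_mutex)
		return false;
	holds_loop_mutex = false;
	pthread_mutex_unlock(&loop_mutex);
	return true;
}
```

Because the second call is a no-op, a helper such as loop_reread_partitions() can drop the lock early and its caller can still call unlock_loop() unconditionally afterwards, which matches the sequence in the loop_change_fd() hunk.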

[PATCH V2] cros_ec_keyb: Mark cros_ec_keyb driver as wake enabled device.

2018-05-25 Thread Ravi Chandra Sadineni
Mark cros_ec_keyb as wake enabled by default. If we see an MKBP event
related to the keyboard, call pm_wakeup_event() to make sure wakeup
triggers are accounted to keyb during the suspend/resume path.

Signed-off-by: Ravi Chandra Sadineni 
---
V2: Marked the ckdev as wake enabled instead of input devices.

 drivers/input/keyboard/cros_ec_keyb.c | 21 +
 drivers/mfd/cros_ec.c | 19 +++
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/input/keyboard/cros_ec_keyb.c b/drivers/input/keyboard/cros_ec_keyb.c
index 79eb29550c348..a7c96f0317123 100644
--- a/drivers/input/keyboard/cros_ec_keyb.c
+++ b/drivers/input/keyboard/cros_ec_keyb.c
@@ -245,12 +245,17 @@ static int cros_ec_keyb_work(struct notifier_block *nb,
switch (ckdev->ec->event_data.event_type) {
case EC_MKBP_EVENT_KEY_MATRIX:
/*
-* If EC is not the wake source, discard key state changes
+* If Keyb is not wake enabled, discard key state changes
 * during suspend.
 */
-   if (queued_during_suspend)
+   if (queued_during_suspend
+   && !device_may_wakeup(ckdev->dev))
return NOTIFY_OK;
 
+   if (device_may_wakeup(ckdev->dev))
+   pm_wakeup_event(ckdev->dev, 0);
+
+
if (ckdev->ec->event_size != ckdev->cols) {
dev_err(ckdev->dev,
"Discarded incomplete key matrix event.\n");
@@ -265,18 +270,25 @@ static int cros_ec_keyb_work(struct notifier_block *nb,
val = get_unaligned_le32(&ckdev->ec->event_data.data.sysrq);
dev_dbg(ckdev->dev, "sysrq code from EC: %#x\n", val);
handle_sysrq(val);
+
+   if (device_may_wakeup(ckdev->dev))
+   pm_wakeup_event(ckdev->dev, 0);
break;
 
case EC_MKBP_EVENT_BUTTON:
case EC_MKBP_EVENT_SWITCH:
/*
-* If EC is not the wake source, discard key state
+* If keyb is not wake enabled, discard key state
 * changes during suspend. Switches will be re-checked in
 * cros_ec_keyb_resume() to be sure nothing is lost.
 */
-   if (queued_during_suspend)
+   if (queued_during_suspend
+   && !device_may_wakeup(ckdev->dev))
return NOTIFY_OK;
 
+   if (device_may_wakeup(ckdev->dev))
+   pm_wakeup_event(ckdev->dev, 0);
+
if (ckdev->ec->event_data.event_type == EC_MKBP_EVENT_BUTTON) {
val = get_unaligned_le32(
&ckdev->ec->event_data.data.buttons);
@@ -639,6 +651,7 @@ static int cros_ec_keyb_probe(struct platform_device *pdev)
return err;
}
 
+   device_init_wakeup(ckdev->dev, true);
return 0;
 }
 
diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index d61024141e2b6..36156a41499c9 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -229,7 +229,7 @@ int cros_ec_suspend(struct cros_ec_device *ec_dev)
 }
 EXPORT_SYMBOL(cros_ec_suspend);
 
-static void cros_ec_drain_events(struct cros_ec_device *ec_dev)
+static void cros_ec_report_events_during_suspend(struct cros_ec_device *ec_dev)
 {
while (cros_ec_get_next_event(ec_dev, NULL) > 0)
blocking_notifier_call_chain(&ec_dev->event_notifier,
@@ -253,21 +253,16 @@ int cros_ec_resume(struct cros_ec_device *ec_dev)
dev_dbg(ec_dev->dev, "Error %d sending resume event to ec",
ret);
 
-   /*
-* In some cases, we need to distinguish between events that occur
-* during suspend if the EC is not a wake source. For example,
-* keypresses during suspend should be discarded if it does not wake
-* the system.
-*
-* If the EC is not a wake source, drain the event queue and mark them
-* as "queued during suspend".
-*/
if (ec_dev->wake_enabled) {
disable_irq_wake(ec_dev->irq);
ec_dev->wake_enabled = 0;
-   } else {
-   cros_ec_drain_events(ec_dev);
}
+   /*
+* Let the mfd devices know about events that occur during
+* suspend. This way the clients know what to do with them.
+*/
+   cros_ec_report_events_during_suspend(ec_dev);
+
 
return 0;
 }
-- 
2.17.0.921.gf22659ad46-goog
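
The decision made in the key matrix and button/switch paths above can be summarised as a small truth table. The sketch below is a userspace model only; enum keyb_action and model_keyb_event() are hypothetical names, and the two booleans stand in for queued_during_suspend and device_may_wakeup(ckdev->dev):

```c
#include <stdbool.h>

enum keyb_action {
	KEYB_DISCARD,          /* return NOTIFY_OK without processing */
	KEYB_PROCESS,          /* process the event, no wakeup accounting */
	KEYB_PROCESS_AND_WAKE  /* call pm_wakeup_event(), then process */
};

/* Models the checks added to cros_ec_keyb_work() for key events. */
static enum keyb_action model_keyb_event(bool queued_during_suspend,
					 bool may_wakeup)
{
	if (queued_during_suspend && !may_wakeup)
		return KEYB_DISCARD;
	if (may_wakeup)
		return KEYB_PROCESS_AND_WAKE;
	return KEYB_PROCESS;
}
```

Note that the sysrq path in the patch has no discard check, so this model covers only the EC_MKBP_EVENT_KEY_MATRIX, EC_MKBP_EVENT_BUTTON and EC_MKBP_EVENT_SWITCH cases.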



[PATCH] thermal: int340x: Prevent error in reading trip hysteresis attribute

2018-05-25 Thread Srinivas Pandruvada
Some INT340X devices may not have hysteresis defined in their ACPI
definition. In that case, reading the trip hysteresis results in an error,
which spams the logs of user space utilities.

In this case, instead of returning an error, just return a hysteresis of 0,
which is correct as there is no hysteresis defined for the device.

Signed-off-by: Srinivas Pandruvada 
---
 drivers/thermal/int340x_thermal/int340x_thermal_zone.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
index 145a5c53ff5c..dfdf6dbc2ddc 100644
--- a/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
@@ -147,9 +147,9 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
 
status = acpi_evaluate_integer(d->adev->handle, "GTSH", NULL, &hyst);
if (ACPI_FAILURE(status))
-   return -EIO;
-
-   *temp = hyst * 100;
+   *temp = 0;
+   else
+   *temp = hyst * 100;
 
return 0;
 }
-- 
2.17.0
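
The behaviour after this change can be modelled in a few lines of userspace C. This is a sketch only: struct gtsh_result is a hypothetical stand-in for the outcome of evaluating the ACPI "GTSH" method, and the factor of 100 is taken from the patch as-is:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the result of evaluating "GTSH". */
struct gtsh_result {
	bool failed;              /* true when the method is absent or fails */
	unsigned long long hyst;  /* raw value returned by the method */
};

/*
 * Models int340x_thermal_get_trip_hyst() after the patch: a failed
 * evaluation is reported as a hysteresis of 0 instead of an error.
 */
static int model_get_trip_hyst(struct gtsh_result r, int *temp)
{
	if (r.failed)
		*temp = 0;
	else
		*temp = (int)(r.hyst * 100);

	return 0;
}
```

The trade-off is that a caller can no longer tell a genuinely missing GTSH method apart from one that reports 0, which the commit message argues is acceptable.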



Re: [PATCH v2 5/6] soc: qcom: rpmh powerdomain driver

2018-05-25 Thread David Collins
Hello Rajendra,

On 05/25/2018 03:01 AM, Rajendra Nayak wrote:
> The RPMh powerdomain driver aggregates the corner votes from various

s/powerdomain/power domain/

This applies to all instances in this patch.  "Power domain" seems to be
the preferred spelling in the kernel.


> consumers for the ARC resources and communicates it to RPMh.
> 
> We also add data for all powerdomains on sdm845 as part of the patch.
> The driver can be extended to support other SoCs which support RPMh
> 
> Signed-off-by: Rajendra Nayak 
> ---
>  .../devicetree/bindings/power/qcom,rpmhpd.txt |  65 
>  drivers/soc/qcom/Kconfig  |   9 +
>  drivers/soc/qcom/Makefile |   1 +
>  drivers/soc/qcom/rpmhpd.c | 360 ++
>  4 files changed, 435 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/power/qcom,rpmhpd.txt
>  create mode 100644 drivers/soc/qcom/rpmhpd.c
...
> +++ b/Documentation/devicetree/bindings/power/qcom,rpmhpd.txt

I think that this binding documentation should be in a patch separate from
the driver.


> @@ -0,0 +1,65 @@
> +Qualcomm RPMh Powerdomains

s/Qualcomm/Qualcomm Technologies, Inc./


> +
> +* For RPMh powerdomains, we communicate a performance state to RPMh

Does this line need to start with '*'?


> +which then translates it into a corresponding voltage on a rail
> +
> +Required Properties:
> + - compatible: Should be one of the following
> + * qcom,sdm845-rpmhpd: RPMh powerdomain for the sdm845 family of SoC
> + - power-domain-cells: number of cells in power domain specifier
> + must be 1
> + - operating-points-v2: Phandle to the OPP table for the power-domain.
> + Refer to Documentation/devicetree/bindings/power/power_domain.txt
> + and Documentation/devicetree/bindings/opp/qcom-opp.txt for more details
> +
> +Example:
> +
> + rpmhpd: power-controller {
> + compatible = "qcom,sdm845-rpmhpd";
> + #power-domain-cells = <1>;
> + operating-points-v2 = <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>,
> +   <&rpmhpd_opp_table>;

Can this be changed to simply:
  operating-points-v2 = <&rpmhpd_opp_table>;

The opp binding documentation [1] states that this should be ok:

If only one phandle is available, then the same OPP table will be used
for all power domains provided by the power domain provider.


> + };
> +
> + rpmhpd_opp_table: opp-table {
> + compatible = "operating-points-v2-qcom-level", 
> "operating-points-v2";
> +
> + rpmhpd_opp1: opp@1 {

Is there any significance to the 1 through 8 values in these OPP table
nodes?  If not, then could this be changed to something like:

rpmhpd_retention: opp@16 {
...
rpmhpd_turbo_l1: opp@416 {


> + qcom-corner = <16>;

s/qcom-corner/qcom,level/g


> + };
> +
> + rpmhpd_opp2: opp@2 {
> + qcom-corner = <48>;
> + };
> +
> + rpmhpd_opp3: opp@3 {
> + qcom-corner = <64>;
> + };
> +
> + rpmhpd_opp4: opp@4 {
> + qcom-corner = <128>;
> + };
> +
> + rpmhpd_opp5: opp@5 {
> + qcom-corner = <192>;
> + };
> +
> + rpmhpd_opp6: opp@6 {
> + qcom-corner = <256>;
> + };
> +
> + rpmhpd_opp7: opp@7 {
> + qcom-corner = <320>;
> + };

Can you please add 336 and 384 to your example?  384 at least should be
present as it corresponds to the Turbo level which all supplies support.


> + rpmhpd_opp8: opp@8 {
> + qcom-corner = <416>;
> + };
> + };

How are consumers of these power domains supposed to identify which domain
within <&rpmhpd> to use (e.g. VDD_CX vs VDD_MX)?  If the answer is a plain
integer index, then could you please add per-platform #define constants in
a DT header file which explicitly define the meaning for each index?

How do consumers of these power domains identify which level they want to
set for a specific power domain (e.g. Nominal vs Turbo)?

Would it be helpful to provide a DT header file with #define constants for
the cross-platform sparse level mapping?  This is done in [2] for the
downstream rpmh-regulator driver that handles ARC managed regulators.
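
As a rough sketch of what such a header could look like: only the three
level values below are named in this review (16 for retention, 384 for
Turbo, 416 for turbo_l1). The macro names and the rpmhpd_level_name()
helper are hypothetical and exist only to show the idea of replacing
bare integers with named constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Level values quoted in the review; the macro names are hypothetical. */
#define RPMHPD_LEVEL_RETENTION	16
#define RPMHPD_LEVEL_TURBO	384
#define RPMHPD_LEVEL_TURBO_L1	416

/*
 * Hypothetical helper: map a sparse level value to a readable name.
 * Returns NULL for values that this sketch does not know about.
 */
static const char *rpmhpd_level_name(unsigned int level)
{
	switch (level) {
	case RPMHPD_LEVEL_RETENTION:
		return "retention";
	case RPMHPD_LEVEL_TURBO:
		return "turbo";
	case RPMHPD_LEVEL_TURBO_L1:
		return "turbo_l1";
	default:
		return NULL;
	}
}
```

The same #define lines could be shared between DT sources and drivers,
which is the point of putting them in a header.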


Would it be ok to add some consumer DT nodes in this binding file example
so that it is clear how consumers interact with the rpmhpd?


> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index a7a405178967..1faed239701d 

[RFC PATCH] f2fs: add fsync_mode=nobarrier for non-atomic files

2018-05-25 Thread Jaegeuk Kim
For non-atomic files, this patch adds an fsync_mode=nobarrier option,
which doesn't issue flush commands to the device.

Signed-off-by: Jaegeuk Kim 
---
 Documentation/filesystems/f2fs.txt | 16 +---
 fs/f2fs/f2fs.h |  1 +
 fs/f2fs/file.c |  2 +-
 fs/f2fs/super.c|  4 
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/Documentation/filesystems/f2fs.txt 
b/Documentation/filesystems/f2fs.txt
index 12a147c9f87f..69f8de995739 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -182,13 +182,15 @@ whint_mode=%s  Control which write hints are 
passed down to block
passes down hints with its policy.
 alloc_mode=%s  Adjust block allocation policy, which supports "reuse"
and "default".
-fsync_mode=%s  Control the policy of fsync. Currently supports "posix"
-   and "strict". In "posix" mode, which is default, fsync
-   will follow POSIX semantics and does a light operation
-   to improve the filesystem performance. In "strict" mode,
-   fsync will be heavy and behaves in line with xfs, ext4
-   and btrfs, where xfstest generic/342 will pass, but the
-   performance will regress.
+fsync_mode=%s  Control the policy of fsync. Currently supports "posix",
+   "strict", and "nobarrier". In "posix" mode, which is
+   default, fsync will follow POSIX semantics and does a
+   light operation to improve the filesystem performance.
+   In "strict" mode, fsync will be heavy and behaves in 
line
+   with xfs, ext4 and btrfs, where xfstest generic/342 will
+   pass, but the performance will regress. "nobarrier" is
+   based on "posix", but doesn't issue flush command for
+   non-atomic files likewise "nobarrier" mount option.
 test_dummy_encryption  Enable dummy encryption, which provides a fake fscrypt
context. The fake fscrypt context is used by xfstests.
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6e0677aff8ca..659c63dae81c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1093,6 +1093,7 @@ enum {
 enum fsync_mode {
FSYNC_MODE_POSIX,   /* fsync follows posix semantics */
FSYNC_MODE_STRICT,  /* fsync behaves in line with ext4 */
+   FSYNC_MODE_NOBARRIER,   /* fsync behaves nobarrier based on posix */
 };
 
 #ifdef CONFIG_F2FS_FS_ENCRYPTION
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7ac216bb560d..fab65a0bd4cc 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -306,7 +306,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
remove_ino_entry(sbi, ino, APPEND_INO);
clear_inode_flag(inode, FI_APPEND_WRITE);
 flush_out:
-   if (!atomic)
+   if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
ret = f2fs_issue_flush(sbi, inode->i_ino);
if (!ret) {
remove_ino_entry(sbi, ino, UPDATE_INO);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 1b42fc7e4b29..12282b144651 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -740,6 +740,10 @@ static int parse_options(struct super_block *sb, char 
*options)
} else if (strlen(name) == 6 &&
!strncmp(name, "strict", 6)) {
F2FS_OPTION(sbi).fsync_mode = FSYNC_MODE_STRICT;
+   } else if (strlen(name) == 9 &&
+   !strncmp(name, "nobarrier", 9)) {
+   F2FS_OPTION(sbi).fsync_mode =
+   FSYNC_MODE_NOBARRIER;
} else {
kfree(name);
return -EINVAL;
-- 
2.17.0.441.gb46fe60e1d-goog
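
The two pieces this patch touches, option parsing and the flush
decision, can be sketched in user-space C as follows. This is a sketch
under assumptions: parse_fsync_mode() and should_issue_flush() are
hypothetical helper names, and the exact-match test uses strcmp()
where the patch uses a strlen()/strncmp() pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum fsync_mode {
	FSYNC_MODE_POSIX,	/* fsync follows posix semantics */
	FSYNC_MODE_STRICT,	/* fsync behaves in line with ext4 */
	FSYNC_MODE_NOBARRIER,	/* posix-based, but no flush is issued */
};

/*
 * Hypothetical helper: translate the fsync_mode=%s argument.
 * Returns 0 and stores the mode on success, -1 for an unknown name.
 */
static int parse_fsync_mode(const char *name, enum fsync_mode *mode)
{
	if (!strcmp(name, "posix"))
		*mode = FSYNC_MODE_POSIX;
	else if (!strcmp(name, "strict"))
		*mode = FSYNC_MODE_STRICT;
	else if (!strcmp(name, "nobarrier"))
		*mode = FSYNC_MODE_NOBARRIER;
	else
		return -1;

	return 0;
}

/* Same condition as the patched flush_out path in f2fs_do_sync_file(). */
static bool should_issue_flush(bool atomic, enum fsync_mode mode)
{
	return !atomic && mode != FSYNC_MODE_NOBARRIER;
}
```

Note that only the non-atomic path changes: for atomic files the
condition was already false before the patch.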



[PATCH v2] f2fs: keep migration IO order in LFS mode

2018-05-25 Thread Chao Yu
For non-migration IO, we keep the submission of data/node blocks in
allocation order by sorting IOs in the per-log io_list, but migration IO
can be submitted out of order.

In LFS mode, all IOs, including migration IO, should be kept in order,
so this patch adds an additional lock to preserve the submission order.

Signed-off-by: Chao Yu 
Signed-off-by: Yunlong Song 
---
v2:
- introduce variable lfs_mode to record historical option, it can avoid
option being changed.
 fs/f2fs/f2fs.h| 2 ++
 fs/f2fs/gc.c  | 6 ++
 fs/f2fs/segment.c | 5 +
 fs/f2fs/super.c   | 1 +
 4 files changed, 14 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index dc0a462461e8..3cc56b4df03f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1124,6 +1124,8 @@ struct f2fs_sb_info {
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
struct mutex wio_mutex[NR_PAGE_TYPE - 1][NR_TEMP_TYPE];
/* bio ordering for NODE/DATA */
+   /* keep migration IO order for LFS mode */
+   struct rw_semaphore io_order_lock;
mempool_t *write_io_dummy;  /* Dummy pages */
 
/* for checkpoint */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5ef3233c38d2..50bb8fc25275 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -610,6 +610,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
struct page *page;
block_t newaddr;
int err;
+   bool lfs_mode = test_opt(fio.sbi, LFS);
 
/* do not read out */
page = f2fs_grab_cache_page(inode->i_mapping, bidx, false);
@@ -653,6 +654,9 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.page = page;
fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
 
+   if (lfs_mode)
+   down_write(&fio.sbi->io_order_lock);
+
allocate_data_block(fio.sbi, NULL, fio.old_blkaddr, &newaddr,
&sum, CURSEG_COLD_DATA, NULL, false);
 
@@ -709,6 +713,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
 put_page_out:
f2fs_put_page(fio.encrypted_page, 1);
 recover_block:
+   if (lfs_mode)
+   up_write(&fio.sbi->io_order_lock);
if (err)
__f2fs_replace_block(fio.sbi, &sum, newaddr, fio.old_blkaddr,
true, true);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index a05208954dd5..c67d92bf2968 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2735,7 +2735,10 @@ static void do_write_page(struct f2fs_summary *sum, 
struct f2fs_io_info *fio)
 {
int type = __get_segment_type(fio);
int err;
+   bool keep_order = (test_opt(fio->sbi, LFS) && type == CURSEG_COLD_DATA);
 
+   if (keep_order)
+   down_read(&fio->sbi->io_order_lock);
 reallocate:
allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr,
&fio->new_blkaddr, sum, type, fio, true);
@@ -2748,6 +2751,8 @@ static void do_write_page(struct f2fs_summary *sum, 
struct f2fs_io_info *fio)
} else if (!err) {
update_device_state(fio);
}
+   if (keep_order)
+   up_read(&fio->sbi->io_order_lock);
 }
 
 void write_meta_page(struct f2fs_sb_info *sbi, struct page *page,
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8e5f0a178f5d..1b42fc7e4b29 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2365,6 +2365,7 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
for (i = 0; i < NR_PAGE_TYPE - 1; i++)
for (j = HOT; j < NR_TEMP_TYPE; j++)
mutex_init(&sbi->wio_mutex[i][j]);
+   init_rwsem(&sbi->io_order_lock);
spin_lock_init(&sbi->cp_lock);
 
sbi->dirty_device = 0;
-- 
2.17.0.391.g1f1cddd558b5



[PATCH] PM / hibernate: Fix oops at snapshot_write().

2018-05-25 Thread Tetsuo Handa
syzbot is reporting NULL pointer dereference at snapshot_write() [1].
This is because data->handle is zero-cleared by ioctl(SNAPSHOT_FREE).
Fix this by checking data_of(data->handle) != NULL before using it.

[1] 
https://syzkaller.appspot.com/bug?id=828a3c71bd344a6de8b6a31233d51a72099f27fd

Signed-off-by: Tetsuo Handa 
Reported-by: syzbot 
Cc: Rafael J. Wysocki 
Cc: Len Brown 
Cc: Pavel Machek 
---
 kernel/power/user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/power/user.c b/kernel/power/user.c
index 75c959d..abd2255 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -186,6 +186,11 @@ static ssize_t snapshot_write(struct file *filp, const 
char __user *buf,
res = PAGE_SIZE - pg_offp;
}
 
+   if (!data_of(data->handle)) {
+   res = -EINVAL;
+   goto unlock;
+   }
+
res = simple_write_to_buffer(data_of(data->handle), res, &pg_offp,
buf, count);
if (res > 0)
-- 
1.8.3.1
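
The guard added above can be illustrated with a small user-space
sketch. Everything here is hypothetical (struct handle_sketch and
write_sketch() are not kernel names); the buffer pointer plays the role
of data_of(data->handle) after the handle has been zero-cleared.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the snapshot handle's current page buffer. */
struct handle_sketch {
	char *buffer;	/* NULL once the handle has been zero-cleared */
	size_t size;
};

/*
 * Refuse to write through a NULL buffer instead of dereferencing it.
 * Returns the number of bytes copied, or -EINVAL when there is no buffer.
 */
static long write_sketch(struct handle_sketch *h, const char *src,
			 size_t count)
{
	if (!h->buffer)
		return -EINVAL;

	if (count > h->size)
		count = h->size;
	memcpy(h->buffer, src, count);

	return (long)count;
}
```

The check has to come before the first use of the buffer, which is why
the patch places it directly ahead of simple_write_to_buffer().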



[PATCH 1/2] n_tty: Fix stall at n_tty_receive_char_special().

2018-05-25 Thread Tetsuo Handa
syzbot is reporting stalls at n_tty_receive_char_special() [1]. This is
because the comparison does not work as expected, since ldata->read_head
can change at any moment. Mitigate this by explicitly masking with the
buffer size when checking the conditions of the "while" loops.

[1] 
https://syzkaller.appspot.com/bug?id=3d7481a346958d9469bebbeb0537d5f056bdd6e8

Signed-off-by: Tetsuo Handa 
Reported-by: syzbot 
Fixes: bc5a5e3f45d04784 ("n_tty: Don't wrap input buffer indices at buffer 
size")
Cc: Peter Hurley 
---
 drivers/tty/n_tty.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index cbe98bc..b279f873 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -124,6 +124,8 @@ struct n_tty_data {
struct mutex output_lock;
 };
 
+#define MASK(x) ((x) & (N_TTY_BUF_SIZE - 1))
+
 static inline size_t read_cnt(struct n_tty_data *ldata)
 {
return ldata->read_head - ldata->read_tail;
@@ -978,14 +980,15 @@ static void eraser(unsigned char c, struct tty_struct 
*tty)
}
 
seen_alnums = 0;
-   while (ldata->read_head != ldata->canon_head) {
+   while (MASK(ldata->read_head) != MASK(ldata->canon_head)) {
head = ldata->read_head;
 
/* erase a single possibly multibyte character */
do {
head--;
c = read_buf(ldata, head);
-   } while (is_continuation(c, tty) && head != ldata->canon_head);
+   } while (is_continuation(c, tty) &&
+MASK(head) != MASK(ldata->canon_head));
 
/* do not partially erase */
if (is_continuation(c, tty))
@@ -1027,7 +1030,7 @@ static void eraser(unsigned char c, struct tty_struct 
*tty)
 * This info is used to go back the correct
 * number of columns.
 */
-   while (tail != ldata->canon_head) {
+   while (MASK(tail) != MASK(ldata->canon_head)) {
tail--;
c = read_buf(ldata, tail);
if (c == '\t') {
@@ -1302,7 +1305,7 @@ static void n_tty_receive_parity_error(struct tty_struct 
*tty, unsigned char c)
finish_erasing(ldata);
echo_char(c, tty);
echo_char_raw('\n', ldata);
-   while (tail != ldata->read_head) {
+   while (MASK(tail) != MASK(ldata->read_head)) {
echo_char(read_buf(ldata, tail), tty);
tail++;
}
@@ -2411,7 +2414,7 @@ static unsigned long inq_canon(struct n_tty_data *ldata)
tail = ldata->read_tail;
nr = head - tail;
/* Skip EOF-chars.. */
-   while (head != tail) {
+   while (MASK(head) != MASK(tail)) {
if (test_bit(tail & (N_TTY_BUF_SIZE - 1), ldata->read_flags) &&
read_buf(ldata, tail) == __DISABLED_CHAR)
nr--;
-- 
1.8.3.1
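
Why masking both sides helps can be shown with a small stand-alone
sketch. The assumptions are labeled: BUF_SIZE is assumed to be a power
of two (4096 here), and same_slot() and steps_to_head() are
hypothetical helpers, not functions from n_tty.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_SIZE 4096	/* assumed power-of-two buffer size */
#define MASK(x) ((x) & (BUF_SIZE - 1))

static_assert((BUF_SIZE & (BUF_SIZE - 1)) == 0,
	      "masking only works for power-of-two sizes");

/* Two free-running indices refer to the same slot if they match mod size. */
static bool same_slot(size_t a, size_t b)
{
	return MASK(a) == MASK(b);
}

/*
 * Walk tail forward until it reaches head's slot. Because both sides
 * are masked, the loop ends after at most BUF_SIZE - 1 steps, even if
 * the raw indices never become equal.
 */
static size_t steps_to_head(size_t tail, size_t head)
{
	size_t n = 0;

	while (MASK(tail) != MASK(head)) {
		tail++;
		n++;
	}

	return n;
}
```

With an unmasked "tail != head" test, a tail that has already passed
head would have to wrap the whole size_t range before the loop ends.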


[PATCH 2/2] n_tty: Access echo_* variables carefully.

2018-05-25 Thread Tetsuo Handa
syzbot is reporting stalls at __process_echoes() [1]. This is because
ldata->echo_commit < ldata->echo_tail becomes true for some reason, and
the discard loop then acts as an almost infinite loop. This patch tries
to avoid reaching the ldata->echo_commit < ldata->echo_tail situation by
accessing the echo_* variables more carefully.

Since reset_buffer_flags() is called without output_lock held, it should
not touch echo_* variables. And omit a call to reset_buffer_flags() from
n_tty_open() by using vzalloc().

Since add_echo_byte() is called without output_lock held, it needs a
memory barrier between storing into echo_buf[] and incrementing the
echo_head counter. echo_buf() needs the corresponding memory barrier
before reading echo_buf[]. Failing to handle a not-yet-stored multi-byte
operation might be how we end up in the ldata->echo_commit <
ldata->echo_tail situation, for if I do
WARN_ON(ldata->echo_commit == tail + 1) prior to
echo_buf(ldata, tail + 1), the WARN_ON() fires.
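
The store-then-publish ordering described above can be sketched in
user space. Note the substitution: this uses C11 release/acquire
atomics rather than the kernel's smp_wmb()/smp_rmb(), and struct
echo_ring, ring_add_byte() and ring_take_byte() are hypothetical names
for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ECHO_BUF_SIZE 4096	/* assumed power-of-two size */

struct echo_ring {
	unsigned char buf[ECHO_BUF_SIZE];
	atomic_size_t head;	/* advanced by the producer */
	size_t tail;		/* owned by the consumer */
};

/* Producer: store the byte first, then publish the new head (release). */
static void ring_add_byte(struct echo_ring *r, unsigned char c)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	r->buf[head & (ECHO_BUF_SIZE - 1)] = c;
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
}

/*
 * Consumer: load head with acquire before touching buf[], so a byte is
 * only read after its store is visible. Returns 1 if a byte was taken.
 */
static int ring_take_byte(struct echo_ring *r, unsigned char *c)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (r->tail == head)
		return 0;

	*c = r->buf[r->tail & (ECHO_BUF_SIZE - 1)];
	r->tail++;

	return 1;
}
```

A multi-byte operation would be published the same way: store all of
its bytes, then advance head once, so the consumer never sees a partial
operation.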

Also, explicitly mask with the buffer size in the former "while" loop,
and check ldata->echo_commit > tail in the latter "while" loop.

[1] 
https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40

Signed-off-by: Tetsuo Handa 
Reported-by: syzbot 
Cc: Peter Hurley 
---
 drivers/tty/n_tty.c | 42 --
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index b279f873..4317422 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -143,6 +143,7 @@ static inline unsigned char *read_buf_addr(struct 
n_tty_data *ldata, size_t i)
 
 static inline unsigned char echo_buf(struct n_tty_data *ldata, size_t i)
 {
+   smp_rmb(); /* Matches smp_wmb() in add_echo_byte(). */
return ldata->echo_buf[i & (N_TTY_BUF_SIZE - 1)];
 }
 
@@ -318,9 +319,7 @@ static inline void put_tty_queue(unsigned char c, struct 
n_tty_data *ldata)
 static void reset_buffer_flags(struct n_tty_data *ldata)
 {
ldata->read_head = ldata->canon_head = ldata->read_tail = 0;
-   ldata->echo_head = ldata->echo_tail = ldata->echo_commit = 0;
ldata->commit_head = 0;
-   ldata->echo_mark = 0;
ldata->line_start = 0;
 
ldata->erasing = 0;
@@ -619,13 +618,20 @@ static size_t __process_echoes(struct tty_struct *tty)
old_space = space = tty_write_room(tty);
 
tail = ldata->echo_tail;
-   while (ldata->echo_commit != tail) {
+   while (MASK(ldata->echo_commit) != MASK(tail)) {
c = echo_buf(ldata, tail);
if (c == ECHO_OP_START) {
unsigned char op;
int no_space_left = 0;
 
/*
+* Since add_echo_byte() is called without holding
+* output_lock, we might see only portion of multi-byte
+* operation.
+*/
+   if (MASK(ldata->echo_commit) == MASK(tail + 1))
+   goto not_yet_stored;
+   /*
 * If the buffer byte is the start of a multi-byte
 * operation, get the next byte, which is either the
 * op code or a control character value.
@@ -636,6 +642,8 @@ static size_t __process_echoes(struct tty_struct *tty)
unsigned int num_chars, num_bs;
 
case ECHO_OP_ERASE_TAB:
+   if (MASK(ldata->echo_commit) == MASK(tail + 2))
+   goto not_yet_stored;
num_chars = echo_buf(ldata, tail + 2);
 
/*
@@ -730,7 +738,8 @@ static size_t __process_echoes(struct tty_struct *tty)
/* If the echo buffer is nearly full (so that the possibility exists
 * of echo overrun before the next commit), then discard enough
 * data at the tail to prevent a subsequent overrun */
-   while (ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) {
+   while (ldata->echo_commit > tail &&
+  ldata->echo_commit - tail >= ECHO_DISCARD_WATERMARK) {
if (echo_buf(ldata, tail) == ECHO_OP_START) {
if (echo_buf(ldata, tail + 1) == ECHO_OP_ERASE_TAB)
tail += 3;
@@ -740,6 +749,7 @@ static size_t __process_echoes(struct tty_struct *tty)
tail++;
}
 
+ not_yet_stored:
ldata->echo_tail = tail;
return old_space - space;
 }
@@ -750,6 +760,7 @@ static void commit_echoes(struct tty_struct *tty)
size_t nr, old, echoed;
size_t head;
 
+   mutex_lock(&ldata->output_lock);
head = ldata->echo_head;
ldata->echo_mark = head;
old = ldata->echo_commit - ldata->echo_tail;
@@ -758,10 +769,12 @@ static void commit_echoes(struct tty_struct *tty)
 * is 

Re: [PATCH v2] f2fs: let fstrim issue discard commands in lower priority

2018-05-25 Thread Chao Yu
On 2018/5/26 3:49, Jaegeuk Kim wrote:
> The fstrim gathers a huge number of large discard commands, and tries to issue
> them without IO awareness, which results in long user-perceived IO latencies on
> READ, WRITE, and FLUSH in UFS. We've observed some commands take several
> seconds due to long discard latency.
> 
> This patch limits the maximum size to 2MB per candidate, and checks IO
> congestion when issuing them to disk.
> 
> Signed-off-by: Jaegeuk Kim 

It looks good to me. :)

Reviewed-by: Chao Yu 

Thanks,



Re: [PATCH v3] f2fs: Fix deadlock in shutdown ioctl

2018-05-25 Thread Chao Yu
On 2018/5/18 14:21, Sahitya Tummala wrote:
> f2fs_ioc_shutdown() ioctl gets stuck in the below path
> when issued with F2FS_GOING_DOWN_FULLSYNC option.
> 
> __switch_to+0x90/0xc4
> percpu_down_write+0x8c/0xc0
> freeze_super+0xec/0x1e4
> freeze_bdev+0xc4/0xcc
> f2fs_ioctl+0xc0c/0x1ce0
> f2fs_compat_ioctl+0x98/0x1f0
> 
> Signed-off-by: Sahitya Tummala 

Reviewed-by: Chao Yu 

Thanks,




Re: [PATCH] xfs, proc: hide unused xfs procfs helpers

2018-05-25 Thread Al Viro
On Fri, May 25, 2018 at 06:22:04PM +0200, Christoph Hellwig wrote:
> Looks fine to me:
> 
> Reviewed-by: Christoph Hellwig 
> 
> Al, can you pick this up?

Done


Re: [PATCH] IB: Revert "remove redundant INFINIBAND kconfig dependencies"

2018-05-25 Thread Greg Thelen
On Fri, May 25, 2018 at 2:32 PM Arnd Bergmann  wrote:

> Several subsystems depend on INFINIBAND_ADDR_TRANS, which in turn depends
> on INFINIBAND. However, with CONFIG_INFINIBAND=m, this leads to a
> link error when another driver using it is built-in. The
> INFINIBAND_ADDR_TRANS dependency is insufficient here as this is
> a 'bool' symbol that does not force anything to be a module in turn.

> fs/cifs/smbdirect.o: In function `smbd_disconnect_rdma_work':
> smbdirect.c:(.text+0x1e4): undefined reference to `rdma_disconnect'
> net/9p/trans_rdma.o: In function `rdma_request':
> trans_rdma.c:(.text+0x7bc): undefined reference to `rdma_disconnect'
> net/9p/trans_rdma.o: In function `rdma_destroy_trans':
> trans_rdma.c:(.text+0x830): undefined reference to `ib_destroy_qp'
> trans_rdma.c:(.text+0x858): undefined reference to `ib_dealloc_pd'

> Fixes: 9533b292a7ac ("IB: remove redundant INFINIBAND kconfig dependencies")
> Signed-off-by: Arnd Bergmann 

Acked-by: Greg Thelen 

Sorry for the 9533b292a7ac problem.
At this point in the release cycle, I think Arnd's revert is best.

If there is interest, I've put a little thought into an alternative fix:
making INFINIBAND_ADDR_TRANS tristate.  But it's nontrivial.
So I prefer this simple revert for now.
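
The dependency problem in the quoted commit message can be modeled in a few
lines of C. This is a sketch under assumptions: Kconfig tristate values are
treated as ordered n < m < y, "&&" as the minimum of its operands, and
INFINIBAND_ADDR_TRANS is assumed to be enabled whenever INFINIBAND is not n.
All function names are hypothetical; NVME_RDMA is used only because its
dependency line appears in the patch.

```c
#include <assert.h>

/* Kconfig tristate values, ordered so that n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

static_assert(TRI_N < TRI_M && TRI_M < TRI_Y, "ordering is relied on below");

/* In a Kconfig expression, "A && B" evaluates to the minimum of A and B. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* A bool symbol is either n or y: a dependency of m still lets it be y. */
static enum tristate bool_value(enum tristate dep)
{
	return dep == TRI_N ? TRI_N : TRI_Y;
}

/* Upper limit with "depends on INFINIBAND_ADDR_TRANS && BLOCK". */
static enum tristate nvme_rdma_limit_before(enum tristate infiniband,
					    enum tristate block)
{
	return tri_and(bool_value(infiniband), block);
}

/* Upper limit with "depends on INFINIBAND && INFINIBAND_ADDR_TRANS && BLOCK". */
static enum tristate nvme_rdma_limit_after(enum tristate infiniband,
					   enum tristate block)
{
	return tri_and(infiniband, tri_and(bool_value(infiniband), block));
}
```

With INFINIBAND=m and BLOCK=y, the first form allows the driver to be y, which
is the built-in-against-module link error; the second form caps it at m.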

Doug: do you need anything from me on this?

> ---
> The patch that introduced the problem has been queued in the
> rdma-fixes/for-rc tree. Please revert the patch before sending
> the branch to Linus.
> ---
> drivers/infiniband/ulp/srpt/Kconfig | 2 +-
> drivers/nvme/host/Kconfig   | 2 +-
> drivers/nvme/target/Kconfig | 2 +-
> drivers/staging/lustre/lnet/Kconfig | 2 +-
> fs/cifs/Kconfig | 2 +-
> net/9p/Kconfig  | 2 +-
> net/rds/Kconfig | 2 +-
> net/sunrpc/Kconfig  | 2 +-
> 8 files changed, 8 insertions(+), 8 deletions(-)

> diff --git a/drivers/infiniband/ulp/srpt/Kconfig b/drivers/infiniband/ulp/srpt/Kconfig
> index 25bf6955b6d0..fb8b7182f05e 100644
> --- a/drivers/infiniband/ulp/srpt/Kconfig
> +++ b/drivers/infiniband/ulp/srpt/Kconfig
> @@ -1,6 +1,6 @@
> config INFINIBAND_SRPT
>tristate "InfiniBand SCSI RDMA Protocol target support"
> -   depends on INFINIBAND_ADDR_TRANS && TARGET_CORE
> +   depends on INFINIBAND && INFINIBAND_ADDR_TRANS && TARGET_CORE
>---help---

>  Support for the SCSI RDMA Protocol (SRP) Target driver. The
> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
> index dbb7464c018c..88a8b5916624 100644
> --- a/drivers/nvme/host/Kconfig
> +++ b/drivers/nvme/host/Kconfig
> @@ -27,7 +27,7 @@ config NVME_FABRICS

> config NVME_RDMA
>tristate "NVM Express over Fabrics RDMA host driver"
> -   depends on INFINIBAND_ADDR_TRANS && BLOCK
> +   depends on INFINIBAND && INFINIBAND_ADDR_TRANS && BLOCK
>select NVME_CORE
>select NVME_FABRICS
>select SG_POOL
> diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
> index 7595664ee753..3c7b61ddb0d1 100644
> --- a/drivers/nvme/target/Kconfig
> +++ b/drivers/nvme/target/Kconfig
> @@ -27,7 +27,7 @@ config NVME_TARGET_LOOP

> config NVME_TARGET_RDMA
>tristate "NVMe over Fabrics RDMA target support"
> -   depends on INFINIBAND_ADDR_TRANS
> +   depends on INFINIBAND && INFINIBAND_ADDR_TRANS
>depends on NVME_TARGET
>select SGL_ALLOC
>help
> diff --git a/drivers/staging/lustre/lnet/Kconfig b/drivers/staging/lustre/lnet/Kconfig
> index f3b1ad4bd3dc..ad049e6f24e4 100644
> --- a/drivers/staging/lustre/lnet/Kconfig
> +++ b/drivers/staging/lustre/lnet/Kconfig
> @@ -34,7 +34,7 @@ config LNET_SELFTEST

> config LNET_XPRT_IB
>tristate "LNET infiniband support"
> -   depends on LNET && PCI && INFINIBAND_ADDR_TRANS
> +   depends on LNET && PCI && INFINIBAND && INFINIBAND_ADDR_TRANS
>default LNET && INFINIBAND
>help
>  This option allows the LNET users to use infiniband as an
> diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
> index d61e2de8d0eb..5f132d59dfc2 100644
> --- a/fs/cifs/Kconfig
> +++ b/fs/cifs/Kconfig
> @@ -197,7 +197,7 @@ config CIFS_SMB311

> config CIFS_SMB_DIRECT
>bool "SMB Direct support (Experimental)"
> -   depends on CIFS=m && INFINIBAND_ADDR_TRANS || CIFS=y && INFINIBAND_ADDR_TRANS=y
> +   depends on CIFS=m && INFINIBAND && INFINIBAND_ADDR_TRANS || CIFS=y && INFINIBAND=y && INFINIBAND_ADDR_TRANS=y
>help
>  Enables SMB Direct experimental support for SMB 3.0, 3.02 and 3.1.1.
>  SMB Direct allows transferring SMB packets over RDMA. If unsure,
> diff --git a/net/9p/Kconfig b/net/9p/Kconfig
> index 46c39f7da444..e6014e0e51f7 100644
> --- a/net/9p/Kconfig
> +++ b/net/9p/Kconfig
> @@ -32,7 +32,7 @@ config NET_9P_XEN


> config NET_9P_RDMA
> -   depends on INET && 

Re: define struct workqueue_struct in C file

2018-05-25 Thread Al Viro
On Thu, May 24, 2018 at 11:10:14PM +0800, Liu, Changcheng wrote:
> Hi all,
>   I have one confusion about workqueue_struct:
>   1) Why struct workqueue_struct is defined in C file instead of
>   header file?

To prevent all other code poking in its guts?

>  I'm trying to print "workqueue_struct:name" field in one external
>  build module. "workqueue_struct:name" can't be accessed directly.

... thus allowing implementation details to be changed at a later point without
worrying about breaking other code.

There are objects that are accessible via opaque pointers, with the set of
(widely used) primitives being the only code that has access to the actual
structure layout.  Some are in a local header (outside of the include search
path, and very much *NOT* supposed to be included outside of a few C
files implementing the public interfaces), some are outright local to a
C file.
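
A minimal sketch of the opaque-pointer pattern, in plain userspace C. The type
struct wq and the functions wq_create(), wq_name() and wq_destroy() are
hypothetical names for illustration; this does not claim that the kernel
offers such an accessor for workqueue_struct.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* What a public header would expose: an incomplete type plus primitives. */
struct wq;                                /* layout is not visible to users */
struct wq *wq_create(const char *name);
const char *wq_name(const struct wq *wq);
void wq_destroy(struct wq *wq);

/* What stays private to the implementing C file. */
struct wq {
	char name[24];
	unsigned int flags;
};

struct wq *wq_create(const char *name)
{
	struct wq *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return NULL;
	/* snprintf() truncates over-long names and always terminates. */
	snprintf(wq->name, sizeof(wq->name), "%s", name);
	return wq;
}

/* The only sanctioned way for outside code to read the name. */
const char *wq_name(const struct wq *wq)
{
	assert(wq != NULL);
	return wq->name;
}

void wq_destroy(struct wq *wq)
{
	free(wq);
}
```

Code that sees only the first four declarations can pass struct wq pointers
around but cannot dereference them, so the fields can change without touching
any caller.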


Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO

2018-05-25 Thread Suren Baghdasaryan
Hi Johannes,
I tried your previous memdelay patches before this new set was posted
and the results were promising for predicting when an Android system is close
to OOM. I'm definitely going to try this one after I backport it to
4.9.

On Mon, May 7, 2018 at 2:01 PM, Johannes Weiner  wrote:
> Hi,
>
> I previously submitted a version of this patch set called "memdelay",
> which translated delays from reclaim, swap-in, thrashing page cache
> into a pressure percentage of lost walltime. I've since extended this
> code to aggregate all delay states tracked by delayacct in order to
> have generalized pressure/overcommit levels for CPU, memory, and IO.
>
> There was feedback from Peter on the previous version that I have
> incorporated as much as possible and as it still applies to this code:
>
> - got rid of the extra lock in the sched callbacks; all task
>   state changes we care about serialize through rq->lock
>
> - got rid of ktime_get() inside the sched callbacks and
>   switched time measuring to rq_clock()
>
> - got rid of all divisions inside the sched callbacks,
>   tracking everything natively in ns now
>
> I also moved this stuff into existing sched/stat.h callbacks, so it
> doesn't get in the way in sched/core.c, and of course moved the whole
> thing behind CONFIG_PSI since not everyone is going to want it.

Would it make sense to split CONFIG_PSI into CONFIG_PSI_CPU,
CONFIG_PSI_MEM and CONFIG_PSI_IO, since one might need only a specific
subset of this feature?

>
> Real-world applications
>
> Since the last posting, we've begun using the data collected by this
> code quite extensively at Facebook, and with several success stories.
>
> First we used it on systems that frequently locked up in low memory
> situations. The reason this happens is that the OOM killer is
> triggered by reclaim not being able to make forward progress, but with
> fast flash devices there is *always* some clean and uptodate cache to
> reclaim; the OOM killer never kicks in, even as tasks wait 80-90% of
> the time faulting executables. There is no situation where this ever
> makes sense in practice. We wrote a <100 line POC python script to
> monitor memory pressure and kill stuff manually, way before such
> pathological thrashing.
>
> We've since extended the python script into a more generic oomd that
> we use all over the place, not just to avoid livelocks but also to
> guarantee latency and throughput SLAs, since they're usually violated
> way before the kernel OOM killer would ever kick in.
>
> We also use the memory pressure info for loadshedding. Our batch job
> infrastructure used to refuse new requests on heuristics based on RSS
> and other existing VM metrics in an attempt to avoid OOM kills and
> maximize utilization. Since it was still plagued by frequent OOM
> kills, we switched it to shed load on psi memory pressure, which has
> turned out to be a much better bellwether, and we managed to reduce
> OOM kills drastically. Reducing the rate of OOM outages from the
> worker pool raised its aggregate productivity, and we were able to
> switch that service to smaller machines.
>
> Lastly, we use cgroups to isolate a machine's main workload from
> maintenance crap like package upgrades, logging, configuration, as
> well as to prevent multiple workloads on a machine from stepping on
> each others' toes. We were not able to do this properly without the
> pressure metrics; we would see latency or bandwidth drops, but it
> would often be hard to impossible to rootcause it post-mortem. We now
> log and graph the pressure metrics for all containers in our fleet and
> can trivially link service drops to resource pressure after the fact.
>
> How do you use this?
>
> A kernel with CONFIG_PSI=y will create a /proc/pressure directory with
> 3 files: cpu, memory, and io. If using cgroup2, cgroups will also have
> cpu.pressure, memory.pressure and io.pressure files, which simply
> calculate pressure at the cgroup level instead of system-wide.
>
> The cpu file contains one line:
>
> some avg10=2.04 avg60=0.75 avg300=0.40 total=157656722
>
> The averages give the percentage of walltime in which some tasks are
> delayed on the runqueue while another task has the CPU. They're recent
> averages over 10s, 1m, 5m windows, so you can tell short term trends
> from long term ones, similarly to the load average.
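
For reference, a line in the format quoted above could be parsed roughly as
follows. This is a sketch: struct psi_line and psi_parse_some() are
hypothetical names, and only the "some" line shown in the quoted text is
handled.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one pressure line, named after the keys in the quoted example. */
struct psi_line {
	double avg10, avg60, avg300;      /* percentage of walltime */
	unsigned long long total;         /* cumulative counter, as printed */
};

/* Parse one "some ..." line; returns 0 on success, -1 on malformed input. */
static int psi_parse_some(const char *line, struct psi_line *out)
{
	int n;

	assert(line != NULL && out != NULL);
	n = sscanf(line, "some avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   &out->avg10, &out->avg60, &out->avg300, &out->total);
	return n == 4 ? 0 : -1;
}
```

A monitor would read /proc/pressure/cpu (or a cgroup's cpu.pressure file),
pass the line to psi_parse_some(), and compare avg10 against avg300 to tell a
short spike from sustained pressure.
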
>
> What to make of this number? If CPU utilization is at 100% and CPU
> pressure is 0, it means the system is perfectly utilized, with one
> runnable thread per CPU and nobody waiting. At two or more runnable
> tasks per CPU, the system is 100% overcommitted and the pressure
> average will indicate as much. From a utilization perspective this is
> a great state of course: no CPU cycles are being wasted, even when 50%
> of the threads were to go idle (and most workloads do vary). From the
> perspective of the individual job it's not great, however, and they
> might do better with more 

Re: [PATCH, net-next] qcom-emag: hide ACPI specific functions

2018-05-25 Thread Timur Tabi

On 5/25/18 4:37 PM, Arnd Bergmann wrote:

+#ifdef CONFIG_ACPI
  static int emac_sgmii_irq_clear(struct emac_adapter *adpt, u8 irq_bits)
  {
struct emac_sgmii *phy = &adpt->phy;
@@ -288,6 +289,7 @@ static struct sgmii_ops qdf2400_ops = {
.link_change = emac_sgmii_common_link_change,
.reset = emac_sgmii_common_reset,
  };
+#endif


This seems wrong.  The SGMII interrupt handler should still be viable on 
a device-tree system.  There is a DT compatibility entry for the qdf2432.


Looks like the most recent patch on net-next broke DT support when it 
removed these lines:


-   phy->open = emac_sgmii_open;
-   phy->close = emac_sgmii_close;
-   phy->link_up = emac_sgmii_link_up;
-   phy->link_down = emac_sgmii_link_down;

I'll take a look at it next week when I'm back in the office.

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


[BUG] perf/inject: crash in pipe mode

2018-05-25 Thread Stephane Eranian
Hi,


With the latest tip.git perf, if you run

$ perf record -a -o - sleep 2 | perf inject -b -i - | perf buildid-list -i -
you get a SEGFAULT in perf inject:
free_dup_event (oe=0x55d25b88, oe=0x55d25b88,
event=0x3030310931303031) at util/ordered-events.c:86
86 oe->cur_alloc_size -= event->header.size;
(gdb) bt
#0  free_dup_event (oe=0x55d25b88, oe=0x55d25b88,
event=0x3030310931303031) at util/ordered-events.c:86
#1  ordered_events__free (oe=oe@entry=0x55d25b88) at
util/ordered-events.c:310
#2  0x557964f8 in __perf_session__process_pipe_events
(session=0x55d1f910) at util/session.c:1778
#3  perf_session__process_events (session=session@entry=0x55d1f910) at
util/session.c:1958
#4  0x556ef9b2 in __cmd_inject (inject=0x7fffda40) at
builtin-inject.c:697
#5  cmd_inject (argc=<optimized out>, argv=<optimized out>) at
builtin-inject.c:871
#6  0x5572e8b1 in run_builtin (p=0x55be8f98 ,
argc=4, argv=0x7fffe460) at perf.c:303
#7  0x5572ebae in handle_internal_command (argc=4,
argv=0x7fffe460) at perf.c:355
#8  0x556ae1e1 in run_argv (argcp=<optimized out>,
argv=<optimized out>) at perf.c:399
#9  main (argc=<optimized out>, argv=0x7fffe460) at perf.c:521

In general, I think pipe mode is not very well tested. I think it could
be made the default file format: I believe perf can autodetect file vs.
pipe mode perf.data using the header.size field. This would simplify a
few things inside perf and ensure that the pipe mode format is well tested.
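
A minimal sketch of the autodetection idea, not perf's actual code. It
assumes that both header flavours start with a 64-bit magic followed by a
64-bit size, that the magic is the byte string "PERFILE2", and that a pipe
mode header reports a size equal to that two-field header itself. The names
classify_header, pipe_header_sketch and perf_data_kind are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed common prefix of both header flavours: magic, then size. */
struct pipe_header_sketch {
	uint64_t magic;
	uint64_t size;
};

enum perf_data_kind { PERF_DATA_UNKNOWN, PERF_DATA_PIPE, PERF_DATA_FILE };

/* Assumed magic bytes, used here for illustration only. */
static const char sketch_magic[8] = { 'P', 'E', 'R', 'F', 'I', 'L', 'E', '2' };

static enum perf_data_kind classify_header(const void *buf, size_t len)
{
	struct pipe_header_sketch hdr;

	assert(buf != NULL);
	if (len < sizeof(hdr))
		return PERF_DATA_UNKNOWN;

	memcpy(&hdr, buf, sizeof(hdr));
	if (memcmp(&hdr.magic, sketch_magic, sizeof(sketch_magic)) != 0)
		return PERF_DATA_UNKNOWN;

	/* Host-endian only; byte-swapped input is not handled in this sketch. */
	if (hdr.size == sizeof(hdr))
		return PERF_DATA_PIPE;	/* header is just magic + size */
	if (hdr.size > sizeof(hdr))
		return PERF_DATA_FILE;	/* larger header: on-disk file layout */
	return PERF_DATA_UNKNOWN;
}
```

With something like this at the front of the reader, the same code path
could accept either layout without the caller saying which one it has.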


Re: aio poll and a new in-kernel poll API V13

2018-05-25 Thread Al Viro
On Wed, May 23, 2018 at 09:19:49PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series adds support for the IOCB_CMD_POLL operation to poll for the
> readyness of file descriptors using the aio subsystem.  The API is based
> on patches that existed in RHAS2.1 and RHEL3, which means it already is
> supported by libaio.  To implement the poll support efficiently new
> methods to poll are introduced in struct file_operations:  get_poll_head
> and poll_mask.  The first one returns a wait_queue_head to wait on
> (lifetime is bound by the file), and the second does a non-blocking
> check for the POLL* events.  This allows aio poll to work without
> any additional context switches, unlike epoll.
> 
> This series sits on top of the aio-fsync series that also includes
> support for io_pgetevents.

OK, I can live with that, except for one problem - the first patch shouldn't
be sitting on top of arseloads of next window fodder.

Please, rebase the rest of the series on top of merge of vfs.git#fixes
(4faa99965e02) with your aio-fsync.4 and tell me what to pull.


[GIT PULL] Transition vfs to 64-bit timestamps

2018-05-25 Thread Deepa Dinamani
Hi Thomas,

This is a pull request for the series that I sent earlier.
The series aims to switch vfs timestamps to use
struct timespec64. Currently vfs uses struct timespec,
which is not y2038 safe.

The flag day patch applies cleanly. I've not seen the timestamp
update logic change often. The series applies cleanly on 4.17-rc6
and linux-next tip (top commit: next-20180517).

I'm not sure how to merge this kind of series with a flag day patch.
We are targeting 4.18 for this.
Let me know if you have other suggestions.

The series involves the following:
1. Add vfs helper functions for supporting struct timespec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
   replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
   This is a flag day patch.

I've tried to keep the conversions with the script simple, to
aid in the reviews. I've kept all the internal filesystem data
structures and function signatures the same.
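
For context, a 32-bit signed seconds counter tops out at 2147483647
(2038-01-19 03:14:07 UTC), which is why the seconds field has to become
64-bit. Below is a userspace sketch of what a granularity-truncating helper
along the lines of timespec64_truncate() might look like. The struct and
function names are hypothetical stand-ins, not the code from this series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for a timestamp with 64-bit seconds. */
struct ts64_sketch {
	int64_t tv_sec;		/* wide enough to pass 2038 without wrapping */
	long tv_nsec;		/* 0 .. 999999999 */
};

/*
 * Round the timestamp down to a multiple of gran nanoseconds, where gran
 * is the time granularity a filesystem can store (1 ns up to 1 s).
 */
static struct ts64_sketch ts64_truncate_sketch(struct ts64_sketch t, long gran)
{
	assert(gran >= 1 && gran <= SKETCH_NSEC_PER_SEC);

	if (gran == SKETCH_NSEC_PER_SEC)
		t.tv_nsec = 0;			/* whole-second granularity */
	else if (gran > 1)
		t.tv_nsec -= t.tv_nsec % gran;	/* drop the sub-gran remainder */
	/* gran == 1: nanosecond granularity, nothing to drop */
	return t;
}
```

Only tv_nsec is touched, so the helper behaves the same for any value of
the 64-bit seconds field.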

Next steps:
1. Convert APIs that can handle timespec64, instead of converting
   timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions.

-- 
Changes from v1:
* Fix the pointer bug in the udf patch.
* Include Kees's patch for pstore.
* Fix hostfs regression found by kbuild bot.

The following changes since commit 771c577c23bac90597c685971d7297ea00f99d11:

  Linux 4.17-rc6 (2018-05-20 15:31:38 -0700)

are available in the Git repository at:

  https://github.com/deepa-hub/vfs vfs_timespec64

for you to fetch changes up to 213ae530f442029284ab6041812df0571ebd9da7:

  vfs: change inode times to use struct timespec64 (2018-05-25 15:31:14 -0700)


Deepa Dinamani (6):
  fs: add timespec64_truncate()
  lustre: Use long long type to print inode time
  ceph: make inode time prints to be long long
  fs: nfs: get rid of memcpys for inode times
  udf: Simplify calls to udf_disk_stamp_to_time
  vfs: change inode times to use struct timespec64

Kees Cook (1):
  pstore: Convert internal records to timespec64

 drivers/firmware/efi/efi-pstore.c               | 27
 drivers/staging/lustre/lustre/llite/llite_lib.c | 12 ++--
 drivers/staging/lustre/lustre/llite/namei.c     |  5 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c     |  7 +-
 drivers/staging/lustre/lustre/mdc/mdc_reint.c   |  6 +-
 drivers/staging/lustre/lustre/obdclass/obdo.c   |  6 +-
 drivers/tty/tty_io.c                            | 15 -
 drivers/usb/gadget/function/f_fs.c              |  2 +-
 fs/adfs/inode.c                                 |  7 +-
 fs/afs/fsclient.c                               |  2 +-
 fs/attr.c                                       | 14 ++--
 fs/bad_inode.c                                  |  2 +-
 fs/btrfs/file.c                                 |  6 +-
 fs/btrfs/inode.c                                |  8 +--
 fs/btrfs/ioctl.c                                |  4 +-
 fs/btrfs/root-tree.c                            |  4 +-
 fs/btrfs/transaction.c                          |  2 +-
 fs/ceph/addr.c                                  | 12 ++--
 fs/ceph/cache.c                                 |  4 +-
 fs/ceph/caps.c                                  |  6 +-
 fs/ceph/file.c                                  |  6 +-
 fs/ceph/inode.c                                 | 86 +
 fs/ceph/mds_client.c                            |  7 +-
 fs/ceph/snap.c                                  |  6 +-
 fs/cifs/cache.c                                 |  4 +-
 fs/cifs/fscache.c                               |  8 +--
 fs/cifs/inode.c                                 | 26
 fs/coda/coda_linux.c                            | 12 ++--
 fs/configfs/inode.c                             | 12 ++--
 fs/cramfs/inode.c                               |  2 +-
 fs/ext4/ext4.h                                  | 34 ++
 fs/ext4/ialloc.c                                |  4 +-
 fs/ext4/namei.c                                 |  2 +-
 fs/f2fs/f2fs.h                                  | 10 ++-
 fs/f2fs/file.c                                  | 12 ++--
 fs/f2fs/inode.c                                 | 12 ++--
 fs/f2fs/namei.c                                 |  4 +-
 fs/fat/inode.c                                  | 20 --
 fs/fat/namei_msdos.c                            | 21 +++---
 fs/fat/namei_vfat.c                             | 22 ---
 fs/fuse/inode.c                                 |  2 +-
 fs/gfs2/dir.c                                   |  6 +-
 fs/gfs2/glops.c                                 |  4 +-
 fs/hfs/inode.c                                  |  4 +-
 fs/hfsplus/inode.c                              | 12 ++--
 fs/hostfs/hostfs_kern.c                         | 12 ++--
 fs/inode.c                                      | 58 -
 fs/jffs2/dir.c  

Re: [PATCH v7 6/8] tracing: Centralize preemptirq tracepoints and unify their usage

2018-05-25 Thread Joel Fernandes
On Fri, May 25, 2018 at 08:43:39PM +0900, Namhyung Kim wrote:
> Hi Joel,
> 
> On Wed, May 23, 2018 at 06:21:55PM -0700, Joel Fernandes wrote:
> > From: "Joel Fernandes (Google)" 
> > 
> > This patch detaches the preemptirq tracepoints from the tracers and
> > keeps it separate.
> > 
> > Advantages:
> > * Lockdep and irqsoff event can now run in parallel since they no longer
> > have their own calls.
> > 
> > * This unifies the usecase of adding hooks to an irqsoff and irqson
> > event, and a preemptoff and preempton event.
> >   3 users of the events exist:
> >   - Lockdep
> >   - irqsoff and preemptoff tracers
> >   - irqs and preempt trace events
> > 
> > The unification cleans up several ifdefs and makes the code in preempt
> > tracer and irqsoff tracers simpler. It gets rid of all the horrific
> > ifdeferry around PROVE_LOCKING and makes configuration of the different
> > users of the tracepoints more easy and understandable. It also gets rid
> > of the time_* function calls from the lockdep hooks used to call into
> > the preemptirq tracer which is not needed anymore. The negative delta in
> > lines of code in this patch is quite large too.
> > 
> [SNIP]
> >  
> >  #ifdef CONFIG_IRQSOFF_TRACER
> > +/*
> > + * We are only interested in hardirq on/off events:
> > + */
> > +static void tracer_hardirqs_on(void *none, unsigned long a0, unsigned long 
> > a1)
> > +{
> > +   if (!preempt_trace() && irq_trace())
> > +   stop_critical_timing(a0, a1);
> > +}
> > +
> > +static void tracer_hardirqs_off(void *none, unsigned long a0, unsigned 
> > long a1)
> > +{
> > +   if (!preempt_trace() && irq_trace())
> > +   start_critical_timing(a0, a1);
> > +}
> > +
> >  static int irqsoff_tracer_init(struct trace_array *tr)
> >  {
> > trace_type = TRACER_IRQS_OFF;
> >  
> > +   register_trace_irq_disable(tracer_hardirqs_off, NULL);
> > +   register_trace_irq_enable(tracer_hardirqs_on, NULL);
> > return __irqsoff_tracer_init(tr);
> >  }
> >  
> >  static void irqsoff_tracer_reset(struct trace_array *tr)
> >  {
> > +   unregister_trace_irq_disable(tracer_hardirqs_off, NULL);
> > +   unregister_trace_irq_enable(tracer_hardirqs_on, NULL);
> > __irqsoff_tracer_reset(tr);
> >  }
> >  
> > @@ -692,19 +650,37 @@ static struct tracer irqsoff_tracer __read_mostly =
> >  };
> >  # define register_irqsoff(trace) register_tracer(&trace)
> >  #else
> > +static inline void tracer_hardirqs_on(unsigned long a0, unsigned long a1) 
> > { }
> > +static inline void tracer_hardirqs_off(unsigned long a0, unsigned long a1) 
> > { }
> 
> Just a nitpick.  These lines seem unnecessary since they're used
> only when CONFIG_IRQSOFF_TRACER is enabled AFAICS.
> 
> 
> >  # define register_irqsoff(trace) do { } while (0)
> > -#endif
> > +#endif /*  CONFIG_IRQSOFF_TRACER */
> >  
> >  #ifdef CONFIG_PREEMPT_TRACER
> > +static void tracer_preempt_on(void *none, unsigned long a0, unsigned long 
> > a1)
> > +{
> > +   if (preempt_trace() && !irq_trace())
> > +   stop_critical_timing(a0, a1);
> > +}
> > +
> > +static void tracer_preempt_off(void *none, unsigned long a0, unsigned long 
> > a1)
> > +{
> > +   if (preempt_trace() && !irq_trace())
> > +   start_critical_timing(a0, a1);
> > +}
> > +
> >  static int preemptoff_tracer_init(struct trace_array *tr)
> >  {
> > trace_type = TRACER_PREEMPT_OFF;
> >  
> > +   register_trace_preempt_disable(tracer_preempt_off, NULL);
> > +   register_trace_preempt_enable(tracer_preempt_on, NULL);
> > return __irqsoff_tracer_init(tr);
> >  }
> >  
> >  static void preemptoff_tracer_reset(struct trace_array *tr)
> >  {
> > +   unregister_trace_preempt_disable(tracer_preempt_off, NULL);
> > +   unregister_trace_preempt_enable(tracer_preempt_on, NULL);
> > __irqsoff_tracer_reset(tr);
> >  }
> >  
> > @@ -729,21 +705,32 @@ static struct tracer preemptoff_tracer __read_mostly =
> >  };
> >  # define register_preemptoff(trace) register_tracer(&trace)
> >  #else
> > +static inline void tracer_preempt_on(unsigned long a0, unsigned long a1) { 
> > }
> > +static inline void tracer_preempt_off(unsigned long a0, unsigned long a1) 
> > { }
> 
> Ditto (for CONFIG_PREEMPT_TRACER).
> 
> Thanks,
> Namhyung

Yes, you're right; that saves quite a few lines, actually! I also inlined
the register_tracer macros, which seems much cleaner. I will fold in the
diff below, but let me know if there's anything else.

Thanks Namhyung!

 - Joel
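
For readers skimming the thread, here is a small userspace model of the
pattern under discussion: the tracer attaches its callbacks to the
irq on/off events at init time and detaches them at reset, so the event
sites never call the tracer directly. Everything here is a hypothetical
stand-in (one probe slot per event, a counter instead of the
start/stop_critical_timing() calls); it is not the kernel tracepoint API.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*probe_fn)(void *data, unsigned long a0, unsigned long a1);

/* One probe slot per event keeps the sketch short. */
static probe_fn irq_disable_probe;
static probe_fn irq_enable_probe;

/* Stands in for the critical-section timing the tracer would do. */
static int critical_sections_open;

static int sketch_register(probe_fn *slot, probe_fn fn)
{
	assert(fn != NULL);
	if (*slot != NULL)
		return -1;	/* slot already taken */
	*slot = fn;
	return 0;
}

static void sketch_unregister(probe_fn *slot, probe_fn fn)
{
	if (*slot == fn)
		*slot = NULL;
}

/* Event sites: fire the probe if one is attached, otherwise do nothing. */
static void sketch_irq_disable_event(unsigned long a0, unsigned long a1)
{
	if (irq_disable_probe != NULL)
		irq_disable_probe(NULL, a0, a1);
}

static void sketch_irq_enable_event(unsigned long a0, unsigned long a1)
{
	if (irq_enable_probe != NULL)
		irq_enable_probe(NULL, a0, a1);
}

/* Tracer callbacks: only referenced by the init/reset pair below. */
static void sketch_hardirqs_off(void *none, unsigned long a0, unsigned long a1)
{
	(void)none; (void)a0; (void)a1;
	critical_sections_open++;
}

static void sketch_hardirqs_on(void *none, unsigned long a0, unsigned long a1)
{
	(void)none; (void)a0; (void)a1;
	critical_sections_open--;
}

static int sketch_tracer_init(void)
{
	if (sketch_register(&irq_disable_probe, sketch_hardirqs_off) != 0)
		return -1;
	if (sketch_register(&irq_enable_probe, sketch_hardirqs_on) != 0) {
		sketch_unregister(&irq_disable_probe, sketch_hardirqs_off);
		return -1;
	}
	return 0;
}

static void sketch_tracer_reset(void)
{
	sketch_unregister(&irq_disable_probe, sketch_hardirqs_off);
	sketch_unregister(&irq_enable_probe, sketch_hardirqs_on);
}
```

Because the callbacks are only named inside the tracer's own init and reset
functions, nothing outside the tracer's config block needs empty stubs for
them, which is the point of Namhyung's nitpick.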


---8<---

diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index d2d8284088f0..d0bcb51d1a2a 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -648,11 +648,6 @@ static struct tracer irqsoff_tracer __read_mostly =
.allow_instances = true,
.use_max_tr = true,
 };
-# define register_irqsoff(trace) register_tracer(&trace)
-#else
-static inline void tracer_hardirqs_on(unsigned long a0, unsigned long a1) { }
-static inline void tracer_hardirqs_off(unsigned long a0, unsigned long a1) { }

Re: [PATCH, net-next] net/mlx5e: fix TLS dependency

2018-05-25 Thread Saeed Mahameed
On Fri, 2018-05-25 at 23:36 +0200, Arnd Bergmann wrote:
> With CONFIG_TLS=m and MLX5_CORE_EN=y, we get a link failure:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.o: In
> function `mlx5e_tls_handle_ooo':
> tls_rxtx.c:(.text+0x24c): undefined reference to `tls_get_record'
> drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.o: In
> function `mlx5e_tls_handle_tx_skb':
> tls_rxtx.c:(.text+0x9a8): undefined reference to
> `tls_device_sk_destruct'
> 
> This narrows down the dependency to only allow the configurations
> that will actually work. The existing dependency on TLS_DEVICE is
> not sufficient here since MLX5_EN_TLS is a 'bool' symbol.
> 
> Fixes: c83294b9efa5 ("net/mlx5e: TLS, Add Innova TLS TX support")
> Signed-off-by: Arnd Bergmann 
> ---

LGTM

Acked-by: Saeed Mahameed 

Thank you Arnd!


>  drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> index ee6684779d11..2545296a0c08 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
> @@ -91,6 +91,7 @@ config MLX5_EN_TLS
>   bool "TLS cryptography-offload accelaration"
>   depends on MLX5_CORE_EN
>   depends on TLS_DEVICE
> + depends on TLS=y || MLX5_CORE=m
>   depends on MLX5_ACCEL
>   default n
>   ---help---
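
The reasoning behind the added "depends on" line can be checked with a tiny
model. This is an illustrative sketch only: it treats MLX5_CORE and TLS as
tristates, assumes the link fails exactly when built-in driver code
references symbols that live in a module, and all function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Built-in code cannot reference symbols that are only in a module. */
static bool links_ok(enum tristate mlx5_core, enum tristate tls)
{
	assert(mlx5_core <= TRI_Y && tls <= TRI_Y);
	return !(mlx5_core == TRI_Y && tls == TRI_M);
}

/* Models the added line: depends on TLS=y || MLX5_CORE=m */
static bool new_dependency_met(enum tristate mlx5_core, enum tristate tls)
{
	return tls == TRI_Y || mlx5_core == TRI_M;
}

/*
 * True when every configuration that has both symbols enabled and passes
 * the new dependency also links.
 */
static bool dependency_is_safe(void)
{
	for (int core = TRI_M; core <= TRI_Y; core++)
		for (int tls = TRI_M; tls <= TRI_Y; tls++)
			if (new_dependency_met(core, tls) && !links_ok(core, tls))
				return false;
	return true;
}
```

Of the four enabled combinations, only MLX5_CORE=y with TLS=m is rejected,
and that is the one that produced the undefined references.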

Re: mmotm 2018-05-25-14-52 uploaded (drivers/net/ethernet/ti/davinci_mdio.c)

2018-05-25 Thread Randy Dunlap
[forgot to add netdev]

On 05/25/2018 04:14 PM, Randy Dunlap wrote:
> On 05/25/2018 02:52 PM, a...@linux-foundation.org wrote:
>> The mm-of-the-moment snapshot 2018-05-25-14-52 has been uploaded to
>>
>>http://www.ozlabs.org/~akpm/mmotm/
>>
>> mmotm-readme.txt says
>>
>> README for mm-of-the-moment:
>>
>> http://www.ozlabs.org/~akpm/mmotm/
>>
>> This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
>> more than once a week.
> 
> on x86_64:
> # CONFIG_OF is not set
> 
>   CC  drivers/net/ethernet/ti/davinci_cpdma.o
> ../drivers/net/ethernet/ti/davinci_mdio.c: In function 'davinci_mdio_probe':
> ../drivers/net/ethernet/ti/davinci_mdio.c:380:3: error: implicit declaration 
> of function 'davinci_mdio_probe_dt' [-Werror=implicit-function-declaration]
>    ret = davinci_mdio_probe_dt(&data->pdata, pdev);
> 
> 
> 
>>
>> You will need quilt to apply these patches to the latest Linus release (4.x
>> or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
>> http://ozlabs.org/~akpm/mmotm/series
>>
>> The file broken-out.tar.gz contains two datestamp files: .DATE and
>> .DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
>> followed by the base kernel version against which this patch series is to
>> be applied.
>>
>> This tree is partially included in linux-next.  To see which patches are
>> included in linux-next, consult the `series' file.  Only the patches
>> within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
>> linux-next.
>>
>> A git tree which contains the memory management portion of this tree is
>> maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
>> by Michal Hocko.  It contains the patches which are between the
>> "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
>> file, http://www.ozlabs.org/~akpm/mmotm/series.
>>
>>
>> A full copy of the full kernel tree with the linux-next and mmotm patches
>> already applied is available through git within an hour of the mmotm
>> release.  Individual mmotm releases are tagged.  The master branch always
>> points to the latest release, so it's constantly rebasing.
>>
>> http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/
>>
>> To develop on top of mmotm git:
>>
>>   $ git remote add mmotm 
>> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
>>   $ git remote update mmotm
>>   $ git checkout -b topic mmotm/master
>>   
>>   $ git send-email mmotm/master.. [...]
>>
>> To rebase a branch with older patches to a new mmotm release:
>>
>>   $ git remote update mmotm
>>   $ git rebase --onto mmotm/master  topic
>>
>>
>>
>>
>> The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
>> contains daily snapshots of the -mm tree.  It is updated more frequently
>> than mmotm, and is untested.
>>
>> A git copy of this tree is available at
>>
>>  http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/
>>
>> and use of this tree is similar to
>> http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.
>>
>>
>> This mmotm tree contains the following patches against 4.17-rc6:
>> (patches marked "*" will be included in linux-next)
>>
> 
> 


-- 
~Randy


Re: mmotm 2018-05-25-14-52 uploaded (drivers/net/ethernet/ti/davinci_mdio.c)

2018-05-25 Thread Randy Dunlap
On 05/25/2018 02:52 PM, a...@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2018-05-25-14-52 has been uploaded to
> 
>http://www.ozlabs.org/~akpm/mmotm/
> 
> mmotm-readme.txt says
> 
> README for mm-of-the-moment:
> 
> http://www.ozlabs.org/~akpm/mmotm/
> 
> This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
> more than once a week.

on x86_64:
# CONFIG_OF is not set

  CC  drivers/net/ethernet/ti/davinci_cpdma.o
../drivers/net/ethernet/ti/davinci_mdio.c: In function 'davinci_mdio_probe':
../drivers/net/ethernet/ti/davinci_mdio.c:380:3: error: implicit declaration of function 'davinci_mdio_probe_dt' [-Werror=implicit-function-declaration]
   ret = davinci_mdio_probe_dt(&data->pdata, pdev);
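
An error like the one above typically appears when a helper is only defined
under a config #ifdef but is called unconditionally. A common way to keep such
a call site building is to provide a stub for the disabled case. The following
is a minimal userspace sketch of that pattern, not the fix that was applied to
the driver: DEMO_HAVE_OF, struct demo_pdata, demo_probe_dt() and demo_probe()
are hypothetical names, and the numeric values are arbitrary.

```c
#include <errno.h>

/* Hypothetical stand-in for a kernel config symbol such as CONFIG_OF.
 * Uncomment to build the "real" helper instead of the stub. */
/* #define DEMO_HAVE_OF 1 */

struct demo_pdata {
	unsigned long bus_freq;	/* arbitrary illustrative field */
};

#ifdef DEMO_HAVE_OF
/* Real helper: only compiled when the option is enabled. */
int demo_probe_dt(struct demo_pdata *pdata)
{
	pdata->bus_freq = 2;	/* pretend this came from the device tree */
	return 0;
}
#else
/* Stub: keeps callers compiling when the option is disabled. */
static inline int demo_probe_dt(struct demo_pdata *pdata)
{
	(void)pdata;
	return -EINVAL;
}
#endif

/* The call site needs no #ifdef because a declaration always exists. */
int demo_probe(struct demo_pdata *pdata, int has_dt_node)
{
	if (has_dt_node)
		return demo_probe_dt(pdata);

	pdata->bus_freq = 1;	/* arbitrary default for the non-DT case */
	return 0;
}
```

With the option left undefined, demo_probe() still compiles, and the stub
reports -EINVAL if the device-tree path is ever taken.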



> 
> You will need quilt to apply these patches to the latest Linus release (4.x
> or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
> http://ozlabs.org/~akpm/mmotm/series
> 
> The file broken-out.tar.gz contains two datestamp files: .DATE and
> .DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
> followed by the base kernel version against which this patch series is to
> be applied.
> 
> This tree is partially included in linux-next.  To see which patches are
> included in linux-next, consult the `series' file.  Only the patches
> within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
> linux-next.
> 
> A git tree which contains the memory management portion of this tree is
> maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> by Michal Hocko.  It contains the patches which are between the
> "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
> file, http://www.ozlabs.org/~akpm/mmotm/series.
> 
> 
> A full copy of the full kernel tree with the linux-next and mmotm patches
> already applied is available through git within an hour of the mmotm
> release.  Individual mmotm releases are tagged.  The master branch always
> points to the latest release, so it's constantly rebasing.
> 
> http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/
> 
> To develop on top of mmotm git:
> 
>   $ git remote add mmotm git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
>   $ git remote update mmotm
>   $ git checkout -b topic mmotm/master
>   
>   $ git send-email mmotm/master.. [...]
> 
> To rebase a branch with older patches to a new mmotm release:
> 
>   $ git remote update mmotm
>   $ git rebase --onto mmotm/master <topic base> topic
> 
> 
> 
> 
> The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
> contains daily snapshots of the -mm tree.  It is updated more frequently
> than mmotm, and is untested.
> 
> A git copy of this tree is available at
> 
>   http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/
> 
> and use of this tree is similar to
> http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.
> 
> 
> This mmotm tree contains the following patches against 4.17-rc6:
> (patches marked "*" will be included in linux-next)
> 


-- 
~Randy


[PATCH] perf tools: Fix indexing for decoder packet queue

2018-05-25 Thread Mathieu Poirier
The tail of a queue is supposed to be pointing to the next available slot
in a queue.  In this implementation the tail is incremented before it is
used and as such points to the last used element, something that has the
immense advantage of centralizing tail management at a single location
and eliminating a lot of redundant code.

But this needs to be taken into consideration on the dequeueing side where
the head also needs to be incremented before it is used, or the first
available element of the queue will be skipped.

Signed-off-by: Mathieu Poirier 
---
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index c8b98fa22997..4d5fc374e730 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -96,11 +96,19 @@ int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder,
 	/* Nothing to do, might as well just return */
 	if (decoder->packet_count == 0)
 		return 0;
+	/*
+	 * The queueing process in function cs_etm_decoder__buffer_packet()
+	 * increments the tail *before* using it.  This is somewhat counter
+	 * intuitive but it has the advantage of centralizing tail management
+	 * at a single location.  Because of that we need to follow the same
+	 * heuristic with the head, i.e we increment it before using its
+	 * value.  Otherwise the first element of the packet queue is not
+	 * used.
+	 */
+	decoder->head = (decoder->head + 1) & (MAX_BUFFER - 1);
 
 	*packet = decoder->packet_buffer[decoder->head];
 
-	decoder->head = (decoder->head + 1) & (MAX_BUFFER - 1);
-
 	decoder->packet_count--;
 
 	return 1;
-- 
2.7.4
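
The indexing rule described in the patch can be sketched outside of perf as a
small ring buffer. This is a minimal illustration under assumptions, not the
decoder's code: struct demo_queue, demo_put() and demo_get() are hypothetical
names, the element type is a plain int, and DEMO_MAX_BUFFER is an arbitrary
power of two so that the mask wraps the index the same way the patch does.

```c
/* Hypothetical capacity; must be a power of two for the mask to wrap. */
#define DEMO_MAX_BUFFER 8

struct demo_queue {
	int buf[DEMO_MAX_BUFFER];
	unsigned int head;	/* index of the last element dequeued */
	unsigned int tail;	/* index of the last element enqueued */
	unsigned int count;	/* number of elements currently queued */
};

/* Enqueue: advance the tail first, then store at the new tail. */
int demo_put(struct demo_queue *q, int value)
{
	if (q->count == DEMO_MAX_BUFFER)
		return 0;

	q->tail = (q->tail + 1) & (DEMO_MAX_BUFFER - 1);
	q->buf[q->tail] = value;
	q->count++;
	return 1;
}

/* Dequeue: mirror the enqueue side, advance the head first, then read. */
int demo_get(struct demo_queue *q, int *value)
{
	if (q->count == 0)
		return 0;

	q->head = (q->head + 1) & (DEMO_MAX_BUFFER - 1);
	*value = q->buf[q->head];
	q->count--;
	return 1;
}
```

If demo_get() read buf[head] before advancing, the first read on a
zero-initialized queue would return slot 0, which demo_put() never wrote,
and every later read would lag one element behind.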



RE: [PATCH V4 0/3] Use efi_rts_wq to invoke EFI Runtime Services

2018-05-25 Thread Prakhya, Sai Praneeth
> > Changes from V3 to V4:
> > --
> > 1. As suggested by Peter, use completions instead of flush_work() as the
> >   former is cheaper
> > 2. Call efi_delete_dummy_variable() from efisubsys_init(). Sorry! Ard,
> >   wasn't able to find a better alternative to keep this change local to
> >   arch/x86.
> >
> 
> Two questions:
> - Should the non-blocking variants of the query and set_variable_store use the
> work queue? Doesn't that make them blocking?

That's a good question. I think you are right: calling the non-blocking
variants of efi_rts through a work queue makes them blocking. But I have a
follow-on question.

Assume a user requests a non-blocking variant of efi_rts and the kernel is
scheduled out before it has called efi_call_virt(). IOW, even though the user
requested a non-blocking EFI call, we might still block. Am I right?

With efi_rts_wq, I think I have widened the window in which we can get
blocked: the kernel needs to explicitly call schedule() to run the firmware,
so the chances of getting blocked are much higher.

Except for this wider window, I think the firmware is executed as before.

So, can you please explain to me the difference between the blocking and
non-blocking variants from the kernel's perspective?
(The way we take the lock is different: down_interruptible() vs down_trylock().)

> - If the non-blocking set_variable() does not use the work queue, can we just
> call it from efi_delete_dummy_variable(), and keep the calls where they are?

Yes, I think we can do that (if we don't use efi_rts_wq for the non-blocking
variants).

Regards,
Sai
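
The lock-acquisition difference mentioned above (down_interruptible() vs
down_trylock()) can be illustrated with a userspace analogy. This is a sketch
under assumptions, not the EFI runtime wrapper code: struct demo_lock, the
demo_* functions and the DEMO_* status values are hypothetical, C11 atomics
stand in for the kernel semaphore, and a counter increment stands in for the
firmware call.

```c
#include <stdatomic.h>

/* Hypothetical status codes for the sketch. */
enum demo_status { DEMO_OK = 0, DEMO_NOT_READY = 1 };

struct demo_lock {
	atomic_flag held;
};

/* Blocking acquire: does not return until the lock is ours. */
void demo_lock_blocking(struct demo_lock *l)
{
	while (atomic_flag_test_and_set(&l->held))
		;	/* a kernel semaphore would sleep here rather than spin */
}

/* Non-blocking acquire: returns 1 if taken, 0 if already held. */
int demo_lock_try(struct demo_lock *l)
{
	return !atomic_flag_test_and_set(&l->held);
}

void demo_lock_release(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Blocking variant: waits for the lock, then does the work. */
enum demo_status demo_call_blocking(struct demo_lock *l, int *calls)
{
	demo_lock_blocking(l);
	(*calls)++;		/* stands in for the firmware call */
	demo_lock_release(l);
	return DEMO_OK;
}

/* Non-blocking variant: gives up at once if the lock is busy. */
enum demo_status demo_call_nonblocking(struct demo_lock *l, int *calls)
{
	if (!demo_lock_try(l))
		return DEMO_NOT_READY;

	(*calls)++;		/* stands in for the firmware call */
	demo_lock_release(l);
	return DEMO_OK;
}
```

In this model the non-blocking variant never waits on the lock, so it fits
callers that cannot sleep. Handing the call to a work queue and waiting for it
to finish would add a wait that the try-lock path was meant to avoid, which is
the concern raised in the question.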


[GIT PULL] ARM: at91: DT for 4.18

2018-05-25 Thread Alexandre Belloni
Arnd, Olof,

I'm a bit late for this very small PR, as I had to extend the expiration
date for my GPG signature key.

Two small DT changes that have no functional impact.

The following changes since commit 60cc43fc888428bb2f18f08997432d426a243338:

  Linux 4.17-rc1 (2018-04-15 18:24:20 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux.git tags/at91-ab-4.18-dt

for you to fetch changes up to a642693882ce417683012a211ca9d6e65bae1dc4:

  ARM: dts: at91-sama5d2_xplained: Use IRQ_TYPE specifier (2018-05-14 15:29:52 +0200)


AT91 DT for 4.18:

 - small DT improvements without functional changes


Dmitry Torokhov (1):
  ARM: dts: at91: sama5d4ek: use canonical compatible for touchscreen

Hernán Gonzalez (1):
  ARM: dts: at91-sama5d2_xplained: Use IRQ_TYPE specifier

 arch/arm/boot/dts/at91-sama5d2_xplained.dts | 2 +-
 arch/arm/boot/dts/at91-sama5d4ek.dts        | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

-- 
Alexandre Belloni, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

