Re: [V9fs-developer] [PATCH 02/13] 9p: Tell the VFS that readpage was synchronous

2020-09-17 Thread Dominique Martinet
Matthew Wilcox (Oracle) wrote on Thu, Sep 17, 2020:
> diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
> index cce9ace651a2..506ca0ba2ec7 100644
> --- a/fs/9p/vfs_addr.c
> +++ b/fs/9p/vfs_addr.c
> @@ -280,6 +280,10 @@ static int v9fs_write_begin(struct file *filp, struct 
> address_space *mapping,
>   goto out;
>  
>   retval = v9fs_fid_readpage(v9inode->writeback_fid, page);
> + if (retval == AOP_UPDATED_PAGE) {
> + retval = 0;
> + goto out;
> + }

FWIW this is a change of behaviour; for some reason the code used to
loop back to grab_cache_page_write_begin() and bail out on
PageUptodate() I suppose; some sort of race check?
The whole pattern is a bit weird to me and 9p has no guarantee on
concurrent writes to a file with cache enabled (except that it will
corrupt something), so this part is fine with me.

What I'm curious about is that the page used to be both unlocked and put,
but now it is neither, and the return value hasn't changed in a way that
lets the caller of write_begin tell the difference; I don't see any code
change in the vfs to handle that.
What did I miss?


(FWIW at least cifs in the series has the same pattern change; didn't
check all of them)


Thanks,
-- 
Dominique


Re: [v4] mm: khugepaged: avoid overriding min_free_kbytes set by user

2020-09-17 Thread Michal Hocko
On Thu 17-09-20 11:16:55, Vijay Balakrishna wrote:
> 
> 
> On 9/17/2020 10:52 AM, Michal Hocko wrote:
> > On Thu 17-09-20 10:27:16, Vijay Balakrishna wrote:
> > > 
> > > 
> > > On 9/17/2020 2:28 AM, Michal Hocko wrote:
> > > > On Wed 16-09-20 23:39:39, Vijay Balakrishna wrote:
> > > > > set_recommended_min_free_kbytes need to honor min_free_kbytes set by 
> > > > > the
> > > > > user.  Post start-of-day THP enable or memory hotplug operations can
> > > > > lose user specified min_free_kbytes, in particular when it is higher 
> > > > > than
> > > > > calculated recommended value.
> > > > 
> > > > I was about to recommend a more detailed explanation when I have
> > > > realized that this patch is not really needed after all. Unless I am
> > > > missing something.
> > > > 
> > > > init_per_zone_wmark_min ignores the newly calculated min_free_kbytes if
> > > > it is lower than user_min_free_kbytes. So calculated min_free_kbytes >=
> > > > user_min_free_kbytes.
> > > > 
> > > > Except for value clamping when the value is reduced and this likely
> > > > needs fixing. But set_recommended_min_free_kbytes should be fine.
> > > > 
> > > 
> > > IIUC, after start-of-day if a user performs
> > > - THP disable
> > > > - modifies min_free_kbytes
> > > - THP enable
> > > above sequence currently wouldn't result in calling 
> > > init_per_zone_wmark_min.
> > 
> > It will not, but why do you think this matters? All we should care about
> > is that auto-tuning shouldn't reduce user provided value [1] and that
> > the memory hotplug should be consistent with the boot time heuristic.
> > init_per_zone_wmark_min should make sure that the user value is not
> > reduced and thp heuristic makes sure it will not reduce this value.
> > So the property should be transitive with the existing code (modulo the
> > problem I have highlighted).
> > 
> > [1] one could argue that it shouldn't even increase the value strictly
> > speaking because an admin might have a very good reason to decrease the
> > value but this has never been the semantic and changing it now might be
> > problematic
> > 
> 
> I made an attempt to address Kirill A. Shutemov's comment.

This is for Kirill to comment on but my take would be that memory
hotplug really has to alter the user defined min_free_kbytes because it
is manipulating the amount of memory. There are usecases which are
adding a lot of memory.

We are trying to not decrease the value which is arguably a weird semantic
but this is what we've been doing for years. We would need to hear a
specific usecase where this matters (e.g. memory hotremove heavy
workload with manually tuned min_free_kbytes) that misbehaves.
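
For readers following along, the "never decrease the user value" rule can
be modelled in userspace roughly as below. This is only a sketch under
assumptions: the model_* names are hypothetical, the formula is the
sqrt(lowmem_kbytes * 16) heuristic documented next to
init_per_zone_wmark_min, and the upper and lower clamping done by the
kernel is deliberately left out.

```c
#include <stdio.h>

/* Floor integer square root; a userspace stand-in for the kernel's int_sqrt(). */
static unsigned long long model_int_sqrt(unsigned long long x)
{
	unsigned long long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/* Boot-time heuristic: min_free_kbytes = sqrt(lowmem_kbytes * 16). */
static unsigned long long model_calc_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	return model_int_sqrt(lowmem_kbytes * 16);
}

/*
 * Hypothetical retune step: adopt the recalculated value only when it is
 * larger than the value the user asked for, otherwise keep the current
 * setting. user_kbytes < 0 means "never set by the user".
 */
static long long model_retune(long long current_kbytes, long long user_kbytes,
			      long long calculated_kbytes)
{
	if (calculated_kbytes > user_kbytes)
		return calculated_kbytes;
	return current_kbytes;
}
```

With this model, model_retune(22528, 22528, 8703) keeps 22528, while a
system where the user never touched the sysctl simply takes the
recalculated value.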

> And increased
> min_free_kbytes to see the issue in my testing and attempted a fix.  I'm ok
> leaving as it is.  Do not want to introduce any changes that may cause
> regression.

I would recommend reposting the patch which adds heuristic for THP (if
THP is enabled) into the hotplug path, arguing with the consistency and
surprising results when adding memory decreases the value. Your initial
problem is in sizing as mentioned in other email thread and you should
be investigating more but this inconsistency might really come as a
surprise.

All that if Kirill is reconsidering his initial position of course.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Tony Lindgren
* Christoph Hellwig  [200917 17:37]:
> Switch the omap1510 platform ohci device to use dma_direct_set_offset
> to set the DMA offset instead of using direct hooks into the DMA
> mapping code and remove the now unused hooks.

Looks nice to me :) I probably still can't test this for a few more weeks,
but hopefully Aaro or Janusz (added to Cc) can test it.

Regards,

Tony
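
As a rough illustration of what a fixed DMA offset means for the lbus
case, here is a userspace sketch of the address arithmetic. It does not
use the kernel API; the model_* names, the 32 MiB range size and the
PHYS_OFFSET value of 0x10000000 are assumptions made for the example,
and 0x30000000 is taken to be the local bus offset.

```c
#include <stdint.h>

#define MODEL_PHYS_OFFSET		UINT32_C(0x10000000)	/* assumed CPU view of RAM start */
#define MODEL_OMAP1510_LB_OFFSET	UINT32_C(0x30000000)	/* assumed local bus view of RAM start */

/* One linear mapping between CPU physical addresses and bus addresses. */
struct model_dma_range {
	uint32_t cpu_start;
	uint32_t dma_start;
	uint32_t size;
};

/* Translate a CPU physical address to a bus address; -1 if outside the range. */
static int model_phys_to_dma(const struct model_dma_range *r, uint32_t paddr,
			     uint32_t *dma)
{
	if (paddr < r->cpu_start || paddr - r->cpu_start >= r->size)
		return -1;
	*dma = paddr - r->cpu_start + r->dma_start;
	return 0;
}

/* Inverse translation, bus address back to CPU physical address. */
static int model_dma_to_phys(const struct model_dma_range *r, uint32_t dma,
			     uint32_t *paddr)
{
	if (dma < r->dma_start || dma - r->dma_start >= r->size)
		return -1;
	*paddr = dma - r->dma_start + r->cpu_start;
	return 0;
}
```

The point of the per-device range is that only the device that needs the
translation carries it, instead of every mapping call checking the
device name.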

> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm/include/asm/dma-direct.h | 18 -
>  arch/arm/mach-omap1/include/mach/memory.h | 31 ---
>  arch/arm/mach-omap1/usb.c | 22 
>  3 files changed, 22 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-direct.h 
> b/arch/arm/include/asm/dma-direct.h
> index 436544aeb83405..77fcb7ee5ec907 100644
> --- a/arch/arm/include/asm/dma-direct.h
> +++ b/arch/arm/include/asm/dma-direct.h
> @@ -9,7 +9,6 @@
>   * functions used internally by the DMA-mapping API to provide DMA
>   * addresses. They must not be used by drivers.
>   */
> -#ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
>   if (dev && dev->dma_range_map)
> @@ -34,23 +33,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
> void *addr)
>   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
>  }
>  
> -#else
> -static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> -{
> - return __arch_pfn_to_dma(dev, pfn);
> -}
> -
> -static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
> -{
> - return __arch_dma_to_pfn(dev, addr);
> -}
> -
> -static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
> -{
> - return __arch_virt_to_dma(dev, addr);
> -}
> -#endif
> -
>  static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
>  {
>   unsigned int offset = paddr & ~PAGE_MASK;
> diff --git a/arch/arm/mach-omap1/include/mach/memory.h 
> b/arch/arm/mach-omap1/include/mach/memory.h
> index 1142560e0078f5..36bccb6ab8 100644
> --- a/arch/arm/mach-omap1/include/mach/memory.h
> +++ b/arch/arm/mach-omap1/include/mach/memory.h
> @@ -14,42 +14,11 @@
>   * OMAP-1510 bus address is translated into a Local Bus address if the
>   * OMAP bus type is lbus. We do the address translation based on the
>   * device overriding the defaults used in the dma-mapping API.
> - * Note that the is_lbus_device() test is not very efficient on 1510
> - * because of the strncmp().
>   */
> -#if defined(CONFIG_ARCH_OMAP15XX) && !defined(__ASSEMBLER__)
>  
>  /*
>   * OMAP-1510 Local Bus address offset
>   */
>  #define OMAP1510_LB_OFFSET   UL(0x30000000)
>  
> -#define virt_to_lbus(x)  ((x) - PAGE_OFFSET + OMAP1510_LB_OFFSET)
> -#define lbus_to_virt(x)  ((x) - OMAP1510_LB_OFFSET + PAGE_OFFSET)
> -#define is_lbus_device(dev)  (cpu_is_omap15xx() && dev && 
> (strncmp(dev_name(dev), "ohci", 4) == 0))
> -
> -#define __arch_pfn_to_dma(dev, pfn)  \
> - ({ dma_addr_t __dma = __pfn_to_phys(pfn); \
> -if (is_lbus_device(dev)) \
> - __dma = __dma - PHYS_OFFSET + OMAP1510_LB_OFFSET; \
> -__dma; })
> -
> -#define __arch_dma_to_pfn(dev, addr) \
> - ({ dma_addr_t __dma = addr; \
> -if (is_lbus_device(dev)) \
> - __dma += PHYS_OFFSET - OMAP1510_LB_OFFSET;  \
> -__phys_to_pfn(__dma);\
> - })
> -
> -#define __arch_dma_to_virt(dev, addr)({ (void *) 
> (is_lbus_device(dev) ? \
> - lbus_to_virt(addr) : \
> - __phys_to_virt(addr)); })
> -
> -#define __arch_virt_to_dma(dev, addr)({ unsigned long __addr = 
> (unsigned long)(addr); \
> -(dma_addr_t) (is_lbus_device(dev) ? \
> - virt_to_lbus(__addr) : \
> - __virt_to_phys(__addr)); })
> -
> -#endif   /* CONFIG_ARCH_OMAP15XX */
> -
>  #endif
> diff --git a/arch/arm/mach-omap1/usb.c b/arch/arm/mach-omap1/usb.c
> index d8e9bbda8f7bdd..ba8566204ea9f4 100644
> --- a/arch/arm/mach-omap1/usb.c
> +++ b/arch/arm/mach-omap1/usb.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -542,6 +543,25 @@ static u32 __init omap1_usb2_init(unsigned nwires, 
> unsigned alt_pingroup)
>  /* ULPD_APLL_CTRL */
>  #define APLL_NDPLL_SWITCH	(1 << 0)
>  
> +static int omap_1510_usb_ohci_notifier(struct notifier_block *nb,
> + unsigned long event, void *data)
> +{
> + struct device *dev = data;
> +
> + if (event != BUS_NOTIFY_ADD_DEVICE)
> + return NOTIFY_DONE;
> +
> + if (strncmp(dev_name(dev), "ohci", 4) == 0 &&
> + dma_direct_set_offset(dev, PHYS_OFFSET, OMAP1510_LB_OFFSET,
> + (u64)-1))
> 

Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 17, 2020, at 6:47 PM, Chao Yu  wrote:
> 
> On 2020/9/18 3:34, Nick Terrell wrote:
>>> On Sep 17, 2020, at 11:00 AM, Nick Terrell  wrote:
>>> 
>>> 
>>> 
 On Sep 16, 2020, at 11:31 PM, Chao Yu  wrote:
 
 Hi Nick,
 
 On 2020/9/17 2:39, Nick Terrell wrote:
>> On Sep 15, 2020, at 11:31 PM, Chao Yu  wrote:
>> 
>> Hi Nick,
>> 
>> remove not related mailing list.
>> 
>> On 2020/9/16 11:43, Nick Terrell wrote:
>>> From: Nick Terrell 
>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>>> code is more efficient because it uses the single-pass API instead of
>>> the streaming API. The streaming API is not necessary because the whole
>>> input and output buffers are available. This saves memory because we
>>> don't need to allocate a buffer for the window. It is also more
>>> efficient because it saves unnecessary memcpy calls.
>>> I've had problems testing this code because I see data truncation before
>>> and after this patchset. Help testing this patch would be much
>>> appreciated.
>> 
>> Can you please explain more about data truncation? I'm a little 
>> confused...
>> 
>> Do you mean that f2fs doesn't allocate enough memory for zstd 
>> compression,
>> so that compression is not finished actually, the compressed data is 
>> truncated
>> at dst buffer?
> Hi Chao,
> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It 
> is possible
> that the script I’m using is buggy or is exposing an edge case in F2FS. 
> The files
> that I copy to F2FS and compress end up truncated with a hole at the end.
 
 Thanks for your explanation. :)
 
> It is based off of upstream commit ab29a807a7.
> E.g. the end of the copied file looks like this, but the original file 
> has non-zero data
> In the end. Until the hole at the end the file is correct.
> od dickens | tail -n 5
>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412
>> 46670000 000000 000000 000000 000000 000000 000000 000000 000000
>> *
>> 46703060 000000 000000 000000 000000 000000 000000 000000
>> 46703076
> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4
 
 Shouldn't we just get sha1 value by filtering sha1sum output?
 
   asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}`
   bsha=`sha1sum $MP/$i/$file |awk {'print $1'}`
>>> 
>>> Probably, but it was just a quick one-off script.
>> Ah, never mind, you are right.
 I can't reproduce this issue by using simple data sample, could you share
 that 'dickens' file or other smaller-sized sample if you have?
>>> 
>>> The /tmp/silesia directory in the example is populated with all the files 
>>> from
>>> this website. It is a popular data compression benchmark corpus. You can
>>> click on the “total” link to download a zip archive of all the files.
>>> 
>>> http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>>>  
>>> -Nick
>> I’ve spent some time minimizing the test case. This script [0] is the 
>> minimized
>> test case that doesn’t require any input files, it builds its own.
>> Several observations:
>> * The input file needs to be 7700481 bytes large, smaller files don’t 
>> trigger the bug.
>> * You have to `chattr +c` the file after copying it otherwise the bug 
>> doesn’t occur.
>> * After `chattr +c` you have to unmount and remount the filesystem to 
>> trigger the bug.
>> I’ve reproduced on v5.9-rc5 (856deb866d16e). I’ve also reproduced on my host 
>> machine
>> running 5.8.5-arch1-1.
>> [0] https://gist.github.com/terrelln/4bba325abdfa3a6f014e9911ac92a185
> 
> Ah, I got it.
> 
> Step of enabling compressed inode is not correct, we should touch an empty 
> file, and
> then use 'chattr +c' on that file to enable compression, otherwise the race 
> condition
> could be complicated to handle. So we need below diff to disallow setting 
> compression
> flag on a non-empty file:

Yup, that did the trick. After that change I was able to successfully test 
F2FS. I found
a bug in my compatibility wrappers, so I’m going to be sending a V2 that fixes 
it.

I’ll include these numbers in my next commit message, but with these changes 
F2FS
decompression memory usage drops from 1.4 MB to 160 KB. Decompression speeds
up 20% in total from the entire series, and compression speeds up 8%.

Thanks for the help debugging,
Nick
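
The rule Chao describes (the compression flag may only be turned on for
an empty file) can be sketched in userspace as follows. This is an
illustrative model only, not the f2fs change itself; model_inode,
model_setflags and the MODEL_COMPR_FL bit value are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_COMPR_FL	0x00000004u	/* placeholder bit for "compress this file" */

/* Minimal stand-in for an inode: just a size and a flag word. */
struct model_inode {
	uint64_t size;
	uint32_t flags;
};

/*
 * Apply the bits of iflags selected by mask. Newly enabling compression
 * is refused with -EINVAL unless the file is still empty.
 */
static int model_setflags(struct model_inode *inode, uint32_t iflags,
			  uint32_t mask)
{
	if ((iflags & mask & MODEL_COMPR_FL) &&
	    !(inode->flags & MODEL_COMPR_FL)) {
		if (inode->size != 0)
			return -EINVAL;
	}

	inode->flags = (inode->flags & ~mask) | (iflags & mask);
	return 0;
}
```

In this model, the touch + chattr +c sequence succeeds, while chattr +c
on a file that already holds data fails and leaves the flags unchanged.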

> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 8a422400e824..b462db7898fd 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1836,6 +1836,8 @@ static int f2fs_setflags_common(struct inode *inode, 
> u32 iflags, u32 mask)
>   if (iflags & F2FS_COMPR_FL) {
>

Re: [PATCH v4 0/7] HWpoison: further fixes and cleanups

2020-09-17 Thread osalvador

On 2020-09-17 17:27, HORIGUCHI NAOYA wrote:
Sorry, I modified the patches based on a different assumption from yours.

I first thought of taking the page off after confirming that the error page
is freed back to buddy. This approach leaves the possibility of reusing
the error page (which is acceptable), but is the simpler and less invasive
one.

Your approach removes the error page from the page allocator's control at
freeing time. It has no possibility of reusing the error page, but the
changes are tightly coupled with the page free code.

This is a tradeoff between complexity and completeness of soft offline.
Now I'm not sure I could persist in my own opinion without providing
working code, and it's OK for me to take your one.


Yeah, you are right, it is a trade-off.
I would suggest taking this path now, and if it proves to be problematic
in some way, we can always do the:

free_page
 take_it_off_buddy
  OK: mark it as hwpoison and increment refcount
  NOT_OK (raced with allocation): oops, sorry
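
The fallback flow above can be written down as a tiny userspace model.
This is only a sketch of the control flow; struct model_page and the
model_* helpers are hypothetical and stand in for the real page and
buddy allocator state.

```c
#include <stdbool.h>

/* Just enough page state to express the flow. */
struct model_page {
	bool in_buddy;	/* page currently sits on a free list */
	bool hwpoison;	/* page has been marked as poisoned */
	int refcount;
};

/* Try to pull a free page out of the allocator; fails if it is no longer free. */
static bool model_take_off_buddy(struct model_page *p)
{
	if (!p->in_buddy)
		return false;
	p->in_buddy = false;
	return true;
}

/*
 * After the page has been freed: take it off buddy, then mark it
 * hwpoison and pin it. Returns false when we raced with an allocation.
 */
static bool model_poison_after_free(struct model_page *p)
{
	if (!model_take_off_buddy(p))
		return false;	/* NOT_OK: somebody allocated it first */

	p->hwpoison = true;	/* OK: mark it as hwpoison ... */
	p->refcount++;		/* ... and increment the refcount */
	return true;
}
```

The NOT_OK branch is where the trade-off shows: in this variant the
error page can be reused if the allocation wins the race.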


The test passed in my environment, so this is fine.


Thanks for trying it out.



If they do, I will try to see if Andrew can squeeze the above changes into [1],
where they belong.


Yes, proposing the fix for
mmhwpoison-rework-soft-offline-for-in-use-pages.patch seems fine to me.

Again, sorry for modifying code without asking.


No worries, I will do a couple of tests on my own and then I will talk to
Andrew to see if we can squeeze the changes in there.





Re: [PATCH v2] dt-bindings: mfd: rohm,bd71837-pmic: Add common properties

2020-09-17 Thread Vaittinen, Matti
Hi d Ho peeps!

On Thu, 2020-09-17 at 21:37 +0200, Krzysztof Kozlowski wrote:
> Add common properties appearing in DTSes (clock-names,
> clock-output-names) with the common values (actually used in DTSes)
> to
> fix dtbs_check warnings like:
> 
>   arch/arm64/boot/dts/freescale/imx8mq-librem5-r2.dt.yaml:
> pmic@4b: 'clock-names', 'clock-output-names', do not match any of
> the regexes: 'pinctrl-[0-9]+'
> 
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---
> 
> Changes since v1:
> 1. Define the names, as used in existing DTS files.
> ---
>  .../devicetree/bindings/mfd/rohm,bd71837-pmic.yaml  | 6
> ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/mfd/rohm,bd71837-
> pmic.yaml b/Documentation/devicetree/bindings/mfd/rohm,bd71837-
> pmic.yaml
> index 65018a019e1d..3bfdd33702ad 100644
> --- a/Documentation/devicetree/bindings/mfd/rohm,bd71837-pmic.yaml
> +++ b/Documentation/devicetree/bindings/mfd/rohm,bd71837-pmic.yaml
> @@ -32,9 +32,15 @@ properties:
>clocks:
>  maxItems: 1
>  
> +  clock-names:
> +const: osc

I guess existing board dtses use "osc" then? Ok.

>"#clock-cells":
>  const: 0
>  
> +  clock-output-names:
> +const: pmic_clk

This is not a strong opinion but I feel that pmic_clk is a bit too
generic a name? I mean, what if there is a system with more than one
PMIC? (I don't see such use-case with the BD718x7 though - but perhaps
this can serve as a misleading example for other PMICs? For example
with the ROHM BD96801 family there may be multiple PMICs in one
system). Anyways - if Rob is happy with this then please go with it :)

Acked-By: Matti Vaittinen 
Thanks again for improving these bindings! I am constantly struggling
with these x_x. Writing the bindings is probably hardest part of PMIC
driver development -_-;




[PATCH v2] dm: Call proper helper to determine dax support

2020-09-17 Thread Dan Williams
From: Jan Kara 

DM was calling generic_fsdax_supported() to determine whether a device
referenced in the DM table supports DAX. However this is a helper for "leaf"
device drivers so that they don't have to duplicate common generic checks.
High level code should call the dax_supported() helper, which calls into
the appropriate helper for the particular device. This problem manifested
itself as kernel messages:

dm-3: error: dax access failed (-95)

when lvm2-testsuite was run in cases where a DM device was stacked on top of
another DM device.

Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
Cc: 
Tested-by: Adrian Huang 
Signed-off-by: Jan Kara 
Acked-by: Mike Snitzer 
Signed-off-by: Dan Williams 
---
Changes since v1 [1]:
- Add missing dax_read_lock() around dax_supported()

[1]: http://lore.kernel.org/r/20200916151445.450-1-j...@suse.cz

 drivers/dax/super.c   |4 
 drivers/md/dm-table.c |   10 +++---
 include/linux/dax.h   |   11 +--
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..b6284c5cae0a 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
 bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
int blocksize, sector_t start, sector_t len)
 {
+   if (!dax_dev)
+   return false;
+
if (!dax_alive(dax_dev))
return false;
 
return dax_dev->ops->dax_supported(dax_dev, bdev, blocksize, start, 
len);
 }
+EXPORT_SYMBOL_GPL(dax_supported);
 
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void 
*addr,
size_t bytes, struct iov_iter *i)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5edc3079e7c1..229f461e7def 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -860,10 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
 int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
 {
-   int blocksize = *(int *) data;
+   int blocksize = *(int *) data, id;
+   bool rc;
 
-   return generic_fsdax_supported(dev->dax_dev, dev->bdev, blocksize,
-  start, len);
+   id = dax_read_lock();
+   rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
+   dax_read_unlock(id);
+
+   return rc;
 }
 
 /* Check devices support synchronous DAX */
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 6904d4e0b2e0..9f916326814a 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -130,6 +130,8 @@ static inline bool generic_fsdax_supported(struct 
dax_device *dax_dev,
return __generic_fsdax_supported(dax_dev, bdev, blocksize, start,
sectors);
 }
+bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
+   int blocksize, sector_t start, sector_t len);
 
 static inline void fs_put_dax(struct dax_device *dax_dev)
 {
@@ -157,6 +159,13 @@ static inline bool generic_fsdax_supported(struct 
dax_device *dax_dev,
return false;
 }
 
+static inline bool dax_supported(struct dax_device *dax_dev,
+   struct block_device *bdev, int blocksize, sector_t start,
+   sector_t len)
+{
+   return false;
+}
+
 static inline void fs_put_dax(struct dax_device *dax_dev)
 {
 }
@@ -195,8 +204,6 @@ bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long 
nr_pages,
void **kaddr, pfn_t *pfn);
-bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
-   int blocksize, sector_t start, sector_t len);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void 
*addr,
size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
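
The difference between the "leaf" check and the top-level helper can be
illustrated with a small userspace model of ops-table dispatch. This is
a sketch under assumptions, not the kernel code: all model_* names are
hypothetical, the leaf check is reduced to a block size comparison, and
the dax_read_lock()/dax_read_unlock() pairing is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_dax_device;

/* Per-driver operations; each driver supplies its own support check. */
struct model_dax_ops {
	bool (*dax_supported)(struct model_dax_device *dev, int blocksize);
};

struct model_dax_device {
	bool alive;
	const struct model_dax_ops *ops;
	struct model_dax_device *lower;	/* underlying device when stacked */
};

/* Top-level helper: validate the device, then dispatch to its driver. */
static bool model_dax_supported(struct model_dax_device *dev, int blocksize)
{
	if (!dev)
		return false;
	if (!dev->alive)
		return false;
	return dev->ops->dax_supported(dev, blocksize);
}

/* Leaf driver: performs the generic checks directly (reduced to one here). */
static bool model_leaf_supported(struct model_dax_device *dev, int blocksize)
{
	(void)dev;
	return blocksize == 4096;
}

/* Stacked driver: asks the device below it through the top-level helper. */
static bool model_stacked_supported(struct model_dax_device *dev, int blocksize)
{
	return model_dax_supported(dev->lower, blocksize);
}

static const struct model_dax_ops model_leaf_ops = {
	.dax_supported = model_leaf_supported,
};

static const struct model_dax_ops model_stacked_ops = {
	.dax_supported = model_stacked_supported,
};
```

Calling the leaf check directly on a stacked device would skip the
stacked driver's own logic, which is the kind of mismatch the patch
removes by going through dax_supported().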



[PATCH] Documentation/admin-guide: kernel-parameters: fix "disable_ddw" wording

2020-09-17 Thread Randy Dunlap
Drop an extraneous word (if) in a sentence.

Signed-off-by: Randy Dunlap 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200917.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-next-20200917/Documentation/admin-guide/kernel-parameters.txt
@@ -951,7 +951,7 @@
Arch Perfmon v4 (Skylake and newer).
 
disable_ddw [PPC/PSERIES]
-   Disable Dynamic DMA Window support. Use this if
+   Disable Dynamic DMA Window support. Use this
to workaround buggy firmware.
 
disable_ipv6=   [IPV6]


[PATCH] Documentation/admin-guide: kernel-parameters: fix "io7" parameter description

2020-09-17 Thread Randy Dunlap
Fix punctuation and capitalization for the "io7" boot parameter.

Signed-off-by: Randy Dunlap 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200917.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-next-20200917/Documentation/admin-guide/kernel-parameters.txt
@@ -1973,7 +1973,7 @@
1 - Bypass the IOMMU for DMA.
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
 
-   io7=[HW] IO7 for Marvel based alpha systems
+   io7=[HW] IO7 for Marvel-based Alpha systems
See comment before marvel_specify_io7 in
arch/alpha/kernel/core_marvel.c.
 


[PATCH] Documentation/admin-guide: kernel-parameters: capitalize Korina

2020-09-17 Thread Randy Dunlap
Fix typo, capitalize Korina proper noun.

Signed-off-by: Randy Dunlap 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20200917.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-next-20200917/Documentation/admin-guide/kernel-parameters.txt
@@ -2194,7 +2194,7 @@
kgdbwait[KGDB] Stop kernel execution and enter the
kernel debugger at the earliest opportunity.
 
-   kmac=   [MIPS] korina ethernet MAC address.
+   kmac=   [MIPS] Korina ethernet MAC address.
Configure the RouterBoard 532 series on-chip
Ethernet adapter MAC address.
 


[PATCH] Documentation: admin-guide: kernel-parameters: reformat "lapic=" boot option

2020-09-17 Thread Randy Dunlap
Reformat "lapic=" to try to make it more understandable and similar
to the style that is mostly used in this file.

Signed-off-by: Randy Dunlap 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-next-20200917.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-next-20200917/Documentation/admin-guide/kernel-parameters.txt
@@ -2384,9 +2384,10 @@
lapic   [X86-32,APIC] Enable the local APIC even if BIOS
disabled it.
 
-   lapic=  [X86,APIC] "notscdeadline" Do not use TSC deadline
+   lapic=  [X86,APIC] Do not use TSC deadline
value for LAPIC timer one-shot implementation. Default
back to the programmable timer unit in the LAPIC.
+   Format: notscdeadline
 
lapic_timer_c2_ok   [X86,APIC] trust the local apic timer
in C2 power state.


Re: [PATCH v1 4/6] wdt: Support wdt on ROHM BD9576MUF and BD9573MUF

2020-09-17 Thread Guenter Roeck
On 9/17/20 1:03 AM, Matti Vaittinen wrote:
> Add Watchdog support for ROHM BD9576MUF and BD9573MUF PMICs which are
> mainly used to power the R-Car series processors. The watchdog is
> pinged using a GPIO and enabled using another GPIO. Additionally
> watchdog time-out can be configured to HW prior to starting the watchdog.
> Watchdog timeout can be configured to detect only delayed ping or in
> a window mode where also too fast pings are detected.
> 
> Signed-off-by: Matti Vaittinen 
> ---
>  drivers/watchdog/Kconfig  |  13 ++
>  drivers/watchdog/Makefile |   1 +
>  drivers/watchdog/bd9576_wdt.c | 295 ++
>  3 files changed, 309 insertions(+)
>  create mode 100644 drivers/watchdog/bd9576_wdt.c
> 
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index ab7aad5a1e69..d042a517a946 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -172,6 +172,19 @@ config BD70528_WATCHDOG
> Alternatively say M to compile the driver as a module,
> which will be called bd70528_wdt.
>  
> +config BD957XMUF_WATCHDOG
> + tristate "ROHM BD9576MUF and BD9573MUF PMIC Watchdog"
> + depends on MFD_ROHM_BD957XMUF
> + select WATCHDOG_CORE
> + help
> +   Support for the watchdog in the ROHM BD9576 and BD9573 PMICs.
> +   These PMIC ICs contain watchdog block which can be configured
> +   to toggle reset line if SoC fails to ping watchdog via GPIO.
> +
> +   Say Y here to include support for the ROHM BD9576 or BD9573
> +   watchdog. Alternatively say M to compile the driver as a module,
> +   which will be called bd9576_wdt.
> +
>  config DA9052_WATCHDOG
>   tristate "Dialog DA9052 Watchdog"
>   depends on PMIC_DA9052 || COMPILE_TEST
> diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
> index 97bed1d3d97c..14d75f98e3df 100644
> --- a/drivers/watchdog/Makefile
> +++ b/drivers/watchdog/Makefile
> @@ -208,6 +208,7 @@ obj-$(CONFIG_XEN_WDT) += xen_wdt.o
>  
>  # Architecture Independent
>  obj-$(CONFIG_BD70528_WATCHDOG) += bd70528_wdt.o
> +obj-$(CONFIG_BD957XMUF_WATCHDOG) += bd9576_wdt.o
>  obj-$(CONFIG_DA9052_WATCHDOG) += da9052_wdt.o
>  obj-$(CONFIG_DA9055_WATCHDOG) += da9055_wdt.o
>  obj-$(CONFIG_DA9062_WATCHDOG) += da9062_wdt.o
> diff --git a/drivers/watchdog/bd9576_wdt.c b/drivers/watchdog/bd9576_wdt.c
> new file mode 100644
> index 000000000000..917c8c7ddeb1
> --- /dev/null
> +++ b/drivers/watchdog/bd9576_wdt.c
> @@ -0,0 +1,295 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright (C) 2020 ROHM Semiconductors
> + *
> + * ROHM BD9576MUF and BD9573MUF Watchdog driver
> + */
> +
> +#include 
> +#include 

Alphabetic include file order please.

> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static bool nowayout;
> +module_param(nowayout, bool, 0);
> +MODULE_PARM_DESC(nowayout,
> + "Watchdog cannot be stopped once started (default=\"false\")");
> +
> +#define HW_MARGIN_MIN 2
> +#define HW_MARGIN_MAX 4416
> +#define BD957X_WDT_DEFAULT_MARGIN 4416
> +
> +struct bd9576_wdt_priv {
> + struct gpio_desc*gpiod_ping;
> + struct gpio_desc*gpiod_en;
> + struct device   *dev;
> + struct regmap   *regmap;
> + boolalways_running;
> + struct watchdog_device  wdd;
> +};
> +
> +static void bd9576_wdt_disable(struct bd9576_wdt_priv *priv)
> +{
> + gpiod_set_value_cansleep(priv->gpiod_en, 0);
> +}
> +
> +static int bd9576_wdt_ping(struct watchdog_device *wdd)
> +{
> + struct bd9576_wdt_priv *priv = watchdog_get_drvdata(wdd);
> +
> + /* Pulse */
> + gpiod_set_value_cansleep(priv->gpiod_ping, 1);
> + gpiod_set_value_cansleep(priv->gpiod_ping, 0);
> +
> + return 0;
> +}
> +
> +static int bd9576_wdt_start(struct watchdog_device *wdd)
> +{
> + struct bd9576_wdt_priv *priv = watchdog_get_drvdata(wdd);
> +
> + gpiod_set_value_cansleep(priv->gpiod_en, 1);
> +
> + return bd9576_wdt_ping(wdd);
> +}
> +
> +static int bd9576_wdt_stop(struct watchdog_device *wdd)
> +{
> + struct bd9576_wdt_priv *priv = watchdog_get_drvdata(wdd);
> +
> + if (!priv->always_running)
> + bd9576_wdt_disable(priv);
> + else
> + set_bit(WDOG_HW_RUNNING, &wdd->status);
> +
> + return 0;
> +}
> +
> +static const struct watchdog_info bd957x_wdt_ident = {
> + .options= WDIOF_MAGICCLOSE | WDIOF_KEEPALIVEPING |
> +   WDIOF_SETTIMEOUT,
> + .identity   = "BD957x Watchdog",
> +};
> +
> +static const struct watchdog_ops bd957x_wdt_ops = {
> + .owner  = THIS_MODULE,
> + .start  = bd9576_wdt_start,
> + .stop   = bd9576_wdt_stop,
> + .ping   = bd9576_wdt_ping,
> +};
> +
> +/* Unit is hundreds of uS */
> +#define FASTNG_MIN 23
> +
> +static int find_closest_fast(int target, int *sel, int *val)
> +{
> + int i;
> + int window = 

Re: [[PATCH]] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-09-17 Thread Michal Hocko
On Thu 17-09-20 11:03:56, Vijay Balakrishna wrote:
[...]
> > > The auto tuned value is incorrect post hotplug memory operation, in our 
> > > use
> > > case memory hot add occurs very early during boot.
> > Define incorrect. What are the actual values? Have you tried to increase
> > the value manually after the hotplug?
> 
> In our case SoC with 8GB memory, system tuned min_free_kbytes
> - first to 22528
> - we perform memory hot add very early in boot

What was the original and after-the-hotplug size of memory and layout?
I suspect that all the hotplugged memory is in Movable zone, right?

> - now min_free_kbytes is 8703
> 
> Before looking at code, first I manually restored min_free_kbytes soon after
> boot, reran stress and didn't notice symptoms I mentioned in change log.

This is really surprising and I strongly suspect that an earlier reclaim
just changed the timing enough so that the workload has spread the memory
pressure over a longer time and that might have been enough to recycle
some of the unreclaimable memory due to its natural life time. But this
is pure speculation. Much more data would be needed to analyze this.

In any case your stress test is overprovisioning your Normal zone and
increased min_free_kbytes just papers over the sizing problem.
-- 
Michal Hocko
SUSE Labs


Re: linux-next: manual merge of the staging tree with the crypto tree

2020-09-17 Thread Herbert Xu
On Fri, Sep 18, 2020 at 03:21:27PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the staging tree got a conflict in:
> 
>   drivers/staging/rtl8192e/Kconfig
> 
> between commit:
> 
>   054694a46d64 ("staging/rtl8192e: switch to RC4 library interface")
> 
> from the crypto tree and commits:
> 
>   243d040a6e4a ("staging: rtl8192e: fix kconfig dependency warning for 
> RTLLIB_CRYPTO_TKIP")
>   02c4260713d6 ("staging: rtl8192e: fix kconfig dependency warning for 
> RTLLIB_CRYPTO_WEP")
> 
> from the staging tree.

Those two commits should just be dropped.
 
> diff --cc drivers/staging/rtl8192e/Kconfig
> index 4c440bdaaf6e,31e076cc6f16..000000000000
> --- a/drivers/staging/rtl8192e/Kconfig
> +++ b/drivers/staging/rtl8192e/Kconfig
> @@@ -25,7 -26,8 +26,8 @@@ config RTLLIB_CRYPTO_CCM
>   config RTLLIB_CRYPTO_TKIP
>   tristate "Support for rtllib TKIP crypto"
>   depends on RTLLIB
> + select CRYPTO
>  -select CRYPTO_ARC4
>  +select CRYPTO_LIB_ARC4

As the driver has been converted over to the lib arc4 API, it
does not need to select CRYPTO at all.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] objtool: ignore unreachable trap after call to noreturn functions

2020-09-17 Thread Ilie Halip
> The patch looks good to me.  Which versions of Clang do the trap after
> noreturn call?  It would be good to have that in the commit message.

I omitted this because it happens with all versions of clang that are
supported for building the kernel. clang-9 is the oldest version that
could build the mainline x86_64 kernel right now, and it has the same
behavior.

Should I send a v2 with this info?

I.H.


Re: [PATCH 3/3] hwmon: (lm75) Add regulator support

2020-09-17 Thread Guenter Roeck
On 9/17/20 3:18 AM, Alban Bedel wrote:
> Add regulator support for boards where the sensor first need to be
> powered up before it can be used.
> 
> Signed-off-by: Alban Bedel 
> ---
>  drivers/hwmon/lm75.c | 31 +--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
> index ba0be48aeadd..b673f8d2ef20 100644
> --- a/drivers/hwmon/lm75.c
> +++ b/drivers/hwmon/lm75.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "lm75.h"
>  
>  /*
> @@ -101,6 +102,7 @@ static const unsigned short normal_i2c[] = { 0x48, 0x49, 
> 0x4a, 0x4b, 0x4c,
>  struct lm75_data {
>   struct i2c_client   *client;
>   struct regmap   *regmap;
> + struct regulator*vs;
>   u8  orig_conf;
>   u8  current_conf;
>   u8  resolution; /* In bits, 9 to 16 */
> @@ -540,6 +542,8 @@ static void lm75_remove(void *data)
>   struct i2c_client *client = lm75->client;
>  
>   i2c_smbus_write_byte_data(client, LM75_REG_CONF, lm75->orig_conf);
> + if (lm75->vs)
> + regulator_disable(lm75->vs);
>  }
>  
>  static int
> @@ -567,6 +571,14 @@ lm75_probe(struct i2c_client *client, const struct 
> i2c_device_id *id)
>   data->client = client;
>   data->kind = kind;
>  
> + data->vs = devm_regulator_get_optional(dev, "vs");

Looking into the regulator API, it may be better if you use
devm_regulator_get(). AFAICS it returns a dummy regulator if there is
none, and NULL if the regulator subsystem is disabled. So

	data->vs = devm_regulator_get(dev, "vs");
	if (IS_ERR(data->vs))
		return PTR_ERR(data->vs);

should work and would be less messy.

> + if (IS_ERR(data->vs)) {
> + if (PTR_ERR(data->vs) == -ENODEV)
> + data->vs = NULL;
> + else
> + return PTR_ERR(data->vs);
> + }
> +
>   data->regmap = devm_regmap_init_i2c(client, _regmap_config);
>   if (IS_ERR(data->regmap))
>   return PTR_ERR(data->regmap);
> @@ -581,11 +593,21 @@ lm75_probe(struct i2c_client *client, const struct 
> i2c_device_id *id)
>   data->sample_time = data->params->default_sample_time;
>   data->resolution = data->params->default_resolution;
>  
> + /* Enable the power */
> + if (data->vs) {
> + err = regulator_enable(data->vs);
> + if (err) {
> + dev_err(dev, "failed to enable regulator: %d\n", err);
> + return err;
> + }
> + }
> +

How about device removal ? Don't you have to call regulator_disable()
there as well ? If so, it might be best to use devm_add_action_or_reset()
to register a disable function.
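
As a rough illustration of the devm_add_action_or_reset() idea, here is a
userspace model rather than kernel code. Every mock_* name is hypothetical
and only mimics the two behaviours that matter here: a registered action
runs on teardown, and it runs immediately if registration itself fails.
Under those assumptions, later probe failures need no explicit disable call.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_ACTIONS 4

/* Hypothetical stand-ins for a regulator and a device with managed actions. */
struct mock_regulator {
	int enable_count;
};

struct mock_action {
	void (*fn)(void *data);
	void *data;
};

struct mock_device {
	struct mock_action actions[MOCK_MAX_ACTIONS];
	size_t nr_actions;
};

/* Register a teardown action; if registration fails, run it right away. */
static int mock_add_action_or_reset(struct mock_device *dev,
				    void (*fn)(void *data), void *data)
{
	if (dev->nr_actions >= MOCK_MAX_ACTIONS) {
		fn(data);
		return -1;
	}
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Run all registered actions in reverse order (probe failure or removal). */
static void mock_release_all(struct mock_device *dev)
{
	while (dev->nr_actions > 0) {
		struct mock_action *a = &dev->actions[--dev->nr_actions];

		a->fn(a->data);
	}
}

/* The disable callback; an unbalanced disable would trip the assert. */
static void mock_regulator_disable(void *data)
{
	struct mock_regulator *reg = data;

	assert(reg->enable_count > 0);
	reg->enable_count--;
}

/* Probe model: enable, register the disable, then possibly fail later. */
static int mock_probe(struct mock_device *dev, struct mock_regulator *vs,
		      int later_err)
{
	int err;

	vs->enable_count++;	/* models a successful enable */
	err = mock_add_action_or_reset(dev, mock_regulator_disable, vs);
	if (err)
		return err;

	/* Any later failure just returns; no goto/unwind label is needed. */
	return later_err;
}
```

In this model a failing mock_probe() followed by mock_release_all() leaves
enable_count back at zero, which is the balance the review comment asks for.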

Thanks,
Guenter

>   /* Cache original configuration */
>   status = i2c_smbus_read_byte_data(client, LM75_REG_CONF);
>   if (status < 0) {
>   dev_dbg(dev, "Can't read config? %d\n", status);
> - return status;
> + err = status;
> + goto disable_regulator;
>   }
>   data->orig_conf = status;
>   data->current_conf = status;
> @@ -593,7 +615,7 @@ lm75_probe(struct i2c_client *client, const struct 
> i2c_device_id *id)
>   err = lm75_write_config(data, data->params->set_mask,
>   data->params->clr_mask);
>   if (err)
> - return err;
> + goto disable_regulator;
>  
>   err = devm_add_action_or_reset(dev, lm75_remove, data);
>   if (err)
> @@ -608,6 +630,11 @@ lm75_probe(struct i2c_client *client, const struct 
> i2c_device_id *id)
>   dev_info(dev, "%s: sensor '%s'\n", dev_name(hwmon_dev), client->name);
>  
>   return 0;
> +
> +disable_regulator:
> + if (data->vs)
> + regulator_disable(data->vs);
> + return err;
>  }
>  
>  static const struct i2c_device_id lm75_ids[] = {
> 



Re: [PATCH v12 3/9] x86: kdump: use macro CRASH_ADDR_LOW_MAX in functions reserve_crashkernel[_low]()

2020-09-17 Thread Dave Young
On 09/18/20 at 11:57am, chenzhou wrote:
> Hi Dave,
> 
> 
> On 2020/9/18 11:01, Dave Young wrote:
> > On 09/07/20 at 09:47pm, Chen Zhou wrote:
> >> To make the functions reserve_crashkernel[_low]() as generic,
> >> replace some hard-coded numbers with macro CRASH_ADDR_LOW_MAX.
> >>
> >> Signed-off-by: Chen Zhou 
> >> ---
> >>  arch/x86/kernel/setup.c | 11 ++-
> >>  1 file changed, 6 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> >> index d7fd90c52dae..71a6a6e7ca5b 100644
> >> --- a/arch/x86/kernel/setup.c
> >> +++ b/arch/x86/kernel/setup.c
> >> @@ -430,7 +430,7 @@ static int __init reserve_crashkernel_low(void)
> >>unsigned long total_low_mem;
> >>int ret;
> >>  
> >> -  total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
> >> +  total_low_mem = memblock_mem_size(CRASH_ADDR_LOW_MAX >> PAGE_SHIFT);
> > total_low_mem != CRASH_ADDR_LOW_MAX
> I just replace the magic number with macro, no other change.
> Besides, function memblock_mem_size(limit_pfn) will compute the memory size
> according to the actual system ram.
> 

OK, it is not obvious from the patch that this is 64-bit only; I'm fine
with this then.
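
The equivalence behind the replacement can be checked with a few lines of
arithmetic. This is a sketch under assumptions: the quoted hunk does not
show the definition of CRASH_ADDR_LOW_MAX, so the 4 GiB value and the
4 KiB page size (PAGE_SHIFT of 12) below are assumed, and the example_*
names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; neither definition appears in the quoted hunk. */
#define EXAMPLE_PAGE_SHIFT		12u
#define EXAMPLE_CRASH_ADDR_LOW_MAX	(UINT64_C(1) << 32)	/* 4 GiB */

/* Old expression from the hunk: 1UL << (32 - PAGE_SHIFT). */
static uint64_t example_old_limit_pfn(unsigned int page_shift)
{
	assert(page_shift <= 32);
	return UINT64_C(1) << (32 - page_shift);
}

/* New expression from the hunk: CRASH_ADDR_LOW_MAX >> PAGE_SHIFT. */
static uint64_t example_new_limit_pfn(uint64_t addr_max,
				      unsigned int page_shift)
{
	assert(page_shift < 64);
	return addr_max >> page_shift;
}
```

With the assumed 4 GiB limit both expressions give the same page frame
number. If the macro were smaller than 4 GiB the two would no longer agree,
which seems to be the concern behind the 64-bit remark.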



Re: [PATCH 2/2] dt-bindings: phy: cdns,torrent-phy: add reset-names

2020-09-17 Thread Vinod Koul
On 16-09-20, 15:47, Tomi Valkeinen wrote:
> Add reset-names as a required property.
> 
> There are no dts files using torrent phy yet, so it is safe to add a new
> required property.
> 
> Signed-off-by: Tomi Valkeinen 
> ---
>  .../devicetree/bindings/phy/phy-cadence-torrent.yaml | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/phy/phy-cadence-torrent.yaml 
> b/Documentation/devicetree/bindings/phy/phy-cadence-torrent.yaml
> index 4071438be2ba..12ce022e4764 100644
> --- a/Documentation/devicetree/bindings/phy/phy-cadence-torrent.yaml
> +++ b/Documentation/devicetree/bindings/phy/phy-cadence-torrent.yaml
> @@ -54,6 +54,10 @@ properties:
>Torrent PHY reset.
>See Documentation/devicetree/bindings/reset/reset.txt
>  
> +  reset-names:
> +items:
> +  - const: torrent_reset
> +
>  patternProperties:
>'^phy@[0-7]+$':
>  type: object
> @@ -111,6 +115,7 @@ required:
>- reg
>- reg-names
>- resets
> +  - reset-names

Update the example as well please.
>  
>  additionalProperties: false
>  
> -- 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

-- 
~Vinod


linux-next: manual merge of the staging tree with the crypto tree

2020-09-17 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the staging tree got a conflict in:

  drivers/staging/rtl8192e/Kconfig

between commit:

  054694a46d64 ("staging/rtl8192e: switch to RC4 library interface")

from the crypto tree and commits:

  243d040a6e4a ("staging: rtl8192e: fix kconfig dependency warning for 
RTLLIB_CRYPTO_TKIP")
  02c4260713d6 ("staging: rtl8192e: fix kconfig dependency warning for 
RTLLIB_CRYPTO_WEP")

from the staging tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/staging/rtl8192e/Kconfig
index 4c440bdaaf6e,31e076cc6f16..
--- a/drivers/staging/rtl8192e/Kconfig
+++ b/drivers/staging/rtl8192e/Kconfig
@@@ -25,7 -26,8 +26,8 @@@ config RTLLIB_CRYPTO_CCM
  config RTLLIB_CRYPTO_TKIP
tristate "Support for rtllib TKIP crypto"
depends on RTLLIB
+   select CRYPTO
 -  select CRYPTO_ARC4
 +  select CRYPTO_LIB_ARC4
select CRYPTO_MICHAEL_MIC
default y
help
@@@ -35,7 -37,8 +37,8 @@@
  
  config RTLLIB_CRYPTO_WEP
tristate "Support for rtllib WEP crypto"
+   select CRYPTO
 -  select CRYPTO_ARC4
 +  select CRYPTO_LIB_ARC4
depends on RTLLIB
default y
help



Re: [PATCH v3 00/13] PHY: Add support for multilink configurations in Cadence Torrent PHY driver

2020-09-17 Thread Vinod Koul
On 17-09-20, 09:30, Swapnil Jakhade wrote:
> Cadence Torrent PHY is a multiprotocol PHY supporting different multilink
> PHY configurations including DisplayPort, PCIe, USB, SGMII, QSGMII etc.
> This patch series extends functionality of Torrent PHY driver to support
> following configurations:
> - Single link PCIe configuration
> - PCIe + SGMII/QSGMII Unique SSC multilink configuration
> - Single link SGMII/QSGMII configuration
> - Single link USB configuration
> - PCIe + USB Unique SSC multilink configuration
> - USB + SGMII/QSGMII multilink configuration
> 
> The changes have been validated on TI J7200 platform.

Applied, thanks

-- 
~Vinod


Re: [PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h

2020-09-17 Thread Christoph Hellwig
On Thu, Sep 17, 2020 at 07:50:10PM +0100, Russell King - ARM Linux admin wrote:
> On Thu, Sep 17, 2020 at 07:32:28PM +0200, Christoph Hellwig wrote:
> > The DMA API removed support for not passing in a device a long time
> > ago, so remove the NULL checks.
> 
> What happens with ISA devices?

For actual drivers they've been switched to struct isa_driver, which
provides a struct device.  For some of the special cases like
arch/arm/kernel/dma-isa.c we now use static struct device instances.


[PATCH] Documentation: admin-guide: reformat "lapic=" boot option

2020-09-17 Thread Randy Dunlap
Reformat "lapic=" to try to make it more understandable and similar
to the style that is mostly used in this file.

Signed-off-by: Randy Dunlap 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
---
 Documentation/admin-guide/kernel-parameters.txt |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-next-20200917.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-next-20200917/Documentation/admin-guide/kernel-parameters.txt
@@ -2384,9 +2384,10 @@
lapic   [X86-32,APIC] Enable the local APIC even if BIOS
disabled it.
 
-   lapic=  [X86,APIC] "notscdeadline" Do not use TSC deadline
+   lapic=  [X86,APIC] Do not use TSC deadline
value for LAPIC timer one-shot implementation. Default
back to the programmable timer unit in the LAPIC.
+   Format: notscdeadline
 
lapic_timer_c2_ok   [X86,APIC] trust the local apic timer
in C2 power state.


[PATCHv3 0/1] Optimize ext4 file overwrites - perf improvement

2020-09-17 Thread Ritesh Harjani
Hello,

v2 -> v3
1. Switched to the approach suggested by Jan to make this general
for all file writes rather than only for DAX.
(So as of now both DAX & DIO should benefit from this as both use the same
iomap path. Note, though, that I only tested the performance improvement for DAX.)

Gave a run on xfstests with -g quick,dax and didn't observe any new
issues with this patch.

In case of file writes, currently we start a journal txn irrespective of whether
it's an overwrite or not. In case of an overwrite we don't need to start a
jbd2 txn since the blocks are already allocated.
So this patch optimizes away the txn start in case of file (DAX/DIO) overwrites.
This could significantly boost performance for multi-threaded writes,
especially random writes (overwrite).
The fio script used to collect the perf numbers is given below.
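
The decision described above can be sketched as a small userspace model.
This is not the ext4 code: choose_write_path() and the enum are hypothetical
names, and the `mapped` flag stands in for the result of a block lookup
done without a journal handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum write_path {
	PATH_MAP_ONLY,		/* return the existing mapping, no jbd2 txn */
	PATH_ALLOC_WITH_TXN,	/* start a txn and allocate as before */
};

/*
 * A write is treated as an overwrite only if it lies entirely within the
 * current file size and the lookup found the range already mapped.
 */
static enum write_path choose_write_path(int64_t offset, int64_t length,
					 int64_t i_size, bool mapped)
{
	assert(offset >= 0 && length > 0);

	if (offset + length <= i_size && mapped)
		return PATH_MAP_ONLY;
	return PATH_ALLOC_WITH_TXN;
}
```

A write that extends past the file size, or that hits a hole, still takes
the allocating path in this model, so only pure overwrites skip the txn.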

Below numbers were calculated on a QEMU setup on ppc64 box with simulated
pmem (fsdax) device. 

Didn't observe any new failures with this patch in xfstests "-g quick,dax"

Performance numbers with different threads - (~10x improvement)
==

[Bar chart: vanilla_kernel (kIOPS), randomwrite. X axis: Threads (1, 2, 4, 8, 12, 16). Y axis: 25 to 60 kIOPS.]

[Bar chart: patched_kernel (kIOPS), randomwrite. X axis: Threads (1, 2, 4, 8, 12, 16). Y axis: 0 to 600 kIOPS.]

fio script
==
[global]
rw=randwrite
norandommap=1
invalidate=0
bs=4k
numjobs=16  --> changed this for different thread options
time_based=1
ramp_time=30
runtime=60
group_reporting=1
ioengine=psync
direct=1
size=16G
filename=file1.0.0:file1.0.1:file1.0.2:file1.0.3:file1.0.4:file1.0.5:file1.0.6:file1.0.7:file1.0.8:file1.0.9:file1.0.10:file1.0.11:file1.0.12:file1.0.13:file1.0.14:file1.0.15:file1.0.16:file1.0.17:file1.0.18:file1.0.19:file1.0.20:file1.0.21:file1.0.22:file1.0.23:file1.0.24:file1.0.25:file1.0.26:file1.0.27:file1.0.28:file1.0.29:file1.0.30:file1.0.31
file_service_type=random
nrfiles=32
directory=/mnt/

[name]
directory=/mnt/
direct=1

NOTE:
==
1. Looking at the ~10x perf delta, I probed a bit deeper to understand what's
causing this scalability problem. It seems that when we start a jbd2 txn, the
slab alloc code observes some serious contention around a spinlock.

I think that the spinlock contention 

[PATCHv3 1/1] ext4: Optimize file overwrites

2020-09-17 Thread Ritesh Harjani
If the file already has underlying blocks/extents allocated,
then we don't need to start a journal txn and can directly return
the underlying mapping. Currently ext4_iomap_begin() is used by
both the DAX & DIO paths. We can check if the write request is an
overwrite & then directly return the mapping information.

This could give a significant perf boost for multi-threaded writes,
especially random overwrites.
On a PPC64 VM with a simulated pmem (DAX) device, a ~10x perf improvement
could be seen in random writes (overwrite). This is also because it optimizes
away the spinlock contention during jbd2 slab cache allocation
(jbd2_journal_handle). On an x86 VM, a ~2x perf improvement was observed.

Reported-by: Dan Williams 
Suggested-by: Jan Kara 
Signed-off-by: Ritesh Harjani 
---
 fs/ext4/inode.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 10dd470876b3..6eae17758ece 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3437,14 +3437,26 @@ static int ext4_iomap_begin(struct inode *inode, loff_t 
offset, loff_t length,
map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
  EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
 
-   if (flags & IOMAP_WRITE)
+   if (flags & IOMAP_WRITE) {
+   /*
+* We check here if the blocks are already allocated, then we
+* don't need to start a journal txn and we can directly return
+* the mapping information. This could boost performance
+* especially in multi-threaded overwrite requests.
+*/
+   if (offset + length <= i_size_read(inode)) {
+   ret = ext4_map_blocks(NULL, inode, &map, 0);
+   if (ret > 0 && (map.m_flags & EXT4_MAP_MAPPED))
+   goto out;
+   }
ret = ext4_iomap_alloc(inode, &map, flags);
-   else
+   } else {
ret = ext4_map_blocks(NULL, inode, &map, 0);
+   }
 
if (ret < 0)
return ret;
-
+out:
ext4_set_iomap(inode, iomap, &map, offset, length);
 
return 0;
-- 
2.26.2



Re: [PATCH v3 0/7] PHY: Prepare Cadence Torrent PHY driver to support multilink configurations

2020-09-17 Thread Vinod Koul
On 16-09-20, 20:28, Swapnil Jakhade wrote:
> Cadence Torrent PHY is a multiprotocol PHY supporting different multilink
> PHY configurations including DisplayPort, PCIe, USB, SGMII, QSGMII etc.
> Existing Torrent PHY driver supports only DisplayPort. This patch series
> prepares Torrent PHY driver so that different multilink configurations can
> be supported. It also updates DT bindings accordingly. This doesn't affect
> ABI as Torrent PHY driver has never been functional, and therefore do not
> exist in any active use case.
> 
> Support for different multilink configurations with register sequences for
> protocols above will be added in a separate patch series.

Applied, thanks

-- 
~Vinod


Re: [PATCH] gpio: aspeed: fix ast2600 bank properties

2020-09-17 Thread Tao Ren
On Thu, Sep 17, 2020 at 08:42:27AM +0930, Andrew Jeffery wrote:
> 
> 
> On Thu, 17 Sep 2020, at 06:12, rentao.b...@gmail.com wrote:
> > From: Tao Ren 
> > 
> > GPIO_U is mapped to the least significant byte of input/output mask, and
> > the byte in "output" mask should be 0 because GPIO_U is input only. All
> > the other bits need to be 1 because GPIO_V/W/X support both input and
> > output modes.
> > 
> > Similarly, GPIO_Y/Z are mapped to the 2 least significant bytes, and the
> > according bits need to be 1 because GPIO_Y/Z support both input and
> > output modes.
> > 
> > Fixes: ab4a85534c3e ("gpio: aspeed: Add in ast2600 details to Aspeed 
> > driver")
> > Signed-off-by: Tao Ren 
> 
> Thanks Tao,
> 
> Reviewed-by: Andrew Jeffery 

Thanks Andrew for the quick review.

Cheers,

Tao
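
The masks described in the commit message can be derived mechanically. A
small sketch, assuming each bank packs up to four 8-bit ports into one
32-bit word with the first port in the least significant byte. bank_mask()
and its bitmap argument are hypothetical helpers, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a 32-bit bank mask from a 4-bit port bitmap: bit i set means the
 * i-th port of the bank (byte i of the word) supports the direction.
 */
static uint32_t bank_mask(unsigned int port_bitmap)
{
	uint32_t mask = 0;
	unsigned int i;

	assert((port_bitmap & ~0xfu) == 0);	/* at most four ports per bank */

	for (i = 0; i < 4; i++) {
		if (port_bitmap & (1u << i))
			mask |= UINT32_C(0xff) << (8 * i);
	}
	return mask;
}
```

For the U/V/W/X bank, all four ports are inputs (bitmap 0xf) but only V/W/X
are outputs (bitmap 0xe), giving 0xffffffff and 0xffffff00. For the Y/Z bank
only the two low ports exist (bitmap 0x3), giving 0x0000ffff for both.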


Re: [PATCH 5/6] scsi: ufs: show ufs part info in error case

2020-09-17 Thread Jaegeuk Kim
On 09/17, Can Guo wrote:
> On 2020-09-17 00:05, Jaegeuk Kim wrote:
> > On 09/16, Bean Huo wrote:
> > > On Tue, 2020-09-15 at 13:45 -0700, Jaegeuk Kim wrote:
> > > > Cc: Avri Altman 
> > > > Signed-off-by: Jaegeuk Kim 
> > > > ---
> > > >  drivers/scsi/ufs/ufshcd.c | 8 
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > > index bdc82cc3824aa..b81c116b976ff 100644
> > > > --- a/drivers/scsi/ufs/ufshcd.c
> > > > +++ b/drivers/scsi/ufs/ufshcd.c
> > > > @@ -500,6 +500,14 @@ static void ufshcd_print_tmrs(struct ufs_hba
> > > > *hba, unsigned long bitmap)
> > > >  static void ufshcd_print_host_state(struct ufs_hba *hba)
> > > >  {
> > > > dev_err(hba->dev, "UFS Host state=%d\n", hba->ufshcd_state);
> > > > +   if (hba->sdev_ufs_device) {
> > > > +   dev_err(hba->dev, " vendor = %.8s\n",
> > > > +   hba->sdev_ufs_device-
> > > > >vendor);
> > > > +   dev_err(hba->dev, " model = %.16s\n",
> > > > +   hba->sdev_ufs_device->model);
> > > > +   dev_err(hba->dev, " rev = %.4s\n",
> > > > +   hba->sdev_ufs_device->rev);
> > > > +   }
> > > 
> > > Hi Jaegeuk
> > > these prints have been added since this change:
> > > 
> > > commit 3f8af6044713 ("scsi: ufs: Add some debug information to
> > > ufshcd_print_host_state()")
> > > 
> > > https://patchwork.kernel.org/patch/11694371/
> > 
> > Cool, thank you for pointing this out. BTW, which branch can I see the
> > -next
> > patches?
> > 
> 
> Hi Jaegeuk,
> 
> This patch comes from a series of changes trying to fix and simplify
> the UFS error handling. You can find the whole series here - they are
> picked up on scsi-queue-5.10
> 
> https://lore.kernel.org/linux-scsi/1596975355-39813-10-git-send-email-c...@codeaurora.org/
> 
> Besides, several more fixes for error handling based on above series are
> 
> https://lore.kernel.org/patchwork/patch/1290405/
> &
> https://lore.kernel.org/linux-scsi/159961731708.5787.8825955850640714260.b4...@oracle.com/
> 
> I've mainline all above changes to Android12-5.4 and Android11-5.4.

I've seen the patches in Android branches. Thank you for the explanation.

> 
> Moreover, there are 2 more fixes on the way for error handling, I
> will push them soon.

BTW, could you please take a look at these patches?

Thanks,

> 
> Thanks,
> 
> Can Guo.
> 
> > > 
> > > Thanks,
> > > Bean


Re: [PATCH 4/6] scsi: ufs: fix LINERESET on hibern8

2020-09-17 Thread Jaegeuk Kim
Please ignore this patch.
Thanks.

On 09/15, Jaegeuk Kim wrote:
> From: Jaegeuk Kim 
> 
> When testing infinite test to read sysfs entries of UFS, I got a UFS timeout
> with the following kernel message.
> 
> query: dev_cmd_send: seq_no=78082 tag=31, idn=2
> query: ufshcd_wait_for_dev_cmd: dev_cmd request timedout, tag 31
> query: __ufshcd_query_descriptor: opcode 0x01 for idn 2 failed, index 0, err 
> = -11
>  --  hibern8: dme: dme_send: cmd_id=0x17 idn=0
> query: ufshcd_query_descriptor: failed with error -11, retries 3
>  --  hibern8: ufshcd_update_uic_error: LINERESET during hibern8 enter
>  --  hibern8: __ufshcd_uic_hibern8_enter: hibern8 enter failed. ret = -110
> 
> The problem is caused by the hibern8 command issued by ufshcd_suspend(), which is
> not aware of the query command. If autohibern8 is enabled, we actually don't need
> to issue the hibern8 command from suspend.
> 
> Cc: Alim Akhtar 
> Cc: Avri Altman 
> Signed-off-by: Jaegeuk Kim 
> ---
>  drivers/scsi/ufs/ufshcd.c | 20 ++--
>  drivers/scsi/ufs/ufshcd.h |  1 +
>  2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 848e33ec40639..bdc82cc3824aa 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -3079,8 +3079,12 @@ int ufshcd_query_descriptor_retry(struct ufs_hba *hba,
>   int retries;
>  
>   for (retries = QUERY_REQ_RETRIES; retries > 0; retries--) {
> - err = __ufshcd_query_descriptor(hba, opcode, idn, index,
> + err = -EAGAIN;
> + down_read(>query_lock);
> + if (!ufshcd_is_link_hibern8(hba))
> + err = __ufshcd_query_descriptor(hba, opcode, idn, index,
>   selector, desc_buf, buf_len);
> + up_read(>query_lock);
>   if (!err || err == -EINVAL)
>   break;
>   }
> @@ -8263,8 +8267,8 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum 
> ufs_pm_op pm_op)
>   enum ufs_pm_level pm_lvl;
>   enum ufs_dev_pwr_mode req_dev_pwr_mode;
>   enum uic_link_state req_link_state;
> + bool need_upwrite = false;
>  
> - hba->pm_op_in_progress = 1;
>   if (!ufshcd_is_shutdown_pm(pm_op)) {
>   pm_lvl = ufshcd_is_runtime_pm(pm_op) ?
>hba->rpm_lvl : hba->spm_lvl;
> @@ -8275,6 +8279,15 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum 
> ufs_pm_op pm_op)
>   req_link_state = UIC_LINK_OFF_STATE;
>   }
>  
> + if (ufshcd_is_runtime_pm(pm_op) &&
> + req_link_state == UIC_LINK_HIBERN8_STATE &&
> + hba->capabilities & MASK_AUTO_HIBERN8_SUPPORT) {
> + need_upwrite = true;
> + if (!down_write_trylock(>query_lock))
> + return -EBUSY;
> + }
> + hba->pm_op_in_progress = 1;
> +
>   /*
>* If we can't transition into any of the low power modes
>* just gate the clocks.
> @@ -8403,6 +8416,8 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum 
> ufs_pm_op pm_op)
>   }
>  
>   hba->pm_op_in_progress = 0;
> + if (need_upwrite)
> + up_write(>query_lock);
>  
>   if (ret)
>   ufshcd_update_reg_hist(>ufs_stats.suspend_err, (u32)ret);
> @@ -8894,6 +8909,7 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem 
> *mmio_base, unsigned int irq)
>   mutex_init(>dev_cmd.lock);
>  
>   init_rwsem(>clk_scaling_lock);
> + init_rwsem(>query_lock);
>  
>   ufshcd_init_clk_gating(hba);
>  
> diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
> index 363589c0bd370..6f8e05eaf9661 100644
> --- a/drivers/scsi/ufs/ufshcd.h
> +++ b/drivers/scsi/ufs/ufshcd.h
> @@ -754,6 +754,7 @@ struct ufs_hba {
>   bool is_urgent_bkops_lvl_checked;
>  
>   struct rw_semaphore clk_scaling_lock;
> + struct rw_semaphore query_lock;
>   unsigned char desc_size[QUERY_DESC_IDN_MAX];
>   atomic_t scsi_block_reqs_cnt;
>  
> -- 
> 2.28.0.618.gf4bc123cb7-goog


[PATCH net] hinic: fix sending pkts from core while self testing

2020-09-17 Thread Luo bin
Call netif_tx_disable() before starting the self-test to avoid the
networking core and the self-test sending packets simultaneously,
which may cause a self-test failure or abnormal hardware behaviour.

Fixes: 4aa218a4fe77 ("hinic: add self test support")
Signed-off-by: Luo bin 
---
 drivers/net/ethernet/huawei/hinic/hinic_ethtool.c | 4 
 drivers/net/ethernet/huawei/hinic/hinic_tx.c  | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c 
b/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
index 6bb65ade1d77..c340d9acba80 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_ethtool.c
@@ -1654,6 +1654,7 @@ static void hinic_diag_test(struct net_device *netdev,
}
 
netif_carrier_off(netdev);
+   netif_tx_disable(netdev);
 
err = do_lp_test(nic_dev, eth_test->flags, LP_DEFAULT_TIME,
 &test_index);
@@ -1662,9 +1663,12 @@ static void hinic_diag_test(struct net_device *netdev,
data[test_index] = 1;
}
 
+   netif_tx_wake_all_queues(netdev);
+
err = hinic_port_link_state(nic_dev, &link_state);
if (!err && link_state == HINIC_LINK_STATE_UP)
netif_carrier_on(netdev);
+
 }
 
 static int hinic_set_phys_id(struct net_device *netdev,
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c 
b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
index 2b418b568767..c1f81e9144a1 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
@@ -717,8 +717,8 @@ static int free_tx_poll(struct napi_struct *napi, int 
budget)
netdev_txq = netdev_get_tx_queue(txq->netdev, qp->q_id);
 
__netif_tx_lock(netdev_txq, smp_processor_id());
-
-   netif_wake_subqueue(nic_dev->netdev, qp->q_id);
+   if (!netif_testing(nic_dev->netdev))
+   netif_wake_subqueue(nic_dev->netdev, qp->q_id);
 
__netif_tx_unlock(netdev_txq);
 
-- 
2.17.1



Re: [RFC PATCH 00/22] Enhance VHOST to enable SoC-to-SoC communication

2020-09-17 Thread Jason Wang



On 2020/9/16 下午7:47, Kishon Vijay Abraham I wrote:

Hi Jason,

On 16/09/20 8:40 am, Jason Wang wrote:

On 2020/9/15 下午11:47, Kishon Vijay Abraham I wrote:

Hi Jason,

On 15/09/20 1:48 pm, Jason Wang wrote:

Hi Kishon:

On 2020/9/14 下午3:23, Kishon Vijay Abraham I wrote:

Then you need something that is functional equivalent to virtio PCI
which is actually the concept of vDPA (e.g vDPA provides
alternatives if
the queue_sel is hard in the EP implementation).

Okay, I just tried to compare the 'struct vdpa_config_ops' and 'struct
vhost_config_ops' ( introduced in [RFC PATCH 03/22] vhost: Add ops for
the VHOST driver to configure VHOST device).

struct vdpa_config_ops {
  /* Virtqueue ops */
  int (*set_vq_address)(struct vdpa_device *vdev,
    u16 idx, u64 desc_area, u64 driver_area,
    u64 device_area);
  void (*set_vq_num)(struct vdpa_device *vdev, u16 idx, u32 num);
  void (*kick_vq)(struct vdpa_device *vdev, u16 idx);
  void (*set_vq_cb)(struct vdpa_device *vdev, u16 idx,
    struct vdpa_callback *cb);
  void (*set_vq_ready)(struct vdpa_device *vdev, u16 idx, bool
ready);
  bool (*get_vq_ready)(struct vdpa_device *vdev, u16 idx);
  int (*set_vq_state)(struct vdpa_device *vdev, u16 idx,
  const struct vdpa_vq_state *state);
  int (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
  struct vdpa_vq_state *state);
  struct vdpa_notification_area
  (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
  /* vq irq is not expected to be changed once DRIVER_OK is set */
  int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);

  /* Device ops */
  u32 (*get_vq_align)(struct vdpa_device *vdev);
  u64 (*get_features)(struct vdpa_device *vdev);
  int (*set_features)(struct vdpa_device *vdev, u64 features);
  void (*set_config_cb)(struct vdpa_device *vdev,
    struct vdpa_callback *cb);
  u16 (*get_vq_num_max)(struct vdpa_device *vdev);
  u32 (*get_device_id)(struct vdpa_device *vdev);
  u32 (*get_vendor_id)(struct vdpa_device *vdev);
  u8 (*get_status)(struct vdpa_device *vdev);
  void (*set_status)(struct vdpa_device *vdev, u8 status);
  void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
     void *buf, unsigned int len);
  void (*set_config)(struct vdpa_device *vdev, unsigned int offset,
     const void *buf, unsigned int len);
  u32 (*get_generation)(struct vdpa_device *vdev);

  /* DMA ops */
  int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb
*iotlb);
  int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
     u64 pa, u32 perm);
  int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);

  /* Free device resources */
  void (*free)(struct vdpa_device *vdev);
};

+struct vhost_config_ops {
+    int (*create_vqs)(struct vhost_dev *vdev, unsigned int nvqs,
+  unsigned int num_bufs, struct vhost_virtqueue *vqs[],
+  vhost_vq_callback_t *callbacks[],
+  const char * const names[]);
+    void (*del_vqs)(struct vhost_dev *vdev);
+    int (*write)(struct vhost_dev *vdev, u64 vhost_dst, void *src,
int len);
+    int (*read)(struct vhost_dev *vdev, void *dst, u64 vhost_src, int
len);
+    int (*set_features)(struct vhost_dev *vdev, u64 device_features);
+    int (*set_status)(struct vhost_dev *vdev, u8 status);
+    u8 (*get_status)(struct vhost_dev *vdev);
+};
+
struct virtio_config_ops
I think there's some overlap here and some of the ops tries to do the
same thing.

I think it differs in (*set_vq_address)() and (*create_vqs)().
[create_vqs() introduced in struct vhost_config_ops provides
complimentary functionality to (*find_vqs)() in struct
virtio_config_ops. It seemingly encapsulates the functionality of
(*set_vq_address)(), (*set_vq_num)(), (*set_vq_cb)(),..].

Back to the difference between (*set_vq_address)() and (*create_vqs)(),
set_vq_address() directly provides the virtqueue address to the vdpa
device but create_vqs() only provides the parameters of the virtqueue
(like the number of virtqueues, number of buffers) but does not
directly
provide the address. IMO the backend client drivers (like net or vhost)
shouldn't/cannot by itself know how to access the vring created on
virtio front-end. The vdpa device/vhost device should have logic for
that. That will help the client drivers to work with different types of
vdpa device/vhost device and can access the vring created by virtio
irrespective of whether the vring can be accessed via mmio or kernel
space or user space.

I think vdpa always works with client drivers in userspace and
providing
userspace address for vring.

Sorry for being unclear. What I meant is not replacing vDPA with the
vhost(bus) you proposed but the possibility of replacing virtio-pci-epf
with vDPA in:

Okay, so the virtio back-end still use vhost and front end 

Re: [PATCH] iomap: Fix the write_count in iomap_add_to_ioend().

2020-09-17 Thread Darrick J. Wong
On Thu, Sep 17, 2020 at 03:48:04PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 17, 2020 at 06:42:19AM -0400, Brian Foster wrote:
> > That wouldn't address the latency concern Dave brought up. That said, I
> > have no issue with this as a targeted solution for the softlockup issue.
> > iomap_finish_ioend[s]() is common code for both the workqueue and
> > ->bi_end_io() contexts so that would require either some kind of context
> > detection (and my understanding is in_atomic() is unreliable/frowned
> > upon) or a new "atomic" parameter through iomap_finish_ioend[s]() to
> > indicate whether it's safe to reschedule. Preference?
> 
> True, it would not help with latency.  But then again the latency
> should be controlled by the writeback code not doing giant writebacks
> to start with, shouldn't it?
> 
> Any XFS/iomap specific limit also would not help with the block layer
> merging bios.

/me hasn't totally been following this thread, but iomap will also
aggregate the ioend completions; do we need to cap that to keep
latencies down?  I was assuming that amortization was always favorable,
but maybe not?

--D
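
The "atomic" parameter option mentioned in the quoted text could look
roughly like the userspace model below. This is only a sketch of the idea:
finish_ioends_model(), the batch size and the stats struct are all
hypothetical, and a counter stands in for where a reschedule point would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct finish_stats {
	size_t completed;	/* ioends processed */
	size_t resched_points;	/* places where the caller could yield */
};

/*
 * Walk nr_ioends completions. When the caller says the context is not
 * atomic, mark a possible reschedule point after every `batch` completions;
 * in atomic context never yield.
 */
static struct finish_stats finish_ioends_model(size_t nr_ioends, size_t batch,
					       bool atomic)
{
	struct finish_stats st = { 0, 0 };
	size_t i;

	assert(batch > 0);

	for (i = 0; i < nr_ioends; i++) {
		st.completed++;
		if (!atomic && st.completed % batch == 0)
			st.resched_points++;
	}
	return st;
}
```

The flag only addresses the softlockup side: in the atomic case the loop
still runs to completion, so any latency cap would have to come from
limiting how much is aggregated in the first place.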


RE: [PATCH v3] drm/bridge: add it6505 driver

2020-09-17 Thread allen.chen
It has been about two weeks since I posted v3 and I haven't heard anything.
Consider this a gentle ping.

If there is anything that needs fixing, I will fix it and send it again.

Thanks.

-----Original Message-----
From: Allen Chen (陳柏宇) 
Sent: Friday, September 04, 2020 10:10 AM
Cc: Allen Chen (陳柏宇); Kenneth Hung (洪家倫); Jau-Chih Tseng (曾昭智); Hermes Wu 
(吳佳宏); Pi-Hsun Shih; Jitao Shi; Yilun Lin; Hermes Wu (吳佳宏); kernel test robot; 
Andrzej Hajda; Neil Armstrong; Laurent Pinchart; Jonas Karlman; Jernej Skrabec; 
David Airlie; Daniel Vetter; Matthias Brugger; open list; open list:DRM 
DRIVERS; moderated list:ARM/Mediatek SoC support; moderated list:ARM/Mediatek 
SoC support
Subject: [PATCH v3] drm/bridge: add it6505 driver

This adds support for the iTE IT6505.
This device can convert DPI signal to DP output.

From: Allen Chen 
Signed-off-by: Jitao Shi 
Signed-off-by: Pi-Hsun Shih 
Signed-off-by: Yilun Lin 
Signed-off-by: Hermes Wu 
Signed-off-by: Allen Chen 
Reported-by: kernel test robot 
---
 drivers/gpu/drm/bridge/Kconfig  |7 +
 drivers/gpu/drm/bridge/Makefile |1 +
 drivers/gpu/drm/bridge/ite-it6505.c | 3338 +++
 3 files changed, 3346 insertions(+)
 create mode 100644 drivers/gpu/drm/bridge/ite-it6505.c

diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index 3e11af4e9f63e..f21dce3fabeb9 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -61,6 +61,13 @@ config DRM_LONTIUM_LT9611
  HDMI signals
  Please say Y if you have such hardware.
 
+config DRM_ITE_IT6505
+   tristate "ITE IT6505 DisplayPort bridge"
+   depends on OF
+   select DRM_KMS_HELPER
+   help
+ ITE IT6505 DisplayPort bridge chip driver.
+
 config DRM_LVDS_CODEC
tristate "Transparent LVDS encoders and decoders support"
depends on OF
diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
index c589a6a7cbe1d..8a118fd901ad7 100644
--- a/drivers/gpu/drm/bridge/Makefile
+++ b/drivers/gpu/drm/bridge/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_DRM_CDNS_DSI) += cdns-dsi.o
 obj-$(CONFIG_DRM_CHRONTEL_CH7033) += chrontel-ch7033.o
 obj-$(CONFIG_DRM_DISPLAY_CONNECTOR) += display-connector.o
 obj-$(CONFIG_DRM_LONTIUM_LT9611) += lontium-lt9611.o
+obj-$(CONFIG_DRM_ITE_IT6505) += ite-it6505.o
 obj-$(CONFIG_DRM_LVDS_CODEC) += lvds-codec.o
 obj-$(CONFIG_DRM_MEGACHIPS_STDP_GE_B850V3_FW) += 
megachips-stdp-ge-b850v3-fw.o
 obj-$(CONFIG_DRM_NXP_PTN3460) += nxp-ptn3460.o
diff --git a/drivers/gpu/drm/bridge/ite-it6505.c 
b/drivers/gpu/drm/bridge/ite-it6505.c
new file mode 100644
index 0..0ed19673431ee
--- /dev/null
+++ b/drivers/gpu/drm/bridge/ite-it6505.c
@@ -0,0 +1,3338 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/*
+ * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define REG_IC_VER 0x04
+
+#define REG_RESET_CTRL 0x05
+#define VIDEO_RESET BIT(0)
+#define AUDIO_RESET BIT(1)
+#define ALL_LOGIC_RESET BIT(2)
+#define AUX_RESET BIT(3)
+#define HDCP_RESET BIT(4)
+
+#define INT_STATUS_01 0x06
+#define INT_MASK_01 0x09
+#define INT_HPD_CHANGE BIT(0)
+#define INT_RECEIVE_HPD_IRQ BIT(1)
+#define INT_SCDT_CHANGE BIT(2)
+#define INT_HDCP_FAIL BIT(3)
+#define INT_HDCP_DONE BIT(4)
+
+#define INT_STATUS_02 0x07
+#define INT_MASK_02 0x0A
+#define INT_AUX_CMD_FAIL BIT(0)
+#define INT_HDCP_KSV_CHECK BIT(1)
+#define INT_AUDIO_FIFO_ERROR BIT(2)
+
+#define INT_STATUS_03 0x08
+#define INT_MASK_03 0x0B
+#define INT_LINK_TRAIN_FAIL BIT(4)
+#define INT_VID_FIFO_ERROR BIT(5)
+#define INT_IO_LATCH_FIFO_OVERFLOW BIT(7)
+
+#define REG_SYSTEM_STS 0x0D
+#define INT_STS BIT(0)
+#define HPD_STS BIT(1)
+#define VIDEO_STB BIT(2)
+
+#define REG_LINK_TRAIN_STS 0x0E
+#define LINK_STATE_CR BIT(2)
+#define LINK_STATE_EQ BIT(3)
+#define LINK_STATE_NORP BIT(4)
+
+#define REG_BANK_SEL 0x0F
+#define REG_CLK_CTRL0 0x10
+#define M_PCLK_DELAY 0x03
+
+#define REG_AUX_OPT 0x11
+#define AUX_AUTO_RST BIT(0)
+#define AUX_FIX_FREQ BIT(3)
+
+#define REG_DATA_CTRL0 0x12
+#define VIDEO_LATCH_EDGE BIT(4)
+#define ENABLE_PCLK_COUNTER BIT(7)
+
+#define REG_PCLK_COUNTER_VALUE 0x13
+
+#define REG_501_FIFO_CTRL 0x15
+#define RST_501_FIFO BIT(1)
+
+#define REG_TRAIN_CTRL0 0x16
+#define FORCE_LBR BIT(0)
+#define LANE_COUNT_MASK 0x06
+#define LANE_SWAP BIT(3)
+#define SPREAD_AMP_5 BIT(4)
+#define FORCE_CR_DONE BIT(5)
+#define FORCE_EQ_DONE BIT(6)
+
+#define REG_TRAIN_CTRL1 0x17
+#define AUTO_TRAIN BIT(0)
+#define MANUAL_TRAIN BIT(1)
+#define FORCE_RETRAIN BIT(2)
+
+#define REG_AUX_CTRL 0x23
+#define CLR_EDID_FIFO BIT(0)
+#define AUX_USER_MODE BIT(1)
+#define AUX_NO_SEGMENT_WR BIT(6)
+#define 

Re: [PATCH v12 3/9] x86: kdump: use macro CRASH_ADDR_LOW_MAX in functions reserve_crashkernel[_low]()

2020-09-17 Thread chenzhou
Hi Dave,


On 2020/9/18 11:01, Dave Young wrote:
> On 09/07/20 at 09:47pm, Chen Zhou wrote:
>> To make the functions reserve_crashkernel[_low]() as generic,
>> replace some hard-coded numbers with macro CRASH_ADDR_LOW_MAX.
>>
>> Signed-off-by: Chen Zhou 
>> ---
>>  arch/x86/kernel/setup.c | 11 ++-
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index d7fd90c52dae..71a6a6e7ca5b 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -430,7 +430,7 @@ static int __init reserve_crashkernel_low(void)
>>  unsigned long total_low_mem;
>>  int ret;
>>  
>> -total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
>> +total_low_mem = memblock_mem_size(CRASH_ADDR_LOW_MAX >> PAGE_SHIFT);
> total_low_mem != CRASH_ADDR_LOW_MAX
I just replaced the magic number with the macro; there is no other change.
Besides, memblock_mem_size(limit_pfn) computes the memory size
according to the actual system RAM.

Thanks,
Chen Zhou
>
>>  
>>  /* crashkernel=Y,low */
>>  ret = parse_crashkernel_low(boot_command_line, total_low_mem, 
>> _size, );
> The param total_low_mem is for dynamically change crash_size according
> to system ram size.
>
> Is above change a must for your arm64 patches?
See above.
>
>> @@ -451,7 +451,7 @@ static int __init reserve_crashkernel_low(void)
>>  return 0;
>>  }
>>  
>> -low_base = memblock_find_in_range(CRASH_ALIGN, 1ULL << 32, low_size, 
>> CRASH_ALIGN);
>> +low_base = memblock_find_in_range(CRASH_ALIGN, CRASH_ADDR_LOW_MAX, 
>> low_size, CRASH_ALIGN);
>>  if (!low_base) {
>>  pr_err("Cannot reserve %ldMB crashkernel low memory, please try 
>> smaller size.\n",
>> (unsigned long)(low_size >> 20));
>> @@ -504,8 +504,9 @@ static void __init reserve_crashkernel(void)
>>  if (!crash_base) {
>>  /*
>>   * Set CRASH_ADDR_LOW_MAX upper bound for crash memory,
>> - * crashkernel=x,high reserves memory over 4G, also allocates
>> - * 256M extra low memory for DMA buffers and swiotlb.
>> + * crashkernel=x,high reserves memory over CRASH_ADDR_LOW_MAX,
>> + * also allocates 256M extra low memory for DMA buffers
>> + * and swiotlb.
>>   * But the extra memory is not required for all machines.
>>   * So try low memory first and fall back to high memory
>>   * unless "crashkernel=size[KMG],high" is specified.
>> @@ -539,7 +540,7 @@ static void __init reserve_crashkernel(void)
>>  return;
>>  }
>>  
>> -if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
>> +if (crash_base >= CRASH_ADDR_LOW_MAX && reserve_crashkernel_low()) {
>>  memblock_free(crash_base, crash_size);
>>  return;
>>  }
>> -- 
>> 2.20.1
>>
> .
>
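
To illustrate the first hunk discussed above: assuming CRASH_ADDR_LOW_MAX
is 4G on x86_64 (which the diff implies by substituting it for 1ULL << 32)
and 4K pages, the old and new arguments to memblock_mem_size() describe the
same page-frame limit. A minimal userspace sketch, with both constants
redefined locally as assumptions and hypothetical helper names:

```c
/* Assumed values for illustration; the real definitions live in kernel headers. */
#define PAGE_SHIFT		12		/* 4K pages (assumption) */
#define CRASH_ADDR_LOW_MAX	(1ULL << 32)	/* 4G, as implied by the diff */

/* Page-frame limit as written before the patch. */
static unsigned long long old_limit_pfn(void)
{
	return 1ULL << (32 - PAGE_SHIFT);
}

/* Page-frame limit as written after the patch. */
static unsigned long long new_limit_pfn(void)
{
	return CRASH_ADDR_LOW_MAX >> PAGE_SHIFT;
}
```

Under these assumptions both helpers return the same value, so the hunk
only changes the spelling of the limit, not the limit itself.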



Re: linux-next: Tree for Sep 17 (netdevice.h: net_has_fallback_tunnels when SYSCTL is not set)

2020-09-17 Thread महेश बंडेवार
On Thu, Sep 17, 2020 at 1:33 PM Randy Dunlap  wrote:
>
> On 9/17/20 3:23 AM, Stephen Rothwell wrote:
> > Hi all,
> >
> > Changes since 20200916:
> >
>
> I am seeing build errors when CONFIG_SYSCTL is not set:
>
> ld: net/ipv4/ip_tunnel.o: in function `ip_tunnel_init_net':
> ip_tunnel.c:(.text+0x2ea0): undefined reference to 
> `sysctl_fb_tunnels_only_for_init_net'
> ld: net/ipv6/ip6_vti.o: in function `vti6_init_net':
> ip6_vti.c:(.text+0x1b56): undefined reference to 
> `sysctl_fb_tunnels_only_for_init_net'
> ld: net/ipv6/sit.o: in function `sit_init_net':
> sit.c:(.text+0x4568): undefined reference to 
> `sysctl_fb_tunnels_only_for_init_net'
> ld: net/ipv6/ip6_tunnel.o: in function `ip6_tnl_init_net':
> ip6_tunnel.c:(.text+0x27d6): undefined reference to 
> `sysctl_fb_tunnels_only_for_init_net'
> ld: net/ipv6/ip6_gre.o: in function `ip6gre_init_net':
> ip6_gre.c:(.text+0x3a5e): undefined reference to 
> `sysctl_fb_tunnels_only_for_init_net'
>
> due to 316cdaa1158af:
>
> commit 316cdaa1158af17250397054f92bb339fbd8e282
> Author: Mahesh Bandewar 
> Date:   Wed Aug 26 09:05:35 2020 -0700
>
> net: add option to not create fall-back tunnels in root-ns as well
>
>
> This was first reported to netdev@ on Sept. 02 but Mahesh was not cc-ed
> on that report.
>
Thanks Randy for the report.
Probably we shouldn't have removed the !IS_ENABLED(CONFIG_SYSCTL) check.
Let me cook a fix and send it.
>
> --
> ~Randy
> Reported-by: Randy Dunlap 


[PATCH -next] drm/amd/display: remove unused variable in amdgpu_dm.c

2020-09-17 Thread Yang Yingliang
Fix the compile warning:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:892:26: warning: 
variable ‘stream’ set but not used [-Wunused-but-set-variable]
  struct dc_stream_state *stream;
  ^~

Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bb1bc7f5d149..7d9e8c311879 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -889,7 +889,6 @@ static void 
amdgpu_check_debugfs_connector_property_change(struct amdgpu_device
struct drm_connector_state *conn_state;
struct dm_crtc_state *acrtc_state;
struct drm_crtc_state *crtc_state;
-   struct dc_stream_state *stream;
struct drm_device *dev = adev_to_drm(adev);
 
list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
@@ -906,8 +905,6 @@ static void 
amdgpu_check_debugfs_connector_property_change(struct amdgpu_device
if (!(acrtc_state && acrtc_state->stream))
continue;
 
-   stream = acrtc_state->stream;
-
if (amdgpu_dm_connector->dsc_settings.dsc_force_enable ||
amdgpu_dm_connector->dsc_settings.dsc_num_slices_v ||
amdgpu_dm_connector->dsc_settings.dsc_num_slices_h ||
-- 
2.25.1



Re: [PATCH 1/7] usb: mtu3: convert to devm_platform_ioremap_resource_byname

2020-09-17 Thread Chunfeng Yun
Hi Felipe,


On Mon, 2020-09-07 at 10:42 +0300, Felipe Balbi wrote:
> Hi,
> 
> Chunfeng Yun  writes:
> > Use devm_platform_ioremap_resource_byname() to simplify code
> >
> > Signed-off-by: Chunfeng Yun 
> 
> why is it so that your patches always come base64 encoded? They look
> fine on the email client, but when I try to pipe the message to git am
> it always gives me a lot of trouble and I have to manually decode the
> body of your messages and recombine with the patch.
> 
> Can you try to send your patches as actual plain text without encoding
> the body with base64?
Sorry, I missed this email.

Sorry for the inconvenience!
Is only the commit message base64 encoded, or the code as well?

> 



Re: [PATCH 1/2] locktorture: doesn't check nreaders_stress when no readlock support

2020-09-17 Thread Paul E. McKenney
On Fri, Sep 18, 2020 at 09:13:14AM +0800, Hou Tao wrote:
> Hi Paul,
> 
> On 2020/9/18 0:58, Paul E. McKenney wrote:
> > On Thu, Sep 17, 2020 at 09:59:09PM +0800, Hou Tao wrote:
> >> To ensure there is always at least one locking thread.
> >>
> >> Signed-off-by: Hou Tao 
> >> ---
> >>  kernel/locking/locktorture.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
> >> index 9cfa5e89cff7f..bebdf98e6cd78 100644
> >> --- a/kernel/locking/locktorture.c
> >> +++ b/kernel/locking/locktorture.c
> >> @@ -868,7 +868,8 @@ static int __init lock_torture_init(void)
> >>goto unwind;
> >>}
> >>  
> >> -  if (nwriters_stress == 0 && nreaders_stress == 0) {
> >> +  if (nwriters_stress == 0 &&
> >> +  (!cxt.cur_ops->readlock || nreaders_stress == 0)) {
> > 
> > You lost me on this one.  How does it help to allow tests with zero
> > writers on exclusive locks?  Or am I missing something subtle here?
> > 
> The purpose is to prohibit test with only readers on exclusive locks, not 
> allow it.
> 
> So if the module parameters are "torture_type=mutex_lock nwriters_stress=0 
> nreaders_stress=3",
> locktorture can fail early instead of continuing but doing nothing useful.

Very good!

Now please make that clear in the commit log.  (Your English looks to
me to be more than equal to that challenge.)

In this commit log, please first state what is wrong.  Then what the
change is and how it improves things.

Thanx, Paul

> Regards,
> Tao
> 
> > Thanx, Paul
> > 
> >>pr_alert("lock-torture: must run at least one locking 
> >> thread\n");
> >>firsterr = -EINVAL;
> >>goto unwind;
> >> -- 
> >> 2.25.0.4.g0ad7144999
> >>
> > .
> > 
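
The condition under discussion can be modelled outside the kernel. The
sketch below is only an illustration: the helper name and its parameters
are made up here, with has_readlock standing in for cxt.cur_ops->readlock
being non-NULL.

```c
#include <stdbool.h>

/*
 * Returns true if the given parameters would start at least one useful
 * locking thread. Readers only count when the lock type supports a
 * read-side lock.
 */
static bool has_locking_thread(int nwriters_stress, int nreaders_stress,
			       bool has_readlock)
{
	if (nwriters_stress == 0 &&
	    (!has_readlock || nreaders_stress == 0))
		return false;

	return true;
}
```

With an exclusive lock (no read-side lock), nwriters_stress=0 and
nreaders_stress=3, the helper returns false, which corresponds to the
early -EINVAL described above.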


Re: [PATCH] selftests/harness: Flush stdout before forking

2020-09-17 Thread Michael Ellerman
Shuah Khan  writes:
> On 9/16/20 10:53 PM, Max Filippov wrote:
>> On Wed, Sep 16, 2020 at 9:16 PM Michael Ellerman  wrote:
>>>
>>> The test harness forks() a child to run each test. Both the parent and
>>> the child print to stdout using libc functions. That can lead to
>>> duplicated (or more) output if the libc buffers are not flushed before
>>> forking.
>>>
>>> It's generally not seen when running programs directly, because stdout
>>> will usually be line buffered when it's pointing to a terminal.
>>>
>>> This was noticed when running the seccomp_bpf test, eg:
>>>
>>>$ ./seccomp_bpf | tee test.log
>>>$ grep -c "TAP version 13" test.log
>>>2
>>>
>>> But we only expect the TAP header to appear once.
>>>
>>> It can be exacerbated using stdbuf to increase the buffer size:
>>>
>>>$ stdbuf -o 1MB ./seccomp_bpf > test.log
>>>$ grep -c "TAP version 13" test.log
>>>13
>>>
>>> The fix is simple, we just flush stdout & stderr before fork. Usually
>>> stderr is unbuffered, but that can be changed, so flush it as well
>>> just to be safe.
>>>
>>> Signed-off-by: Michael Ellerman 
>>> ---
>>>   tools/testing/selftests/kselftest_harness.h | 5 +
>>>   1 file changed, 5 insertions(+)
>> 
>> Tested-by: Max Filippov 
>
> Thank you both. Applying to linux-kselftest fixes for 5.9-rc7

It can wait for v5.10 IMHO, but up to you.

cheers
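
The duplication described in the quoted commit message can be reproduced
with a small userspace sketch. This is not the harness code: the function
name is made up, and a fully buffered temporary file stands in for a
redirected stdout. The header is written once, the process forks, and the
function counts how often the header ends up in the file.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define HEADER "TAP version 13\n"

/* Returns how many copies of HEADER reach the file, or -1 on error. */
static int count_header_copies(int flush_before_fork)
{
	FILE *out = tmpfile();
	char line[128];
	int copies = 0;
	pid_t pid;

	if (!out)
		return -1;

	/* Fully buffered, like stdout when it points to a pipe or a file. */
	setvbuf(out, NULL, _IOFBF, 4096);
	fputs(HEADER, out);

	if (flush_before_fork)
		fflush(out);	/* empty the buffer so the child inherits nothing */

	pid = fork();
	if (pid < 0) {
		fclose(out);
		return -1;
	}
	if (pid == 0) {
		/* Child: its copy of the buffer may still hold HEADER. */
		fputs("child\n", out);
		fflush(out);
		_exit(0);
	}

	waitpid(pid, NULL, 0);
	fputs("parent\n", out);
	fflush(out);

	rewind(out);
	while (fgets(line, sizeof(line), out))
		if (strcmp(line, HEADER) == 0)
			copies++;

	fclose(out);
	return copies;
}
```

Calling count_header_copies(0) should show the header twice, because both
processes flush their own copy of the buffered data, while
count_header_copies(1) should show it once.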


[PATCH v3] EDAC/mc_sysfs: Add missing newlines when printing {max,dimm}_location

2020-09-17 Thread Xiongfeng Wang
Reading those sysfs entries gives:

  [root@localhost /]# cat /sys/devices/system/edac/mc/mc0/max_location
  memory 3 [root@localhost /]# cat 
/sys/devices/system/edac/mc/mc0/dimm0/dimm_location
  memory 0 [root@localhost /]#

Add newlines after the value it prints for better readability.

Signed-off-by: Xiongfeng Wang 
Signed-off-by: Borislav Petkov 
Suggested-by: Joe Perches 
---
 drivers/edac/edac_mc_sysfs.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 4e6aca5..2f9f1e7 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -474,8 +474,12 @@ static ssize_t dimmdev_location_show(struct device *dev,
 struct device_attribute *mattr, char *data)
 {
struct dimm_info *dimm = to_dimm(dev);
+   ssize_t count;
 
-   return edac_dimm_info_location(dimm, data, PAGE_SIZE);
+   count = edac_dimm_info_location(dimm, data, PAGE_SIZE);
+   count += scnprintf(data + count, PAGE_SIZE - count, "\n");
+
+   return count;
 }
 
 static ssize_t dimmdev_label_show(struct device *dev,
@@ -813,15 +817,23 @@ static ssize_t mci_max_location_show(struct device *dev,
 char *data)
 {
struct mem_ctl_info *mci = to_mci(dev);
-   int i;
+   int len = PAGE_SIZE;
char *p = data;
+   int i, n;
 
for (i = 0; i < mci->n_layers; i++) {
-   p += sprintf(p, "%s %d ",
-edac_layer_name[mci->layers[i].type],
-mci->layers[i].size - 1);
+   n = scnprintf(p, len, "%s %d ",
+ edac_layer_name[mci->layers[i].type],
+ mci->layers[i].size - 1);
+   len -= n;
+   if (len <= 0)
+   goto out;
+
+   p += n;
}
 
+   p += scnprintf(p, len, "\n");
+out:
return p - data;
 }
 
-- 
1.7.12.4



[PATCH v2] arm64: Enable PCI write-combine resources under sysfs

2020-09-17 Thread Clint Sbisa
This change exposes write-combine mappings under sysfs for
prefetchable PCI resources on arm64.

Originally, the usage of "write combine" here was driven by the x86
definition of write combine. This definition is specific to x86 and
does not generalize to other architectures. However, the usage of WC
has mutated to "write combine" semantics, which is implemented
differently on each arch.

Generally, prefetchable BARs are accepted to allow speculative
accesses, write combining, and re-ordering-- from the PCI perspective,
this means there are no read side effects. (This contradicts the PCI
spec which allows prefetchable BARs to have read side effects, but
this definition is ill-advised as it is impossible to meet.) On x86,
prefetchable BARs are mapped as WC as originally defined (with some
conditionals on arch features). On arm64, WC is taken to mean normal
non-cacheable memory.

In practice, write combine semantics are used to minimize write
operations. A common usage of this is minimizing PCI TLPs which can
significantly improve performance with PCI devices. In order to
provide the same benefits to userspace, we need to allow userspace to
map prefetchable BARs with write combine semantics. The resourceX_wc
mapping is used today by userspace programs and libraries.

While this model is flawed as "write combine" is very ill-defined, it
is already used by multiple non-x86 archs to expose write combine
semantics to user space. We enable this on arm64 to give userspace on
arm64 an equivalent mechanism for utilizing write combining with PCI
devices.

Cc: Benjamin Herrenschmidt 
Cc: Bjorn Helgaas 
Cc: Catalin Marinas 
Cc: Jason Gunthorpe 
Cc: Lorenzo Pieralisi 
Cc: Will Deacon 
Signed-off-by: Clint Sbisa 
---
Changes in v2:
  - Rewrote the commit message.

 arch/arm64/include/asm/pci.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 70b323cf8300..b33ca260e3c9 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -17,6 +17,7 @@
 #define pcibios_assign_all_busses() \
(pci_has_flag(PCI_REASSIGN_ALL_BUS))
 
+#define arch_can_pci_mmap_wc() 1
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE 1
 
 extern int isa_dma_bridge_buggy;
-- 
2.23.3
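
For readers unfamiliar with how userspace consumes the resourceX_wc
files, here is a minimal sketch. It assumes the usual sysfs layout
(/sys/bus/pci/devices/<BDF>/resource<N>_wc); the function name is made up,
and the device address, BAR index and length must come from the caller.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a write-combine PCI resource file into the calling process.
 * Returns the mapping, or NULL if the file cannot be opened or mapped.
 */
static void *map_wc_resource(const char *path, size_t len)
{
	void *addr;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping remains valid after the fd is closed */

	return addr == MAP_FAILED ? NULL : addr;
}
```

On a kernel without this patch the _wc file would simply not exist on
arm64, so the open() fails and the sketch returns NULL.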



Re: [PATCH] nvme: fix NULL pointer dereference

2020-09-17 Thread Tong Zhang
Please correct me if I am wrong.
After a bit more digging I found that a corrupted command_id is indeed
what causes this problem. Although the tag and command_id range is
checked as you said, the elements in rqs are not guaranteed to be
non-NULL. So although the range check passes, blk_mq_tag_to_rq() can
still return NULL. It is clear that the current sanitization is not
enough, and there is a further implication: when all rqs are populated,
a corrupted command_id may silently corrupt other data not belonging to
the current command.

- Tong

On Thu, Sep 17, 2020 at 8:44 PM Tong Zhang  wrote:
>
> Hmm..Yeah.. I see your point.
> I was naively thinking the command_id was the culprit.
>
> On Thu, Sep 17, 2020 at 1:14 PM Keith Busch  wrote:
> >
> > On Thu, Sep 17, 2020 at 12:56:59PM -0400, Tong Zhang wrote:
> > > The command_id in CQE is writable by NVMe controller, driver should
> > > check its sanity before using it.
> >
> > We already do that.
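
The failure mode described at the top of this message can be sketched
with a simplified model. All names below (struct tag_set, tag_to_rq(),
complete_command()) are hypothetical and only mimic the shape of the
block layer lookup; this is not the nvme driver's code.

```c
#include <stddef.h>

struct request {
	int id;
};

struct tag_set {
	unsigned int nr_tags;
	struct request **rqs;	/* slots may be NULL until populated */
};

/* The range check alone does not guarantee a non-NULL result. */
static struct request *tag_to_rq(const struct tag_set *tags, unsigned int tag)
{
	if (tag < tags->nr_tags)
		return tags->rqs[tag];

	return NULL;
}

/* Returns the request id, or -1 if the command_id maps to no request. */
static int complete_command(const struct tag_set *tags, unsigned int command_id)
{
	struct request *rq = tag_to_rq(tags, command_id);

	if (!rq)	/* in range, but the slot was never populated */
		return -1;

	return rq->id;
}
```

The NULL check only covers the unpopulated case. If every slot is
populated, a wrong but in-range command_id still resolves to some other
command's request, which is the silent corruption mentioned above.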


Re: [PATCH RFC 0/3] scsi: mpt: Refactor and port to dma_* interface

2020-09-17 Thread Martin K. Petersen


Alex,

>> Have you tested your changes?
>
> No, as I'm afraid I don't have the hardware.

QEMU supports it, I propose you try testing with that.

I hesitate to merge big changes to abandoned drivers unless they've been
tested. It's too easy to miss things during review...

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH] Only allow to set crash_kexec_post_notifiers on boot time

2020-09-17 Thread Dave Young
crash_kexec_post_notifiers enables running various panic notifiers
before the kdump kernel boots. This increases the risk of kdump failure.
It is well documented in kernel-parameters.txt. We do not recommend
enabling it together with kdump unless the user is really sure.
Distributions are also advised not to enable it by default without
users being aware of it.

But unfortunately it is enabled by default in systemd; see the
discussion in the systemd report below. We could not convince systemd
to change it:
https://github.com/systemd/systemd/issues/16661

We have actually received reports of kdump kernel hangs on both s390x
and powerpcle caused by the systemd change. Some x86 cases could have
the same cause (although that is in Hyper-V code instead of systemd,
and needs to be addressed separately).

Thus, to avoid the automatic enablement, just remove the write
permission of the parameter in sysfs.

Signed-off-by: Dave Young 
---
 kernel/panic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index aef8872ba843..bea44fc4eb3b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -695,7 +695,7 @@ core_param(panic, panic_timeout, int, 0644);
 core_param(panic_print, panic_print, ulong, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 core_param(panic_on_warn, panic_on_warn, int, 0644);
-core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
+core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0444);
 
 static int __init oops_setup(char *s)
 {
-- 
2.26.2



Re: [PATCH v2] mm/migrate: correct thp migration stats.

2020-09-17 Thread Anshuman Khandual
Hi Zi,

On 09/18/2020 02:34 AM, Zi Yan wrote:
> From: Zi Yan 
> 
> PageTransHuge returns true for both thp and hugetlb, so thp stats was
> counting both thp and hugetlb migrations. Exclude hugetlb migration by
> setting is_thp variable right.

Coincidentally, I had just detected this problem last evening and was
in the process of sending a patch this morning :) Nonetheless, thanks
for the patch.

Earlier there was a similar THP-HugeTLB ambiguity down the error path
as well. In hindsight, I should have noticed or remembered about this
earlier fix during the THP stats patch.

e6112fc30070 (mm/migrate.c: split only transparent huge pages when allocation 
fails)

> 
> Clean up thp handling code too when we are there.
> 
> Fixes: 1a5bae25e3cf ("mm/vmstat: add events for THP migration without split")
> Signed-off-by: Zi Yan 
> Reviewed-by: Daniel Jordan 
> Cc: Daniel Jordan 
> Cc: Anshuman Khandual 
> ---
>  mm/migrate.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 941b89383cf3..6bc9559afc70 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1445,7 +1445,7 @@ int migrate_pages(struct list_head *from, new_page_t 
> get_new_page,
>* Capture required information that might get lost
>* during migration.
>*/
> - is_thp = PageTransHuge(page);
> + is_thp = PageTransHuge(page) && !PageHuge(page);
>   nr_subpages = thp_nr_pages(page);
>   cond_resched();
>  
> @@ -1471,7 +1471,7 @@ int migrate_pages(struct list_head *from, new_page_t 
> get_new_page,
>* we encounter them after the rest of the list
>* is processed.
>*/
> - if (PageTransHuge(page) && !PageHuge(page)) {
> + if (is_thp) {
>   lock_page(page);
>   rc = split_huge_page_to_list(page, 
> from);
>   unlock_page(page);
> @@ -1480,8 +1480,7 @@ int migrate_pages(struct list_head *from, new_page_t 
> get_new_page,
>   nr_thp_split++;
>   goto retry;
>   }
> - }
> - if (is_thp) {
> +
>   nr_thp_failed++;
>   nr_failed += nr_subpages;
>   goto out;
> 

Moving the failure path inside the split path makes sense, now
that it is already established that the page is indeed a THP.

Reviewed-by: Anshuman Khandual 


Re: [PATCH -next] RDMA/mlx5: fix type warning of sizeof in __mlx5_ib_alloc_counters()

2020-09-17 Thread Liu Shixin
On 2020/9/18 1:33, Leon Romanovsky wrote:
> On Thu, Sep 17, 2020 at 02:24:51PM -0300, Jason Gunthorpe wrote:
>> On Thu, Sep 17, 2020 at 08:05:11PM +0300, Leon Romanovsky wrote:
>>> On Thu, Sep 17, 2020 at 09:38:06AM -0300, Jason Gunthorpe wrote:
 On Thu, Sep 17, 2020 at 12:08:10PM +0300, Leon Romanovsky wrote:
> On Thu, Sep 17, 2020 at 05:10:08PM +0800, Liu Shixin wrote:
>> sizeof() when applied to a pointer typed expression should give the
>> size of the pointed data, even if the data is a pointer.
>>
>> Signed-off-by: Liu Shixin 
 Needs a fixes line

>>  if (!cnts->names)
>>  return -ENOMEM;
>>
>>  cnts->offsets = kcalloc(num_counters,
>> -sizeof(cnts->offsets), GFP_KERNEL);
>> +sizeof(*cnts->offsets), GFP_KERNEL);
> This is not.
 Why not?
>>> cnts->offsets is array of pointers that we will set later.
>>> The "sizeof(*cnts->offsets)" will return the size of size_t, while we
>>> need to get "size_t *".
>> Then why isn't a pointer to size **?
>>
>> Something is rotten here
> No problem, I'll check.
I think cnts->offsets is a pointer to an array whose elements are size_t
rather than pointers, so the patch description does not match.
I think it should still be changed to sizeof(*cnts->offsets), but with a
different description.
>
>> Jason
> .
>
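
The sizeof question in this thread can be checked with a small userspace
sketch. It assumes, as suggested above, that offsets is a size_t * (an
array of size_t); struct counters and the helpers are made up for
illustration and are not the mlx5 code.

```c
#include <stddef.h>
#include <stdlib.h>

struct counters {
	size_t *offsets;	/* array of size_t, allocated at runtime */
};

/* Size of the pointer itself: what sizeof(cnts->offsets) yields. */
static size_t elem_size_from_pointer(const struct counters *c)
{
	return sizeof(c->offsets);
}

/* Size of one array element: what sizeof(*cnts->offsets) yields. */
static size_t elem_size_from_element(const struct counters *c)
{
	return sizeof(*c->offsets);
}

/* Allocate n zeroed elements using the element size. */
static int alloc_offsets(struct counters *c, size_t n)
{
	c->offsets = calloc(n, sizeof(*c->offsets));
	return c->offsets ? 0 : -1;
}
```

On platforms where size_t and pointers have the same width the two sizes
happen to agree, which is why such a mismatch can go unnoticed at runtime.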



linux-next: build warning after merge of the sound-asoc tree

2020-09-17 Thread Stephen Rothwell
Hi all,

After merging the sound-asoc tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

WARNING: modpost: missing MODULE_LICENSE() in sound/soc/sof/imx/imx-common.o

Introduced by commit

  18ebffe4d043 ("ASoC: SOF: imx: Add debug support for imx platforms")

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v3 1/2] dt-bindings: PCI: sprd: Document Unisoc PCIe RC host controller

2020-09-17 Thread Hongtao Wu
On Wed, Sep 16, 2020 at 1:25 AM Rob Herring  wrote:
>
> On Wed, Sep 09, 2020 at 05:48:31PM +0800, Hongtao Wu wrote:
> > From: Hongtao Wu 
> >
> > This series adds PCIe bindings for Unisoc SoCs.
> > This controller is based on DesignWare PCIe IP.
> >
> > Signed-off-by: Hongtao Wu 
> > ---
> >  .../devicetree/bindings/pci/sprd-pcie.yaml | 101 
> > +
> >  1 file changed, 101 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/pci/sprd-pcie.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/pci/sprd-pcie.yaml 
> > b/Documentation/devicetree/bindings/pci/sprd-pcie.yaml
> > new file mode 100644
> > index 000..c52edfb
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/pci/sprd-pcie.yaml
> > @@ -0,0 +1,101 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/pci/sprd-pcie.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: SoC PCIe Host Controller Device Tree Bindings
> > +
> > +maintainers:
> > +  - Hongtao Wu 
> > +
> > +allOf:
> > +  - $ref: /schemas/pci/pci-bus.yaml#
> > +
> > +properties:
> > +  compatible:
> > +items:
> > +  - const: sprd,pcie-rc
> > +
> > +  reg:
> > +minItems: 2
> > +items:
> > +  - description: Controller control and status registers.
> > +  - description: PCIe configuration registers.
> > +
> > +  reg-names:
> > +items:
> > +  - const: dbi
> > +  - const: config
> > +
> > +  ranges:
> > +maxItems: 2
> > +
> > +  num-lanes:
> > +maximum: 1
> > +description: Number of lanes to use for this port.
> > +
> > +  interrupts:
> > +minItems: 1
> > +description: Builtin MSI controller and PCIe host controller.
> > +
> > +  interrupt-names:
> > +items:
> > +  - const: msi
> > +
> > +  sprd-pcie-poweron-syscons:
>

I am sorry!
I'll fix it.

> Doesn't match the example.
>
> > +minItems: 1
> > +description: Global register.
> > +  The first value is the phandle to the global registers required to
> > +  confige PCIe phy, clock and so on.
> > +  The second value is the global register type which indicates whether 
> > it
> > +  is a set/clear register or not.
> > +  The third value is the time to delay after the global register is 
> > set or
> > +  cleared.
> > +  The fourth value is the global register address.
> > +  The fifth value is the the mask value that the global register must
> > +  be operate.
> > +  The sixth value is the value that will be set to the global register.
> > +  Note that Some Unisoc global registers have not been upstreamed.
> > +  The global register and its mask can't be found in linux kernel,
> > +  so we use an offset address and a number to instead them.
>
> From the example, it looks like you set/clear 2 bits for power on/off.
> What's the worst case you expect here? What do the 2 bits do? If they
> are for clocks, resets, or power domains, then we have bindings for
> those which should be used. This use of phandles to syscons should be
> avoided whenever possible.
>

There are two kinds of global registers (set/clear registers and
non-set/clear registers) related to PCIe on Unisoc SoCs.
Each set of set/clear registers contains two addresses. One can be
written and the other one can be read. Different bits in the set/clear
register indicate different functions, so we set/clear one bit for
power on/off.
The non-set/clear registers are normal registers which only have one
address.

The second value in property 'sprd,pcie-poweron-syscons' is a flag
which indicates whether the global register is a set/clear register or
not. If this value is 1, we treat it as a set/clear register.
If this value is 0, we treat it as a non-set/clear register.

I wanted to parse all of the global registers for power on/off in an
array (including set/clear registers and non-set/clear registers).
However, it may not be a good idea.
I'll split the property 'sprd,pcie-poweron-syscons' into clocks, power
domains, phy and so on in the next version.

> If we wanted a language for specifying sequences of register accesses in
> DT, we would have defined that a long time ago.
>

> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - reg-names
> > +  - num-lanes
> > +  - ranges
> > +  - interrupts
> > +  - interrupt-names
> > +
> > +examples:
> > +  - |
> > +#include 
> > +
> > +ipa {
> > +#address-cells = <2>;
> > +#size-cells = <2>;
> > +
> > +pcie0: pcie@2b10 {
> > +compatible = "sprd,pcie-rc";
> > +reg = <0x0 0x2b10 0x0 0x2000>,
> > +  <0x2 0x 0x0 0x2000>;
> > +reg-names = "dbi", "config";
> > +#address-cells = <3>;
> > +#size-cells = <2>;
> > +device_type = "pci";
> > +ranges = <0x0100 0x0 0x 0x2 0x2000 0x0 
> > 0x0001>,
> > + 

[PATCH -next] drm/amd/display: remove unused variable in dcn30_hwseq.c

2020-09-17 Thread Yang Yingliang
Fix the compile warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_hwseq.c:322:27: warning: 
variable ‘optc’ set but not used [-Wunused-but-set-variable]
  struct timing_generator *optc;
   ^~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_hwseq.c:641:7: warning: 
variable ‘is_dp’ set but not used [-Wunused-but-set-variable]
  bool is_dp;
   ^

Reported-by: Hulk Robot 
Signed-off-by: Yang Yingliang 
---
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index 204773ffc376..f875b1e98dd3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -319,13 +319,10 @@ void dcn30_enable_writeback(
 {
struct dwbc *dwb;
struct mcif_wb *mcif_wb;
-   struct timing_generator *optc;
 
dwb = dc->res_pool->dwbc[wb_info->dwb_pipe_inst];
mcif_wb = dc->res_pool->mcif_wb[wb_info->dwb_pipe_inst];
 
-   /* set the OPTC source mux */
-   optc = dc->res_pool->timing_generators[dwb->otg_inst];
DC_LOG_DWB("%s dwb_pipe_inst = %d, mpcc_inst = %d",\
__func__, wb_info->dwb_pipe_inst,\
wb_info->mpcc_inst);
@@ -638,7 +635,6 @@ void dcn30_set_avmute(struct pipe_ctx *pipe_ctx, bool 
enable)
 void dcn30_update_info_frame(struct pipe_ctx *pipe_ctx)
 {
bool is_hdmi_tmds;
-   bool is_dp;
 
ASSERT(pipe_ctx->stream);
 
@@ -646,7 +642,6 @@ void dcn30_update_info_frame(struct pipe_ctx *pipe_ctx)
return;  /* this is not root pipe */
 
is_hdmi_tmds = dc_is_hdmi_tmds_signal(pipe_ctx->stream->signal);
-   is_dp = dc_is_dp_signal(pipe_ctx->stream->signal);
 
if (!is_hdmi_tmds)
return;
-- 
2.25.1



[PATCH AUTOSEL 5.4 014/330] mm: fix double page fault on arm64 if PTE_AF is cleared

2020-09-17 Thread Sasha Levin
From: Jia He 

[ Upstream commit 83d116c53058d505ddef051e90ab27f57015b025 ]

When we tested pmdk unit test [1] vmmalloc_fork TEST3 on arm64 guest, there
will be a double page fault in __copy_from_user_inatomic of cow_user_page.

To reproduce the bug, the cmd is as follows after you deployed everything:
make -C src/test/vmmalloc_fork/ TEST_TIME=60m check

Below call trace is from arm64 do_page_fault for debugging purpose:
[  110.016195] Call trace:
[  110.016826]  do_page_fault+0x5a4/0x690
[  110.017812]  do_mem_abort+0x50/0xb0
[  110.018726]  el1_da+0x20/0xc4
[  110.019492]  __arch_copy_from_user+0x180/0x280
[  110.020646]  do_wp_page+0xb0/0x860
[  110.021517]  __handle_mm_fault+0x994/0x1338
[  110.022606]  handle_mm_fault+0xe8/0x180
[  110.023584]  do_page_fault+0x240/0x690
[  110.024535]  do_mem_abort+0x50/0xb0
[  110.025423]  el0_da+0x20/0x24

The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
[9b007000] pgd=00023d4f8003, pud=00023da9b003,
   pmd=00023d4b3003, pte=36298607bd3

As told by Catalin: "On arm64 without hardware Access Flag, copying from
user will fail because the pte is old and cannot be marked young. So we
always end up with zeroed page after fork() + CoW for pfn mappings. we
don't always have a hardware-managed access flag on arm64."

This patch fixes it by calling pte_mkyoung. Also, the parameter is
changed because vmf should be passed to cow_user_page()

Add a WARN_ON_ONCE when __copy_from_user_inatomic() returns error
in case there can be some obscure use-case (by Kirill).

[1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork

Signed-off-by: Jia He 
Reported-by: Yibo Cai 
Reviewed-by: Catalin Marinas 
Acked-by: Kirill A. Shutemov 
Signed-off-by: Catalin Marinas 
Signed-off-by: Sasha Levin 
---
 mm/memory.c | 104 
 1 file changed, 89 insertions(+), 15 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index cb7c940cf800c..9ea917e28ef4e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -118,6 +118,18 @@ int randomize_va_space __read_mostly =
2;
 #endif
 
+#ifndef arch_faults_on_old_pte
+static inline bool arch_faults_on_old_pte(void)
+{
+   /*
+* Those arches which don't have hw access flag feature need to
+* implement their own helper. By default, "true" means pagefault
+* will be hit on old pte.
+*/
+   return true;
+}
+#endif
+
 static int __init disable_randmaps(char *s)
 {
randomize_va_space = 0;
@@ -2145,32 +2157,82 @@ static inline int pte_unmap_same(struct mm_struct *mm, 
pmd_t *pmd,
return same;
 }
 
-static inline void cow_user_page(struct page *dst, struct page *src, unsigned 
long va, struct vm_area_struct *vma)
+static inline bool cow_user_page(struct page *dst, struct page *src,
+struct vm_fault *vmf)
 {
+   bool ret;
+   void *kaddr;
+   void __user *uaddr;
+   bool force_mkyoung;
+   struct vm_area_struct *vma = vmf->vma;
+   struct mm_struct *mm = vma->vm_mm;
+   unsigned long addr = vmf->address;
+
debug_dma_assert_idle(src);
 
+   if (likely(src)) {
+   copy_user_highpage(dst, src, addr, vma);
+   return true;
+   }
+
/*
 * If the source page was a PFN mapping, we don't have
 * a "struct page" for it. We do a best-effort copy by
 * just copying from the original user address. If that
 * fails, we just zero-fill it. Live with it.
 */
-   if (unlikely(!src)) {
-   void *kaddr = kmap_atomic(dst);
-   void __user *uaddr = (void __user *)(va & PAGE_MASK);
+   kaddr = kmap_atomic(dst);
+   uaddr = (void __user *)(addr & PAGE_MASK);
+
+   /*
+* On architectures with software "accessed" bits, we would
+* take a double page fault, so mark it accessed here.
+*/
+   force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
+   if (force_mkyoung) {
+   pte_t entry;
+
+   vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+   if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+   /*
+* Other thread has already handled the fault
+* and we don't need to do anything. If it's
+* not the case, the fault will be triggered
+* again on the same address.
+*/
+   ret = false;
+   goto pte_unlock;
+   }
 
+   entry = pte_mkyoung(vmf->orig_pte);
+   if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
+   update_mmu_cache(vma, addr, vmf->pte);
+   }
+
+   /*
+* This really shouldn't fail, because the page is there
+* in the page tables. But it might just 

[PATCH AUTOSEL 5.4 003/330] scsi: lpfc: Fix pt2pt discovery on SLI3 HBAs

2020-09-17 Thread Sasha Levin
From: James Smart 

[ Upstream commit 359e10f087dbb7b9c9f3035a8cc4391af45bd651 ]

After exchanging PLOGI on an SLI-3 adapter, the PRLI exchange failed.  Link
trace showed the port was assigned a non-zero n_port_id, but didn't use the
address on the PRLI. The assigned address is set on the port by the
CONFIG_LINK mailbox command. The driver responded to the PRLI before the
mailbox command completed. Thus the PRLI response used the old n_port_id.

Defer the PRLI response until CONFIG_LINK completes.

Link: https://lore.kernel.org/r/20190922035906.10977-2-jsmart2...@gmail.com
Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_nportdisc.c | 141 +++--
 1 file changed, 115 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c 
b/drivers/scsi/lpfc/lpfc_nportdisc.c
index 6961713825585..2a340624bfc99 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -279,6 +279,55 @@ lpfc_els_abort(struct lpfc_hba *phba, struct lpfc_nodelist 
*ndlp)
lpfc_cancel_retry_delay_tmo(phba->pport, ndlp);
 }
 
+/* lpfc_defer_pt2pt_acc - Complete SLI3 pt2pt processing on link up
+ * @phba: pointer to lpfc hba data structure.
+ * @link_mbox: pointer to CONFIG_LINK mailbox object
+ *
+ * This routine is only called if we are SLI3, direct connect pt2pt
+ * mode and the remote NPort issues the PLOGI after link up.
+ */
+void
+lpfc_defer_pt2pt_acc(struct lpfc_hba *phba, LPFC_MBOXQ_t *link_mbox)
+{
+   LPFC_MBOXQ_t *login_mbox;
+   MAILBOX_t *mb = &link_mbox->u.mb;
+   struct lpfc_iocbq *save_iocb;
+   struct lpfc_nodelist *ndlp;
+   int rc;
+
+   ndlp = link_mbox->ctx_ndlp;
+   login_mbox = link_mbox->context3;
+   save_iocb = login_mbox->context3;
+   link_mbox->context3 = NULL;
+   login_mbox->context3 = NULL;
+
+   /* Check for CONFIG_LINK error */
+   if (mb->mbxStatus) {
+   lpfc_printf_log(phba, KERN_ERR, LOG_DISCOVERY,
+   "4575 CONFIG_LINK fails pt2pt discovery: %x\n",
+   mb->mbxStatus);
+   mempool_free(login_mbox, phba->mbox_mem_pool);
+   mempool_free(link_mbox, phba->mbox_mem_pool);
+   lpfc_sli_release_iocbq(phba, save_iocb);
+   return;
+   }
+
+   /* Now that CONFIG_LINK completed, and our SID is configured,
+* we can now proceed with sending the PLOGI ACC.
+*/
+   rc = lpfc_els_rsp_acc(link_mbox->vport, ELS_CMD_PLOGI,
+ save_iocb, ndlp, login_mbox);
+   if (rc) {
+   lpfc_printf_log(phba, KERN_ERR, LOG_DISCOVERY,
+   "4576 PLOGI ACC fails pt2pt discovery: %x\n",
+   rc);
+   mempool_free(login_mbox, phba->mbox_mem_pool);
+   }
+
+   mempool_free(link_mbox, phba->mbox_mem_pool);
+   lpfc_sli_release_iocbq(phba, save_iocb);
+}
+
 static int
 lpfc_rcv_plogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
   struct lpfc_iocbq *cmdiocb)
@@ -291,10 +340,12 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
IOCB_t *icmd;
struct serv_parm *sp;
uint32_t ed_tov;
-   LPFC_MBOXQ_t *mbox;
+   LPFC_MBOXQ_t *link_mbox;
+   LPFC_MBOXQ_t *login_mbox;
+   struct lpfc_iocbq *save_iocb;
struct ls_rjt stat;
uint32_t vid, flag;
-   int rc;
+   int rc, defer_acc;
 
memset(&stat, 0, sizeof (struct ls_rjt));
pcmd = (struct lpfc_dmabuf *) cmdiocb->context2;
@@ -343,6 +394,7 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
else
ndlp->nlp_fcp_info |= CLASS3;
 
+   defer_acc = 0;
ndlp->nlp_class_sup = 0;
if (sp->cls1.classValid)
ndlp->nlp_class_sup |= FC_COS_CLASS1;
@@ -354,7 +406,6 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
ndlp->nlp_class_sup |= FC_COS_CLASS4;
ndlp->nlp_maxframe =
((sp->cmn.bbRcvSizeMsb & 0x0F) << 8) | sp->cmn.bbRcvSizeLsb;
-
/* if already logged in, do implicit logout */
switch (ndlp->nlp_state) {
case  NLP_STE_NPR_NODE:
@@ -396,6 +447,10 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
ndlp->nlp_fcp_info &= ~NLP_FCP_2_DEVICE;
ndlp->nlp_flag &= ~NLP_FIRSTBURST;
 
+   login_mbox = NULL;
+   link_mbox = NULL;
+   save_iocb = NULL;
+
/* Check for Nport to NPort pt2pt protocol */
if ((vport->fc_flag & FC_PT2PT) &&
!(vport->fc_flag & FC_PT2PT_PLOGI)) {
@@ -423,17 +478,22 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
if (phba->sli_rev == LPFC_SLI_REV4)
lpfc_issue_reg_vfi(vport);
else {
-

[PATCH AUTOSEL 5.4 005/330] selinux: allow labeling before policy is loaded

2020-09-17 Thread Sasha Levin
From: Jonathan Lebon 

[ Upstream commit 3e3e24b42043eceb97ed834102c2d094dfd7aaa6 ]

Currently, the SELinux LSM prevents one from setting the
`security.selinux` xattr on an inode without a policy first being
loaded. However, this restriction is problematic: it makes it impossible
to have newly created files with the correct label before actually
loading the policy.

This is relevant in distributions like Fedora, where the policy is
loaded by systemd shortly after pivoting out of the initrd. In such
instances, all files created prior to pivoting will be unlabeled. One
then has to relabel them after pivoting, an operation which inherently
races with other processes trying to access those same files.

Going further, there are use cases for creating the entire root
filesystem on first boot from the initrd (e.g. Container Linux supports
this today[1], and we'd like to support it in Fedora CoreOS as well[2]).
One can imagine doing this in two ways: at the block device level (e.g.
laying down a disk image), or at the filesystem level. In the former,
labeling can simply be part of the image. But even in the latter
scenario, one still really wants to be able to set the right labels when
populating the new filesystem.

This patch enables this by changing behaviour in the following two ways:
1. allow `setxattr` if we're not initialized
2. don't try to set the in-core inode SID if we're not initialized;
   instead leave it as `LABEL_INVALID` so that revalidation may be
   attempted at a later time

Note the first hunk of this patch is mostly the same as a previously
discussed one[3], though it was part of a larger series which wasn't
accepted.

[1] https://coreos.com/os/docs/latest/root-filesystem-placement.html
[2] https://github.com/coreos/fedora-coreos-tracker/issues/94
[3] https://www.spinics.net/lists/linux-initramfs/msg04593.html

Co-developed-by: Victor Kamensky 
Signed-off-by: Victor Kamensky 
Signed-off-by: Jonathan Lebon 
Signed-off-by: Paul Moore 
Signed-off-by: Sasha Levin 
---
 security/selinux/hooks.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 552e73d90fd25..212f48025db81 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3156,6 +3156,9 @@ static int selinux_inode_setxattr(struct dentry *dentry, 
const char *name,
return dentry_has_perm(current_cred(), dentry, FILE__SETATTR);
}
 
+   if (!selinux_state.initialized)
+   return (inode_owner_or_capable(inode) ? 0 : -EPERM);
+
sbsec = inode->i_sb->s_security;
if (!(sbsec->flags & SBLABEL_MNT))
return -EOPNOTSUPP;
@@ -3239,6 +3242,15 @@ static void selinux_inode_post_setxattr(struct dentry 
*dentry, const char *name,
return;
}
 
+   if (!selinux_state.initialized) {
+   /* If we haven't even been initialized, then we can't validate
+* against a policy, so leave the label as invalid. It may
+* resolve to a valid label on the next revalidation try if
+* we've since initialized.
+*/
+   return;
+   }
+
rc = security_context_to_sid_force(_state, value, size,
   );
if (rc) {
-- 
2.25.1



[PATCH AUTOSEL 5.4 006/330] media: mc-device.c: fix memleak in media_device_register_entity

2020-09-17 Thread Sasha Levin
From: zhengbin 

[ Upstream commit 713f871b30a66dc4daff4d17b760c9916aaaf2e1 ]

In media_device_register_entity, if media_graph_walk_init fails,
we need to free the previously allocated memory.

Reported-by: Hulk Robot 
Signed-off-by: zhengbin 
Signed-off-by: Sakari Ailus 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Sasha Levin 
---
 drivers/media/mc/mc-device.c | 65 ++--
 1 file changed, 33 insertions(+), 32 deletions(-)

diff --git a/drivers/media/mc/mc-device.c b/drivers/media/mc/mc-device.c
index e19df5165e78c..da80883511352 100644
--- a/drivers/media/mc/mc-device.c
+++ b/drivers/media/mc/mc-device.c
@@ -575,6 +575,38 @@ static void media_device_release(struct media_devnode 
*devnode)
dev_dbg(devnode->parent, "Media device released\n");
 }
 
+static void __media_device_unregister_entity(struct media_entity *entity)
+{
+   struct media_device *mdev = entity->graph_obj.mdev;
+   struct media_link *link, *tmp;
+   struct media_interface *intf;
+   unsigned int i;
+
+   ida_free(&mdev->entity_internal_idx, entity->internal_idx);
+
+   /* Remove all interface links pointing to this entity */
+   list_for_each_entry(intf, &mdev->interfaces, graph_obj.list) {
+   list_for_each_entry_safe(link, tmp, &intf->links, list) {
+   if (link->entity == entity)
+   __media_remove_intf_link(link);
+   }
+   }
+
+   /* Remove all data links that belong to this entity */
+   __media_entity_remove_links(entity);
+
+   /* Remove all pads that belong to this entity */
+   for (i = 0; i < entity->num_pads; i++)
+   media_gobj_destroy(&entity->pads[i].graph_obj);
+
+   /* Remove the entity */
+   media_gobj_destroy(&entity->graph_obj);
+
+   /* invoke entity_notify callbacks to handle entity removal?? */
+
+   entity->graph_obj.mdev = NULL;
+}
+
 /**
  * media_device_register_entity - Register an entity with a media device
  * @mdev:  The media device
@@ -632,6 +664,7 @@ int __must_check media_device_register_entity(struct 
media_device *mdev,
 */
ret = media_graph_walk_init(, mdev);
if (ret) {
+   __media_device_unregister_entity(entity);
mutex_unlock(&mdev->graph_mutex);
return ret;
}
@@ -644,38 +677,6 @@ int __must_check media_device_register_entity(struct 
media_device *mdev,
 }
 EXPORT_SYMBOL_GPL(media_device_register_entity);
 
-static void __media_device_unregister_entity(struct media_entity *entity)
-{
-   struct media_device *mdev = entity->graph_obj.mdev;
-   struct media_link *link, *tmp;
-   struct media_interface *intf;
-   unsigned int i;
-
-   ida_free(&mdev->entity_internal_idx, entity->internal_idx);
-
-   /* Remove all interface links pointing to this entity */
-   list_for_each_entry(intf, &mdev->interfaces, graph_obj.list) {
-   list_for_each_entry_safe(link, tmp, &intf->links, list) {
-   if (link->entity == entity)
-   __media_remove_intf_link(link);
-   }
-   }
-
-   /* Remove all data links that belong to this entity */
-   __media_entity_remove_links(entity);
-
-   /* Remove all pads that belong to this entity */
-   for (i = 0; i < entity->num_pads; i++)
-   media_gobj_destroy(&entity->pads[i].graph_obj);
-
-   /* Remove the entity */
-   media_gobj_destroy(&entity->graph_obj);
-
-   /* invoke entity_notify callbacks to handle entity removal?? */
-
-   entity->graph_obj.mdev = NULL;
-}
-
 void media_device_unregister_entity(struct media_entity *entity)
 {
struct media_device *mdev = entity->graph_obj.mdev;
-- 
2.25.1
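
For readers following the mc-device.c change above: the fix is the usual
"undo on the error path" pattern, and the helper has to be defined before
the function that now calls it, which is why the patch moves it. A minimal
user-space sketch of that shape follows. The names (struct entity,
register_entity, unregister_entity, the walk_init_fails flag) are
hypothetical and only illustrate the pattern; this is not the media
controller API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object that owns resources once registered. */
struct entity {
	int *pads;
	int registered;
};

/* Defined first so the registration error path below can call it. */
static void unregister_entity(struct entity *e)
{
	free(e->pads);
	e->pads = NULL;
	e->registered = 0;
}

/*
 * Register an entity. walk_init_fails simulates a late initialisation
 * step failing after resources have already been taken.
 */
static int register_entity(struct entity *e, int walk_init_fails)
{
	e->pads = calloc(4, sizeof(*e->pads));
	if (!e->pads)
		return -1;
	e->registered = 1;

	if (walk_init_fails) {
		/* Without this call the pads allocation would leak. */
		unregister_entity(e);
		return -1;
	}
	return 0;
}
```

After a failed register_entity() the structure is back in its
unregistered state, so the caller has nothing left to release.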



[PATCH AUTOSEL 5.4 012/330] ath10k: fix memory leak for tpc_stats_final

2020-09-17 Thread Sasha Levin
From: Miaoqing Pan 

[ Upstream commit 486a8849843455298d49e694cca9968336ce2327 ]

The memory of ar->debug.tpc_stats_final is reallocated every debugfs
reading, it should be freed in ath10k_debug_destroy() for the last
allocation.

Tested HW: QCA9984
Tested FW: 10.4-3.9.0.2-00035

Signed-off-by: Miaoqing Pan 
Signed-off-by: Kalle Valo 
Signed-off-by: Sasha Levin 
---
 drivers/net/wireless/ath/ath10k/debug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath10k/debug.c 
b/drivers/net/wireless/ath/ath10k/debug.c
index 40baf25ac99f3..04c50a26a4f47 100644
--- a/drivers/net/wireless/ath/ath10k/debug.c
+++ b/drivers/net/wireless/ath/ath10k/debug.c
@@ -2532,6 +2532,7 @@ void ath10k_debug_destroy(struct ath10k *ar)
ath10k_debug_fw_stats_reset(ar);
 
kfree(ar->debug.tpc_stats);
+   kfree(ar->debug.tpc_stats_final);
 }
 
 int ath10k_debug_register(struct ath10k *ar)
-- 
2.25.1



[PATCH AUTOSEL 5.4 013/330] PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes

2020-09-17 Thread Sasha Levin
From: Pierre Crégut 

[ Upstream commit 35ff867b76576e32f34c698ccd11343f7d616204 ]

When sriov_numvfs is being updated, we call the driver->sriov_configure()
function, which may enable VFs and call probe functions, which may make new
devices visible.  This all happens before sriov_numvfs_store()
updates sriov->num_VFs, so previously, concurrent sysfs reads of
sriov_numvfs returned stale values.

Serialize the sysfs read vs the write so the read returns the correct
num_VFs value.

[bhelgaas: hold device_lock instead of checking mutex_is_locked()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202991
Link: https://lore.kernel.org/r/20190911072736.32091-1-pierre.cre...@orange.com
Signed-off-by: Pierre Crégut 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin 
---
 drivers/pci/iov.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index deec9f9e0b616..9c116cbaa95d8 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -253,8 +253,14 @@ static ssize_t sriov_numvfs_show(struct device *dev,
 char *buf)
 {
struct pci_dev *pdev = to_pci_dev(dev);
+   u16 num_vfs;
+
+   /* Serialize vs sriov_numvfs_store() so readers see valid num_VFs */
+   device_lock(&pdev->dev);
+   num_vfs = pdev->sriov->num_VFs;
+   device_unlock(&pdev->dev);
 
-   return sprintf(buf, "%u\n", pdev->sriov->num_VFs);
+   return sprintf(buf, "%u\n", num_vfs);
 }
 
 /*
-- 
2.25.1



[PATCH AUTOSEL 5.4 024/330] ata: sata_mv, avoid trigerrable BUG_ON

2020-09-17 Thread Sasha Levin
From: Jiri Slaby 

[ Upstream commit e9f691d899188679746eeb96e6cb520459eda9b4 ]

There are several reports that the BUG_ON on unsupported command in
mv_qc_prep can be triggered under some circumstances:
https://bugzilla.suse.com/show_bug.cgi?id=1110252
https://serverfault.com/questions/97/raid-problems-after-power-outage
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652185
https://bugs.centos.org/view.php?id=14998

Let sata_mv handle the failure gracefully: warn about that incl. the
failed command number and return an AC_ERR_INVALID error. We can do that
now thanks to the previous patch.

Remove also the long-standing FIXME.

[v2] use %.2x as commands are defined as hexa.

Signed-off-by: Jiri Slaby 
Cc: Jens Axboe 
Cc: linux-...@vger.kernel.org
Cc: Sergei Shtylyov 
Signed-off-by: Jens Axboe 
Signed-off-by: Sasha Levin 
---
 drivers/ata/sata_mv.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c
index bde695a320973..0229b618d0eee 100644
--- a/drivers/ata/sata_mv.c
+++ b/drivers/ata/sata_mv.c
@@ -2098,12 +2098,10 @@ static void mv_qc_prep(struct ata_queued_cmd *qc)
 * non-NCQ mode are: [RW] STREAM DMA and W DMA FUA EXT, none
 * of which are defined/used by Linux.  If we get here, this
 * driver needs work.
-*
-* FIXME: modify libata to give qc_prep a return value and
-* return error here.
 */
-   BUG_ON(tf->command);
-   break;
+   ata_port_err(ap, "%s: unsupported command: %.2x\n", __func__,
+   tf->command);
+   return AC_ERR_INVALID;
}
mv_crqb_pack_cmd(cw++, tf->nsect, ATA_REG_NSECT, 0);
mv_crqb_pack_cmd(cw++, tf->hob_lbal, ATA_REG_LBAL, 0);
-- 
2.25.1
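
The sata_mv change above replaces a BUG_ON() with a logged error and an
error return. A minimal user-space sketch of the same idea is below. All
names and values in it (prep_command, PREP_ERR_INVALID, the two opcodes)
are hypothetical and chosen only for illustration; they are not the libata
definitions.

```c
#include <stdio.h>

/* Hypothetical status codes; not the libata AC_ERR_* values. */
enum prep_status { PREP_OK = 0, PREP_ERR_INVALID = 1 };

/* Hypothetical opcodes, used only to have something to switch on. */
enum { CMD_READ = 0x25, CMD_WRITE = 0x35 };

static enum prep_status prep_command(unsigned int command)
{
	switch (command) {
	case CMD_READ:
	case CMD_WRITE:
		return PREP_OK;
	default:
		/* Log and fail this one request instead of halting everything. */
		fprintf(stderr, "%s: unsupported command: %.2x\n",
			__func__, command);
		return PREP_ERR_INVALID;
	}
}
```

The caller decides what to do with the failed request; nothing else in
the system is affected.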



[PATCH AUTOSEL 5.4 021/330] media: smiapp: Fix error handling at NVM reading

2020-09-17 Thread Sasha Levin
From: Sakari Ailus 

[ Upstream commit a5b1d5413534607b05fb34470ff62bf395f5c8d0 ]

If NVM reading failed, the device was left powered on. Fix that.

Signed-off-by: Sakari Ailus 
Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Sasha Levin 
---
 drivers/media/i2c/smiapp/smiapp-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/media/i2c/smiapp/smiapp-core.c 
b/drivers/media/i2c/smiapp/smiapp-core.c
index 42805dfbffeb9..06edbe8749c64 100644
--- a/drivers/media/i2c/smiapp/smiapp-core.c
+++ b/drivers/media/i2c/smiapp/smiapp-core.c
@@ -2327,11 +2327,12 @@ smiapp_sysfs_nvm_read(struct device *dev, struct 
device_attribute *attr,
if (rval < 0) {
if (rval != -EBUSY && rval != -EAGAIN)
pm_runtime_set_active(>dev);
-   pm_runtime_put(>dev);
+   pm_runtime_put_noidle(>dev);
return -ENODEV;
}
 
if (smiapp_read_nvm(sensor, sensor->nvm)) {
+   pm_runtime_put(>dev);
dev_err(>dev, "nvm read failed\n");
return -ENODEV;
}
-- 
2.25.1



[PATCH AUTOSEL 5.4 016/330] m68k: q40: Fix info-leak in rtc_ioctl

2020-09-17 Thread Sasha Levin
From: Fuqian Huang 

[ Upstream commit 7cf78b6b12fd5550545e4b73b35dca18bd46b44c ]

When the option is RTC_PLL_GET, pll will be copied to userland
via copy_to_user. pll is initialized using mach_get_rtc_pll indirect
call and mach_get_rtc_pll is only assigned with function
q40_get_rtc_pll in arch/m68k/q40/config.c.
In function q40_get_rtc_pll, the field pll_ctrl is not initialized.
This will leak uninitialized stack content to userland.
Fix this by zeroing the uninitialized field.

Signed-off-by: Fuqian Huang 
Link: https://lore.kernel.org/r/20190927121544.7650-1-huangfq.dax...@gmail.com
Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Sasha Levin 
---
 arch/m68k/q40/config.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/m68k/q40/config.c b/arch/m68k/q40/config.c
index e63eb5f069995..f31890078197e 100644
--- a/arch/m68k/q40/config.c
+++ b/arch/m68k/q40/config.c
@@ -264,6 +264,7 @@ static int q40_get_rtc_pll(struct rtc_pll_info *pll)
 {
int tmp = Q40_RTC_CTRL;
 
+   pll->pll_ctrl = 0;
pll->pll_value = tmp & Q40_RTC_PLL_MASK;
if (tmp & Q40_RTC_PLL_SIGN)
pll->pll_value = -pll->pll_value;
-- 
2.25.1
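
To illustrate the info-leak described above: any field of an on-stack
structure that the filling function does not write keeps whatever bytes
were on the stack, and those bytes travel along when the whole structure
is copied out. The sketch below uses a hypothetical two-field structure
and mask (struct pll_info, get_pll, 0x1f), not the real rtc_pll_info
layout. Note that it zeroes the whole structure with memset(), which is a
broader variant of the patch's single-field assignment.

```c
#include <string.h>

/* Hypothetical stand-in for a structure later copied to user space. */
struct pll_info {
	int pll_ctrl;
	int pll_value;
};

/* Fill the structure so that no stale bytes survive in any field. */
static void get_pll(struct pll_info *pll, int raw)
{
	memset(pll, 0, sizeof(*pll));	/* covers fields not set below */
	pll->pll_value = raw & 0x1f;	/* illustrative mask only */
}
```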



[PATCH AUTOSEL 5.4 023/330] xfs: properly serialise fallocate against AIO+DIO

2020-09-17 Thread Sasha Levin
From: Dave Chinner 

[ Upstream commit 249bd9087a5264d2b8a974081870e2e27671b4dc ]

AIO+DIO can extend the file size on IO completion, and it holds
no inode locks while the IO is in flight. Therefore, a race
condition exists in file size updates if we do something like this:

aio-thread                      fallocate-thread

lock inode
submit IO beyond inode->i_size
unlock inode
.
                                lock inode
                                break layouts
                                if (off + len > inode->i_size)
                                        new_size = off + len
                                .
                                inode_dio_wait()

.
completes
inode->i_size updated
inode_dio_done()

                                if (new_size)
                                        xfs_vn_setattr(inode, new_size)

Yup, that attempt to extend the file size in the fallocate code
turns into a truncate - it removes whatever the aio write
allocated and put to disk, and reduces the inode size back down to
where the fallocate operation ends.

Fundamentally, xfs_file_fallocate() is not compatible with racing
AIO+DIO completions, so we need to move the inode_dio_wait() call
up to where we lock the inode and break the layouts.

Secondly, storing the inode size and then using it unchecked without
holding the ILOCK is not safe; we can only do such a thing if we've
locked out and drained all IO and other modification operations,
which we don't do initially in xfs_file_fallocate.

It should be noted that some of the fallocate operations are
compound operations - they are made up of multiple manipulations
that may zero data, and so we may need to flush and invalidate the
file multiple times during an operation. However, we only need to
lock out IO and other space manipulation operations once, as that
lockout is maintained until the entire fallocate operation has been
completed.

Signed-off-by: Dave Chinner 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Brian Foster 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
Signed-off-by: Sasha Levin 
---
 fs/xfs/xfs_bmap_util.c |  8 +---
 fs/xfs/xfs_file.c  | 30 ++
 fs/xfs/xfs_ioctl.c |  1 +
 3 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0c71acc1b8317..d6d78e1276254 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1039,6 +1039,7 @@ out_trans_cancel:
goto out_unlock;
 }
 
+/* Caller must first wait for the completion of any pending DIOs if required. 
*/
 int
 xfs_flush_unmap_range(
struct xfs_inode*ip,
@@ -1050,9 +1051,6 @@ xfs_flush_unmap_range(
xfs_off_t   rounding, start, end;
int error;
 
-   /* wait for the completion of any pending DIOs */
-   inode_dio_wait(inode);
-
rounding = max_t(xfs_off_t, 1 << mp->m_sb.sb_blocklog, PAGE_SIZE);
start = round_down(offset, rounding);
end = round_up(offset + len, rounding) - 1;
@@ -1084,10 +1082,6 @@ xfs_free_file_space(
if (len <= 0)   /* if nothing being freed */
return 0;
 
-   error = xfs_flush_unmap_range(ip, offset, len);
-   if (error)
-   return error;
-
startoffset_fsb = XFS_B_TO_FSB(mp, offset);
endoffset_fsb = XFS_B_TO_FSBT(mp, offset + len);
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 1e2176190c86f..203065a647652 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -818,6 +818,36 @@ xfs_file_fallocate(
if (error)
goto out_unlock;
 
+   /*
+* Must wait for all AIO to complete before we continue as AIO can
+* change the file size on completion without holding any locks we
+* currently hold. We must do this first because AIO can update both
+* the on disk and in memory inode sizes, and the operations that follow
+* require the in-memory size to be fully up-to-date.
+*/
+   inode_dio_wait(inode);
+
+   /*
+* Now AIO and DIO has drained we flush and (if necessary) invalidate
+* the cached range over the first operation we are about to run.
+*
+* We care about zero and collapse here because they both run a hole
+* punch over the range first. Because that can zero data, and the range
+* of invalidation for the shift operations is much larger, we still do
+* the required flush for collapse in xfs_prepare_shift().
+*
+* Insert has the same range requirements as collapse, and we extend the
+* file first which can zero data. Hence insert has the same
+* flush/invalidate requirements as collapse and so they are both
+* handled at the right time by xfs_prepare_shift().
+*/
+   if (mode & 

[PATCH AUTOSEL 5.4 015/330] scsi: aacraid: fix illegal IO beyond last LBA

2020-09-17 Thread Sasha Levin
From: Balsundar P 

[ Upstream commit c86fbe484c10b2cd1e770770db2d6b2c88801c1d ]

The driver fails to handle reads or writes beyond the device-reported
last LBA, which triggers a kernel panic.

Link: https://lore.kernel.org/r/1571120524-6037-2-git-send-email-balsunda...@microsemi.com
Signed-off-by: Balsundar P 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/aacraid/aachba.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index 0ed3f806ace54..2388143d59f5d 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -2467,13 +2467,13 @@ static int aac_read(struct scsi_cmnd * scsicmd)
scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 |
SAM_STAT_CHECK_CONDITION;
set_sense(&dev->fsa_dev[cid].sense_data,
- HARDWARE_ERROR, SENCODE_INTERNAL_TARGET_FAILURE,
+ ILLEGAL_REQUEST, SENCODE_LBA_OUT_OF_RANGE,
  ASENCODE_INTERNAL_TARGET_FAILURE, 0, 0);
memcpy(scsicmd->sense_buffer, &dev->fsa_dev[cid].sense_data,
   min_t(size_t, sizeof(dev->fsa_dev[cid].sense_data),
 SCSI_SENSE_BUFFERSIZE));
scsicmd->scsi_done(scsicmd);
-   return 1;
+   return 0;
}
 
dprintk((KERN_DEBUG "aac_read[cpu %d]: lba = %llu, t = %ld.\n",
@@ -2559,13 +2559,13 @@ static int aac_write(struct scsi_cmnd * scsicmd)
scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 |
SAM_STAT_CHECK_CONDITION;
set_sense(&dev->fsa_dev[cid].sense_data,
- HARDWARE_ERROR, SENCODE_INTERNAL_TARGET_FAILURE,
+ ILLEGAL_REQUEST, SENCODE_LBA_OUT_OF_RANGE,
  ASENCODE_INTERNAL_TARGET_FAILURE, 0, 0);
memcpy(scsicmd->sense_buffer, &dev->fsa_dev[cid].sense_data,
   min_t(size_t, sizeof(dev->fsa_dev[cid].sense_data),
 SCSI_SENSE_BUFFERSIZE));
scsicmd->scsi_done(scsicmd);
-   return 1;
+   return 0;
}
 
dprintk((KERN_DEBUG "aac_write[cpu %d]: lba = %llu, t = %ld.\n",
-- 
2.25.1
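
The hunks above only show how the out-of-range case is reported
(ILLEGAL_REQUEST with an LBA-out-of-range sense code, and a return of 0
once the command has been completed). The range test itself is outside
the quoted context, so the following is only a sketch of how such a check
can be written without overflowing lba + count. The function name
io_in_range and its parameters are assumptions for illustration, not the
driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the block range [lba, lba + count) fits inside a
 * device that has `capacity` blocks. The comparison is arranged so that
 * lba + count is never computed and therefore cannot wrap around.
 */
static bool io_in_range(uint64_t lba, uint32_t count, uint64_t capacity)
{
	return lba <= capacity && count <= capacity - lba;
}
```

A request that fails this test would be completed with a check condition
rather than being passed on to the hardware.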



[PATCH AUTOSEL 5.4 027/330] PM / devfreq: tegra30: Fix integer overflow on CPU's freq max out

2020-09-17 Thread Sasha Levin
From: Dmitry Osipenko 

[ Upstream commit 53b4b2aeee26f42cde5ff2a16dd0d8590c51a55a ]

There is another kHz-conversion bug in the code, resulting in integer
overflow. Although, this time the resulting value is 4294966296 and it's
close to ULONG_MAX, which is okay in this case.

Reviewed-by: Chanwoo Choi 
Tested-by: Peter Geis 
Signed-off-by: Dmitry Osipenko 
Signed-off-by: Chanwoo Choi 
Signed-off-by: Sasha Levin 
---
 drivers/devfreq/tegra30-devfreq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/devfreq/tegra30-devfreq.c 
b/drivers/devfreq/tegra30-devfreq.c
index a6ba75f4106d8..e273011c83fbd 100644
--- a/drivers/devfreq/tegra30-devfreq.c
+++ b/drivers/devfreq/tegra30-devfreq.c
@@ -68,6 +68,8 @@
 
 #define KHZ 1000
 
+#define KHZ_MAX (ULONG_MAX / KHZ)
+
 /* Assume that the bus is saturated if the utilization is 25% */
 #define BUS_SATURATION_RATIO   25
 
@@ -169,7 +171,7 @@ struct tegra_actmon_emc_ratio {
 };
 
 static struct tegra_actmon_emc_ratio actmon_emc_ratios[] = {
-   { 140, ULONG_MAX },
+   { 140,KHZ_MAX },
{ 120,75 },
{ 110,60 },
{ 100,50 },
-- 
2.25.1
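
The 4294966296 figure in the commit message above can be reproduced in
user space. The sketch below uses uint32_t as a stand-in for a 32-bit
unsigned long, which is an assumption about the target; khz_to_hz is a
hypothetical helper, not a function from the driver.

```c
#include <stdint.h>

#define KHZ 1000u

/* 32-bit stand-in for an unsigned long kHz-to-Hz conversion. */
static uint32_t khz_to_hz(uint32_t khz)
{
	/* Unsigned multiplication wraps modulo 2^32 on overflow. */
	return (uint32_t)(khz * KHZ);
}
```

khz_to_hz(UINT32_MAX) wraps to 4294966296 (2^32 - 1000), while
khz_to_hz(UINT32_MAX / KHZ) gives 4294967000 with no wrap-around, which
is the reason for capping the table entry at ULONG_MAX / KHZ.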



[PATCH AUTOSEL 5.4 025/330] leds: mlxreg: Fix possible buffer overflow

2020-09-17 Thread Sasha Levin
From: Oleh Kravchenko 

[ Upstream commit 7c6082b903ac28dc3f383fba57c6f9e7e2594178 ]

Error was detected by PVS-Studio:
V512 A call of the 'sprintf' function will lead to overflow of
the buffer 'led_data->led_cdev_name'.

Acked-by: Jacek Anaszewski 
Acked-by: Pavel Machek 
Signed-off-by: Oleh Kravchenko 
Signed-off-by: Pavel Machek 
Signed-off-by: Sasha Levin 
---
 drivers/leds/leds-mlxreg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/leds/leds-mlxreg.c b/drivers/leds/leds-mlxreg.c
index cabe379071a7c..82aea1cd0c125 100644
--- a/drivers/leds/leds-mlxreg.c
+++ b/drivers/leds/leds-mlxreg.c
@@ -228,8 +228,8 @@ static int mlxreg_led_config(struct mlxreg_led_priv_data 
*priv)
brightness = LED_OFF;
led_data->base_color = MLXREG_LED_GREEN_SOLID;
}
-   sprintf(led_data->led_cdev_name, "%s:%s", "mlxreg",
-   data->label);
+   snprintf(led_data->led_cdev_name, 
sizeof(led_data->led_cdev_name),
+"mlxreg:%s", data->label);
led_cdev->name = led_data->led_cdev_name;
led_cdev->brightness = brightness;
led_cdev->max_brightness = LED_ON;
-- 
2.25.1
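
As a user-space illustration of the sprintf() to snprintf() change above:
snprintf() never writes more than the given size (including the
terminating NUL) and returns the length the full string would have had,
so truncation can be detected. The helper name format_led_name and the
buffer size used in the example are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "mlxreg:<label>" into buf without writing past size bytes.
 * Returns what snprintf() returns: the length of the untruncated name.
 */
static int format_led_name(char *buf, size_t size, const char *label)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "mlxreg:%s", label);
}
```

With an 8-byte buffer and the label "status:green", the buffer ends up
holding "mlxreg:" and the return value is 19, which is larger than the
buffer and so signals that the name was cut short.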



Re: [PATCH] kprobes: Do not disarm disabled ftrace kprobe

2020-09-17 Thread Masami Hiramatsu
Hi Steve,

Ah, this seems to fix the same issue as the patch I sent:

https://lkml.kernel.org/r/159888672694.1411785.5987998076694782591.stgit@devnote2

Could you confirm it?

Thank you,

On Thu, 17 Sep 2020 19:17:54 -0400
Steven Rostedt  wrote:

> From: Steven Rostedt (VMware) 
> 
> Only disable a ftrace probe if it is enabled, otherwise:
> 
> The following triggers a warning:
> 
>   # modprobe trace_printk
>   # echo "p:kprobes1/event1 trace_printk:trace_printk_irq_work" > 
> /sys/kernel/tracing/kprobe_events
>   # rmmod trace_printk
> 
>  [ cut here ]
>  Failed to disarm kprobe-ftrace at trace_printk_irq_work+0x0/0x76 
> [trace_printk] (-2)
>  WARNING: CPU: 5 PID: 4852 at kernel/kprobes.c:1100 
> __disarm_kprobe_ftrace.isra.0+0x78/0xa0
>  Modules linked in: trace_printk(-) [..] [last unloaded: trace_printk]
>  CPU: 5 PID: 4852 Comm: rmmod Tainted: GW 5.9.0-rc4-test+ #506
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 
> 07/14/2016
>  RIP: 0010:__disarm_kprobe_ftrace.isra.0+0x78/0xa0
>  Code: 8b 03 eb cb 80 3d 5d 95 58 01 00 75 de 48 8b 75 00 89 c2 89 44 24 04 
> 48 c7 c7 38 e3 33 8b c6 05 43 95 58 01 01 e8 c8 1d ef ff <0f> 0b 8b 44 24 04 
> eb b9 89 c6 48 c7 c7 08 e3 33 8b 89 44 24 04 e8
>  RSP: 0018:971ce04b7e38 EFLAGS: 00010282
>  RAX:  RBX: 8c900b30 RCX: 
>  RDX: 0001 RSI: 8a16c5af RDI: 8a16c5af
>  RBP: 971cf2722440 R08: 0001 R09: 0001
>  R10:  R11: 0046 R12: 8b7b33a0
>  R13: 8c901eb0 R14:  R15: 
>  FS:  7f4fe349b740() GS:971d5ab4() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 55d07b0148b8 CR3: b76cc006 CR4: 001706e0
>  Call Trace:
>   kprobes_module_callback+0x1b3/0x3c0
>   notifier_call_chain+0x47/0x70
>   blocking_notifier_call_chain+0x43/0x60
>   __x64_sys_delete_module+0x161/0x2a0
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  RIP: 0033:0x7f4fe35cb00b
>  Code: 73 01 c3 48 8b 0d 7d fe 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 
> 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 
> 73 01 c3 48 8b 0d 4d fe 0b 00 f7 d8 64 89 01 48
>  RSP: 002b:7ffe820f9888 EFLAGS: 0206 ORIG_RAX: 00b0
>  RAX: ffda RBX: 55d07b00a800 RCX: 7f4fe35cb00b
>  RDX: 000a RSI: 0800 RDI: 55d07b00a868
>  RBP: 7ffe820f98e8 R08:  R09: 
>  R10: 7f4fe363eac0 R11: 0206 R12: 7ffe820f9ab0
>  R13: 7ffe820fb223 R14: 55d07b00a2a0 R15: 55d07b00a800
>  irq event stamp: 7463
>  hardirqs last  enabled at (7489): [] 
> __tick_nohz_task_switch+0xad/0xc0
>  hardirqs last disabled at (7510): [] 
> __tick_nohz_task_switch+0xb4/0xc0
>  softirqs last  enabled at (7530): [] 
> __do_softirq+0x3b4/0x501
>  softirqs last disabled at (7545): [] 
> asm_call_on_stack+0x12/0x20
>  ---[ end trace 71f3303cdebb63e3 ]---
> 
> As well as the following two ftrace selftests fail:
> 
>  test.d/kprobe/kprobe_module.tc
>  test.d/kprobe/kretprobe_args.tc
> 
> This is because we are trying to remove a probe that is not enabled or
> registered with ftrace, but exists in the kprobe tables.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 0cb2f1372baa ("kprobes: Fix NULL pointer dereference at 
> kprobe_ftrace_handler")
> Signed-off-by: Steven Rostedt (VMware) 
> ---
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 287b263c9cb9..7557883771f9 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1088,6 +1088,9 @@ static int __disarm_kprobe_ftrace(struct kprobe *p, 
> struct ftrace_ops *ops,
>  {
>   int ret = 0;
>  
> + if (kprobe_disabled(p))
> + return ret;
> +
>   if (*cnt == 1) {
>   ret = unregister_ftrace_function(ops);
>   if (WARN(ret < 0, "Failed to unregister kprobe-ftrace (%d)\n", 
> ret))


-- 
Masami Hiramatsu 


[PATCH AUTOSEL 5.4 034/330] f2fs: avoid kernel panic on corruption test

2020-09-17 Thread Sasha Levin
From: Jaegeuk Kim 

[ Upstream commit bc005a4d5347da68e690f78d365d8927c87dc85a ]

xfstests/generic/475 triggers a kernel warning/panic while testing a corrupted disk.

Reviewed-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
Signed-off-by: Sasha Levin 
---
 fs/f2fs/node.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index daeac4268c1ab..e6f1b1d0c3b68 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2350,7 +2350,6 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info 
*sbi,
 
if (ret) {
up_read(_i->nat_tree_lock);
-   f2fs_bug_on(sbi, !mount);
f2fs_err(sbi, "NAT is corrupt, run fsck to fix 
it");
return ret;
}
-- 
2.25.1



[PATCH AUTOSEL 5.4 026/330] dm table: do not allow request-based DM to stack on partitions

2020-09-17 Thread Sasha Levin
From: Mike Snitzer 

[ Upstream commit 6ba01df72b4b63a26b4977790f58d8f775d2992c ]

Partitioned request-based devices cannot be used as underlying devices
for request-based DM because no partition offsets are added to each
incoming request.  As such, until now, stacking on partitioned devices
would _always_ result in data corruption (e.g. wiping the partition
table, writing to other partitions, etc).  Fix this by disallowing
request-based stacking on partitions.

While at it, since all .request_fn support has been removed from block
core, remove legacy dm-table code that differentiated between blk-mq and
.request_fn request-based.

Signed-off-by: Mike Snitzer 
Signed-off-by: Sasha Levin 
---
 drivers/md/dm-table.c | 27 ---
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 52e049554f5cd..2ae0c19137667 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -918,21 +918,15 @@ bool dm_table_supports_dax(struct dm_table *t,
 
 static bool dm_table_does_not_support_partial_completion(struct dm_table *t);
 
-struct verify_rq_based_data {
-   unsigned sq_count;
-   unsigned mq_count;
-};
-
-static int device_is_rq_based(struct dm_target *ti, struct dm_dev *dev,
- sector_t start, sector_t len, void *data)
+static int device_is_rq_stackable(struct dm_target *ti, struct dm_dev *dev,
+ sector_t start, sector_t len, void *data)
 {
-   struct request_queue *q = bdev_get_queue(dev->bdev);
-   struct verify_rq_based_data *v = data;
+   struct block_device *bdev = dev->bdev;
+   struct request_queue *q = bdev_get_queue(bdev);
 
-   if (queue_is_mq(q))
-   v->mq_count++;
-   else
-   v->sq_count++;
+   /* request-based cannot stack on partitions! */
+   if (bdev != bdev->bd_contains)
+   return false;
 
return queue_is_mq(q);
 }
@@ -941,7 +935,6 @@ static int dm_table_determine_type(struct dm_table *t)
 {
unsigned i;
unsigned bio_based = 0, request_based = 0, hybrid = 0;
-   struct verify_rq_based_data v = {.sq_count = 0, .mq_count = 0};
struct dm_target *tgt;
struct list_head *devices = dm_table_get_devices(t);
enum dm_queue_mode live_md_type = dm_get_md_type(t->md);
@@ -1045,14 +1038,10 @@ verify_rq_based:
 
/* Non-request-stackable devices can't be used for request-based dm */
if (!tgt->type->iterate_devices ||
-   !tgt->type->iterate_devices(tgt, device_is_rq_based, &v)) {
+   !tgt->type->iterate_devices(tgt, device_is_rq_stackable, NULL)) {
DMERR("table load rejected: including non-request-stackable 
devices");
return -EINVAL;
}
-   if (v.sq_count > 0) {
-   DMERR("table load rejected: not all devices are blk-mq 
request-stackable");
-   return -EINVAL;
-   }
 
return 0;
 }
-- 
2.25.1



[PATCH AUTOSEL 5.4 029/330] scsi: lpfc: Fix kernel crash at lpfc_nvme_info_show during remote port bounce

2020-09-17 Thread Sasha Levin
From: James Smart 

[ Upstream commit 6c1e803eac846f886cd35131e6516fc51a8414b9 ]

When reading sysfs nvme_info file while a remote port leaves and comes
back, a NULL pointer is encountered. The issue is due to ndlp list
corruption, as nvme_info_show does not use the same lock as the rest
of the code.

Correct this by removing the rcu_xxx_lock calls and replacing them with the
host_lock and phba->hbalock spinlocks that are used by the rest of the
driver.  Given we're called from sysfs, we are safe to use _irq rather than
_irqsave.

Link: https://lore.kernel.org/r/20191105005708.7399-4-jsmart2...@gmail.com
Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/lpfc/lpfc_attr.c | 40 +--
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 25aa7a53d255e..bb973901b672d 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -176,7 +176,6 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
int i;
int len = 0;
char tmp[LPFC_MAX_NVME_INFO_TMP_LEN] = {0};
-   unsigned long iflags = 0;
 
if (!(vport->cfg_enable_fc4_type & LPFC_ENABLE_NVME)) {
len = scnprintf(buf, PAGE_SIZE, "NVME Disabled\n");
@@ -347,7 +346,6 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
if (strlcat(buf, "\nNVME Initiator Enabled\n", PAGE_SIZE) >= PAGE_SIZE)
goto buffer_done;
 
-   rcu_read_lock();
scnprintf(tmp, sizeof(tmp),
  "XRI Dist lpfc%d Total %d IO %d ELS %d\n",
  phba->brd_no,
@@ -355,7 +353,7 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
  phba->sli4_hba.io_xri_max,
  lpfc_sli4_get_els_iocb_cnt(phba));
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto buffer_done;
 
/* Port state is only one of two values for now. */
if (localport->port_id)
@@ -371,15 +369,17 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
  wwn_to_u64(vport->fc_nodename.u.wwn),
  localport->port_id, statep);
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto buffer_done;
+
+   spin_lock_irq(shost->host_lock);
 
list_for_each_entry(ndlp, >fc_nodes, nlp_listp) {
nrport = NULL;
-   spin_lock_irqsave(>phba->hbalock, iflags);
+   spin_lock(>phba->hbalock);
rport = lpfc_ndlp_get_nrport(ndlp);
if (rport)
nrport = rport->remoteport;
-   spin_unlock_irqrestore(>phba->hbalock, iflags);
+   spin_unlock(>phba->hbalock);
if (!nrport)
continue;
 
@@ -398,39 +398,39 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
 
/* Tab in to show lport ownership. */
if (strlcat(buf, "NVME RPORT   ", PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
if (phba->brd_no >= 10) {
if (strlcat(buf, " ", PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
}
 
scnprintf(tmp, sizeof(tmp), "WWPN x%llx ",
  nrport->port_name);
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
 
scnprintf(tmp, sizeof(tmp), "WWNN x%llx ",
  nrport->node_name);
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
 
scnprintf(tmp, sizeof(tmp), "DID x%06x ",
  nrport->port_id);
if (strlcat(buf, tmp, PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
 
/* An NVME rport can have multiple roles. */
if (nrport->port_role & FC_PORT_ROLE_NVME_INITIATOR) {
if (strlcat(buf, "INITIATOR ", PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto unlock_buf_done;
}
if (nrport->port_role & FC_PORT_ROLE_NVME_TARGET) {
if (strlcat(buf, "TARGET ", PAGE_SIZE) >= PAGE_SIZE)
-   goto rcu_unlock_buf_done;
+   goto 

[PATCH AUTOSEL 5.4 019/330] ASoC: kirkwood: fix IRQ error handling

2020-09-17 Thread Sasha Levin
From: Russell King 

[ Upstream commit 175fc928198236037174e5c5c066fe3c4691903e ]

Propagate the error code from request_irq(), rather than returning
-EBUSY.

Signed-off-by: Russell King 
Link: https://lore.kernel.org/r/e1iniqh-tw...@rmk-pc.armlinux.org.uk
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/kirkwood/kirkwood-dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/kirkwood/kirkwood-dma.c 
b/sound/soc/kirkwood/kirkwood-dma.c
index 6f69f314f2c2a..d2d5c25bf5502 100644
--- a/sound/soc/kirkwood/kirkwood-dma.c
+++ b/sound/soc/kirkwood/kirkwood-dma.c
@@ -132,7 +132,7 @@ static int kirkwood_dma_open(struct snd_pcm_substream 
*substream)
err = request_irq(priv->irq, kirkwood_dma_irq, IRQF_SHARED,
  "kirkwood-i2s", priv);
if (err)
-   return -EBUSY;
+   return err;
 
/*
 * Enable Error interrupts. We're only ack'ing them but
-- 
2.25.1



[PATCH AUTOSEL 5.4 030/330] powerpc/64s: Always disable branch profiling for prom_init.o

2020-09-17 Thread Sasha Levin
From: Michael Ellerman 

[ Upstream commit 6266a4dadb1d0976490fdf5af4f7941e36f64e80 ]

Otherwise the build fails because prom_init is calling symbols it's
not allowed to, eg:

  Error: External symbol 'ftrace_likely_update' referenced from prom_init.c
  make[3]: *** [arch/powerpc/kernel/Makefile:197: 
arch/powerpc/kernel/prom_init_check] Error 1

Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20191106051129.7626-1-...@ellerman.id.au
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index dc0780f930d5b..59260eb962916 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -19,6 +19,7 @@ CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 
 CFLAGS_prom_init.o += $(call cc-option, -fno-stack-protector)
+CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace early boot code
@@ -36,7 +37,6 @@ KASAN_SANITIZE_btext.o := n
 ifdef CONFIG_KASAN
 CFLAGS_early_32.o += -DDISABLE_BRANCH_PROFILING
 CFLAGS_cputable.o += -DDISABLE_BRANCH_PROFILING
-CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
 CFLAGS_btext.o += -DDISABLE_BRANCH_PROFILING
 endif
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 031/330] net: silence data-races on sk_backlog.tail

2020-09-17 Thread Sasha Levin
From: Eric Dumazet 

[ Upstream commit 9ed498c6280a2f2b51d02df96df53037272ede49 ]

sk->sk_backlog.tail might be read without holding the socket spinlock,
we need to add proper READ_ONCE()/WRITE_ONCE() to silence the warnings.
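
As a rough illustration of what the annotation does, the sketch below uses
simplified approximations of READ_ONCE()/WRITE_ONCE() built on a volatile
access (the real kernel macros do more, and this version relies on the GNU
__typeof__ extension). struct backlog and the two helpers are hypothetical
and only mimic the sk->sk_backlog.tail accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations of the kernel macros, for illustration only. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical cut-down backlog: only the field the patch touches. */
struct backlog {
	void *tail;
};

/* Lockless peek, in the spirit of !READ_ONCE(sk->sk_backlog.tail). */
static int backlog_empty(struct backlog *b)
{
	assert(b != NULL);
	return READ_ONCE(b->tail) == NULL;
}

/* Writer side; in the kernel this runs under the socket spinlock. */
static void backlog_set_tail(struct backlog *b, void *skb)
{
	assert(b != NULL);
	WRITE_ONCE(b->tail, skb);
}
```

The volatile access forces a single load or store of the field, so the
compiler cannot tear or repeat it, which is what marks the race as
intentional for KCSAN.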

KCSAN reported :

BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg

write to 0x8881265109f8 of 8 bytes by interrupt on cpu 1:
 __sk_add_backlog include/net/sock.h:907 [inline]
 sk_add_backlog include/net/sock.h:938 [inline]
 tcp_add_backlog+0x476/0xce0 net/ipv4/tcp_ipv4.c:1759
 tcp_v4_rcv+0x1a70/0x1bd0 net/ipv4/tcp_ipv4.c:1947
 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:442 [inline]
 ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:4929
 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5043
 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5133
 napi_skb_finish net/core/dev.c:5596 [inline]
 napi_gro_receive+0x28f/0x330 net/core/dev.c:5629
 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
 virtnet_receive drivers/net/virtio_net.c:1323 [inline]
 virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
 napi_poll net/core/dev.c:6311 [inline]
 net_rx_action+0x3ae/0xa90 net/core/dev.c:6379
 __do_softirq+0x115/0x33f kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0xbb/0xe0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 do_IRQ+0xa6/0x180 arch/x86/kernel/irq.c:263
 ret_from_intr+0x0/0x19
 native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71
 arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571
 default_idle_call+0x1e/0x40 kernel/sched/idle.c:94
 cpuidle_idle_call kernel/sched/idle.c:154 [inline]
 do_idle+0x1af/0x280 kernel/sched/idle.c:263
 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355
 start_secondary+0x208/0x260 arch/x86/kernel/smpboot.c:264
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241

read to 0x8881265109f8 of 8 bytes by task 8057 on cpu 0:
 tcp_recvmsg+0x46e/0x1b40 net/ipv4/tcp.c:2050
 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
 sock_recvmsg_nosec net/socket.c:871 [inline]
 sock_recvmsg net/socket.c:889 [inline]
 sock_recvmsg+0x92/0xb0 net/socket.c:885
 sock_read_iter+0x15f/0x1e0 net/socket.c:967
 call_read_iter include/linux/fs.h:1889 [inline]
 new_sync_read+0x389/0x4f0 fs/read_write.c:414
 __vfs_read+0xb1/0xc0 fs/read_write.c:427
 vfs_read fs/read_write.c:461 [inline]
 vfs_read+0x143/0x2c0 fs/read_write.c:446
 ksys_read+0xd5/0x1b0 fs/read_write.c:587
 __do_sys_read fs/read_write.c:597 [inline]
 __se_sys_read fs/read_write.c:595 [inline]
 __x64_sys_read+0x4c/0x60 fs/read_write.c:595
 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 8057 Comm: syz-fuzzer Not tainted 5.4.0-rc6+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/crypto/chelsio/chtls/chtls_io.c | 10 +-
 include/net/sock.h  |  4 ++--
 net/ipv4/tcp.c  |  2 +-
 net/llc/af_llc.c|  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/chelsio/chtls/chtls_io.c 
b/drivers/crypto/chelsio/chtls/chtls_io.c
index ce1f1d5d7cd5a..c403d6b64e087 100644
--- a/drivers/crypto/chelsio/chtls/chtls_io.c
+++ b/drivers/crypto/chelsio/chtls/chtls_io.c
@@ -1437,7 +1437,7 @@ static int chtls_pt_recvmsg(struct sock *sk, struct 
msghdr *msg, size_t len,
  csk->wr_max_credits))
sk->sk_write_space(sk);
 
-   if (copied >= target && !sk->sk_backlog.tail)
+   if (copied >= target && !READ_ONCE(sk->sk_backlog.tail))
break;
 
if (copied) {
@@ -1470,7 +1470,7 @@ static int chtls_pt_recvmsg(struct sock *sk, struct 
msghdr *msg, size_t len,
break;
}
}
-   if (sk->sk_backlog.tail) {
+   if (READ_ONCE(sk->sk_backlog.tail)) {
release_sock(sk);
lock_sock(sk);
chtls_cleanup_rbuf(sk, copied);
@@ -1615,7 +1615,7 @@ static int peekmsg(struct sock *sk, struct msghdr *msg,
break;
}
 
-   if (sk->sk_backlog.tail) {
+   if (READ_ONCE(sk->sk_backlog.tail)) {
/* Do not sleep, just process backlog. */
  

[PATCH AUTOSEL 5.4 028/330] scsi: fnic: fix use after free

2020-09-17 Thread Sasha Levin
From: Pan Bian 

[ Upstream commit ec990306f77fd4c58c3b27cc3b3c53032d6e6670 ]

The memory chunk io_req is released by mempool_free. Accessing
io_req->start_time will result in a use after free bug. The variable
start_time is a backup of the timestamp. So, use start_time here to
avoid use after free.

Link: 
https://lore.kernel.org/r/1572881182-37664-1-git-send-email-bianpan2...@163.com
Signed-off-by: Pan Bian 
Reviewed-by: Satish Kharat 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/fnic/fnic_scsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index e3f5c91d5e4fe..b60795893994c 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1027,7 +1027,8 @@ static void fnic_fcpio_icmnd_cmpl_handler(struct fnic 
*fnic,
atomic64_inc(_stats->io_stats.io_completions);
 
 
-   io_duration_time = jiffies_to_msecs(jiffies) - 
jiffies_to_msecs(io_req->start_time);
+   io_duration_time = jiffies_to_msecs(jiffies) -
+   jiffies_to_msecs(start_time);
 
if(io_duration_time <= 10)
atomic64_inc(_stats->io_stats.io_btw_0_to_10_msec);
-- 
2.25.1
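
The pattern behind the fix, as a userspace sketch: copy the field you still
need before releasing the object. struct io_req here is a hypothetical
one-field stand-in and complete_io() is an invented helper; free() plays the
role of mempool_free().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cut-down request; the real structure has many more fields. */
struct io_req {
	unsigned long start_time;
};

/*
 * Complete a request and return how long it took.
 * The timestamp is copied out before the request is freed, so nothing
 * dereferences req after free().
 */
static unsigned long complete_io(struct io_req *req, unsigned long now)
{
	unsigned long start_time;

	assert(req != NULL);
	start_time = req->start_time;	/* back up the field first */
	free(req);			/* req must not be touched from here on */

	return now - start_time;	/* uses the copy, not req->start_time */
}
```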



[PATCH AUTOSEL 5.4 033/330] iomap: Fix overflow in iomap_page_mkwrite

2020-09-17 Thread Sasha Levin
From: Andreas Gruenbacher 

[ Upstream commit add66fcbd3fbe5aa0dd4dddfa23e119c12989a27 ]

On architectures where loff_t is wider than pgoff_t, the expression
((page->index + 1) << PAGE_SHIFT) can overflow.  Rewrite to use the page
offset, which we already compute here anyway.

Signed-off-by: Andreas Gruenbacher 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
Signed-off-by: Sasha Levin 
---
 fs/iomap/buffered-io.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff44..a30ea7ecb790a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1040,20 +1040,19 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, 
const struct iomap_ops *ops)
 
lock_page(page);
size = i_size_read(inode);
-   if ((page->mapping != inode->i_mapping) ||
-   (page_offset(page) > size)) {
+   offset = page_offset(page);
+   if (page->mapping != inode->i_mapping || offset > size) {
/* We overload EFAULT to mean page got truncated */
ret = -EFAULT;
goto out_unlock;
}
 
/* page is wholly or partially inside EOF */
-   if (((page->index + 1) << PAGE_SHIFT) > size)
+   if (offset > size - PAGE_SIZE)
length = offset_in_page(size);
else
length = PAGE_SIZE;
 
-   offset = page_offset(page);
while (length > 0) {
ret = iomap_apply(inode, offset, length,
IOMAP_WRITE | IOMAP_FAULT, ops, page,
-- 
2.25.1
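
A userspace sketch of the overflow, assuming a 32-bit page index and a 64-bit
file offset. The typedefs pgoff32_t/loff64_t and both helper functions are
invented for this example; the two comparisons mirror the removed and added
lines of the diff.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (INT64_C(1) << PAGE_SHIFT)

typedef uint32_t pgoff32_t;	/* assumed 32-bit page index */
typedef int64_t  loff64_t;	/* 64-bit file offset */

/* Old test: the shift happens in 32 bits and can wrap around to 0. */
static int beyond_eof_old(pgoff32_t index, loff64_t size)
{
	pgoff32_t end = (pgoff32_t)((index + 1) << PAGE_SHIFT);

	return (loff64_t)end > size;
}

/* New test: compare the 64-bit byte offset of the page instead. */
static int beyond_eof_new(pgoff32_t index, loff64_t size)
{
	loff64_t offset = (loff64_t)index << PAGE_SHIFT; /* like page_offset() */

	assert(size >= 0);
	return offset > size - PAGE_SIZE;
}
```

For the last page below 4 GiB (index 0xFFFFF) the old expression wraps to 0
and wrongly reports the page as wholly inside EOF, while the offset-based
test gets it right.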



[PATCH AUTOSEL 5.4 022/330] drm/amd/display: Free gamma after calculating legacy transfer function

2020-09-17 Thread Sasha Levin
From: Nicholas Kazlauskas 

[ Upstream commit 0e3a7c2ec93b15f43a2653e52e9608484391aeaf ]

[Why]
We're leaking memory by not freeing the gamma used to calculate the
transfer function for legacy gamma.

[How]
Release the gamma after we're done with it.

Signed-off-by: Nicholas Kazlauskas 
Reviewed-by: Leo Li 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index b43bb7f90e4e9..2233d293a707a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -210,6 +210,8 @@ static int __set_legacy_tf(struct dc_transfer_func *func,
res = mod_color_calculate_regamma_params(func, gamma, true, has_rom,
 NULL);
 
+   dc_gamma_release(&gamma);
+
return res ? 0 : -ENOMEM;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 017/330] xfs: fix inode fork extent count overflow

2020-09-17 Thread Sasha Levin
From: Dave Chinner 

[ Upstream commit 3f8a4f1d876d3e3e49e50b0396eaffcc4ba71b08 ]

[commit message is verbose for discussion purposes - will trim it
down later. Some questions about implementation details at the end.]

Zorro Lang recently ran a new test to stress single inode extent
counts now that they are no longer limited by memory allocation.
The test was simply:

# xfs_io -f -c "falloc 0 40t" /mnt/scratch/big-file
# ~/src/xfstests-dev/punch-alternating /mnt/scratch/big-file

This test uncovered a problem where the hole punching operation
appeared to finish with no error, but apparently only created 268M
extents instead of the 10 billion it was supposed to.

Further, trying to punch out extents that should have been present
resulted in success, but no change in the extent count. It looked
like a silent failure.

While running the test and observing the behaviour in real time,
I observed the extent count growing at ~2M extents/minute, and saw
this after about an hour:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next ; \
> sleep 60 ; \
> xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 127657993
fsxattr.nextents = 129683339
#

And a few minutes later this:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 4177861124
#

Ah, what? Where did those 4 billion extra extents suddenly come from?

Stop the workload, unmount, mount:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 166044375
#

And it's back at the expected number. i.e. the extent count is
correct on disk, but it's screwed up in memory. I loaded up the
extent list, and immediately:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 4192576215
#

It's bad again. So, where does that number come from?
xfs_fill_fsxattr():

if (ip->i_df.if_flags & XFS_IFEXTENTS)
fa->fsx_nextents = xfs_iext_count(&ip->i_df);
else
fa->fsx_nextents = ip->i_d.di_nextents;

And that's the behaviour I just saw in a nutshell. The on disk count
is correct, but once the tree is loaded into memory, it goes whacky.
Clearly there's something wrong with xfs_iext_count():

inline xfs_extnum_t xfs_iext_count(struct xfs_ifork *ifp)
{
return ifp->if_bytes / sizeof(struct xfs_iext_rec);
}

Simple enough, but 134M extents is 2**27, and that's right about
where things went wrong. A struct xfs_iext_rec is 16 bytes in size,
which means 2**27 * 2**4 = 2**31 and we're right on target for an
integer overflow. And, sure enough:

struct xfs_ifork {
int if_bytes;   /* bytes in if_u1 */


Once we get 2**27 extents in a file, we overflow if_bytes and the
in-core extent count goes wrong. And when we reach 2**28 extents,
if_bytes wraps back to zero and things really start to go wrong
there. This is where the silent failure comes from - only the first
2**28 extents can be looked up directly due to the overflow, all the
extents above this index wrap back to somewhere in the first 2**28
extents. Hence with a regular pattern, trying to punch a hole in the
range that didn't have holes mapped to a hole in the first 2**28
extents and so "succeeded" without changing anything. Hence "silent
failure"...

Fix this by converting if_bytes to an int64_t and converting all the
index variables and size calculations to use int64_t types to avoid
overflows in future. Signed integers are still used to enable easy
detection of extent count underflows. This enables scalability of
extent counts to the limits of the on-disk format - MAXEXTNUM
(2**31) extents.
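
The arithmetic above can be checked with a small userspace sketch. REC_SIZE
is the 16-byte record size quoted above; count_via_int32() and
count_via_int64() are hypothetical helpers that model a 32-bit and a 64-bit
if_bytes respectively. The narrowing to int32_t is implementation-defined in
ISO C but wraps on the usual two's-complement compilers.

```c
#include <assert.h>
#include <stdint.h>

#define REC_SIZE 16	/* sizeof(struct xfs_iext_rec), per the text above */

/* Extent count as seen through a 32-bit signed byte counter. */
static int64_t count_via_int32(int64_t nextents)
{
	uint32_t wrapped;
	int32_t if_bytes;

	assert(nextents >= 0);
	wrapped = (uint32_t)((uint64_t)nextents * REC_SIZE); /* keep low 32 bits */
	if_bytes = (int32_t)wrapped;	/* what a plain int if_bytes would hold */
	return if_bytes / REC_SIZE;
}

/* Extent count with a 64-bit byte counter, as after the fix. */
static int64_t count_via_int64(int64_t nextents)
{
	int64_t if_bytes;

	assert(nextents >= 0);
	if_bytes = nextents * REC_SIZE;
	return if_bytes / REC_SIZE;
}
```

At 2**27 extents the 32-bit counter goes negative, and at 2**28 it is back at
zero, which matches the two failure points described above.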

Current testing is at over 500M extents and still going:

fsxattr.nextents = 517310478

Reported-by: Zorro Lang 
Signed-off-by: Dave Chinner 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
Signed-off-by: Sasha Levin 
---
 fs/xfs/libxfs/xfs_attr_leaf.c  | 18 ++
 fs/xfs/libxfs/xfs_dir2_sf.c|  2 +-
 fs/xfs/libxfs/xfs_iext_tree.c  |  2 +-
 fs/xfs/libxfs/xfs_inode_fork.c |  8 
 fs/xfs/libxfs/xfs_inode_fork.h | 14 --
 5 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index fe277ee5ec7c4..b133209f3aa6a 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -453,13 +453,15 @@ xfs_attr_copy_value(
  * special case for dev/uuid inodes, they have fixed size data forks.
  */
 int
-xfs_attr_shortform_bytesfit(xfs_inode_t *dp, int bytes)
+xfs_attr_shortform_bytesfit(
+   struct xfs_inode*dp,
+   int bytes)
 {
-   int offset;
-   int minforkoff; /* lower limit on valid forkoff locations */
-   int maxforkoff; /* upper limit on valid forkoff locations */
-   int dsize;
-   xfs_mount_t *mp = dp->i_mount;
+   struct xfs_mount*mp = dp->i_mount;
+   int64_t dsize;
+   int minforkoff;
+   int

[PATCH AUTOSEL 5.4 035/330] clk/ti/adpll: allocate room for terminating null

2020-09-17 Thread Sasha Levin
From: Stephen Kitt 

[ Upstream commit 7f6ac72946b88b89ee44c1c527aa8591ac5ffcbe ]

The buffer allocated in ti_adpll_clk_get_name doesn't account for the
terminating null. This patch switches to devm_kasprintf to avoid
overflowing.

Signed-off-by: Stephen Kitt 
Link: https://lkml.kernel.org/r/20191019140634.15596-1-st...@sk2.org
Acked-by: Tony Lindgren 
Signed-off-by: Stephen Boyd 
Signed-off-by: Sasha Levin 
---
 drivers/clk/ti/adpll.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/clk/ti/adpll.c b/drivers/clk/ti/adpll.c
index fdfb90058504c..bb2f2836dab22 100644
--- a/drivers/clk/ti/adpll.c
+++ b/drivers/clk/ti/adpll.c
@@ -194,15 +194,8 @@ static const char *ti_adpll_clk_get_name(struct 
ti_adpll_data *d,
if (err)
return NULL;
} else {
-   const char *base_name = "adpll";
-   char *buf;
-
-   buf = devm_kzalloc(d->dev, 8 + 1 + strlen(base_name) + 1 +
-   strlen(postfix), GFP_KERNEL);
-   if (!buf)
-   return NULL;
-   sprintf(buf, "%08lx.%s.%s", d->pa, base_name, postfix);
-   name = buf;
+   name = devm_kasprintf(d->dev, GFP_KERNEL, "%08lx.adpll.%s",
+ d->pa, postfix);
}
 
return name;
-- 
2.25.1
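
A userspace analogue of what the patch does, for illustration.
old_alloc_size() reproduces the size computed by the removed code, and
make_clk_name() is a hypothetical helper that measures the formatted length
and allocates one extra byte for the terminating NUL, which is roughly what
the kasprintf() family does internally.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Size the old code allocated: 8 hex digits, '.', base name, '.', postfix. */
static size_t old_alloc_size(const char *postfix)
{
	assert(postfix != NULL);
	return 8 + 1 + strlen("adpll") + 1 + strlen(postfix);
}

/*
 * Measure, allocate length + 1 for the NUL, then format.
 * The caller frees the result.
 */
static char *make_clk_name(unsigned long pa, const char *postfix)
{
	int len = snprintf(NULL, 0, "%08lx.adpll.%s", pa, postfix);
	char *buf;

	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf)
		snprintf(buf, (size_t)len + 1, "%08lx.adpll.%s", pa, postfix);
	return buf;
}
```

For an 8-digit address the formatted string is exactly as long as the old
allocation, so the NUL written by sprintf() landed one byte past the buffer.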



[PATCH AUTOSEL 5.4 036/330] drm/amdgpu/powerplay: fix AVFS handling with custom powerplay table

2020-09-17 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 53dbc27ad5a93932ff1892a8e4ef266827d74a0f ]

When a custom powerplay table is provided, we need to update
the OD VDDC flag to avoid AVFS being enabled when it shouldn't be.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205393
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index beacfffbdc3eb..ecbc9daea57e0 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -3691,6 +3691,13 @@ static int vega10_set_power_state_tasks(struct pp_hwmgr 
*hwmgr,
PP_ASSERT_WITH_CODE(!result,
"Failed to upload PPtable!", return result);
 
+   /*
+* If a custom pp table is loaded, set DPMTABLE_OD_UPDATE_VDDC flag.
+* That effectively disables AVFS feature.
+*/
+   if(hwmgr->hardcode_pp_table != NULL)
+   data->need_update_dpm_table |= DPMTABLE_OD_UPDATE_VDDC;
+
vega10_update_avfs(hwmgr);
 
/*
-- 
2.25.1



[PATCH AUTOSEL 5.4 037/330] ice: Fix to change Rx/Tx ring descriptor size via ethtool with DCBx

2020-09-17 Thread Sasha Levin
From: Usha Ketineni 

[ Upstream commit c0a3665f71a2f086800abea4d9d14d28269089d6 ]

This patch fixes the call trace caused by the kernel when the Rx/Tx
descriptor size change request is initiated via ethtool when DCB is
configured. ice_set_ringparam() should use vsi->num_txq instead of
vsi->alloc_txq as it represents the queues that are enabled in the
driver when DCB is enabled/disabled. Otherwise, queue index being
used can go out of range.

For example, when vsi->alloc_txq is 104 queues and 3 TCs are enabled
via DCB, each TC gets 34 queues, vsi->num_txq will be 102 and only 102
queues will be enabled.

Signed-off-by: Usha Ketineni 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/intel/ice/ice_ethtool.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c 
b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 62673e27af0e8..fc9ff985a62bd 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -2635,14 +2635,14 @@ ice_set_ringparam(struct net_device *netdev, struct 
ethtool_ringparam *ring)
netdev_info(netdev, "Changing Tx descriptor count from %d to %d\n",
vsi->tx_rings[0]->count, new_tx_cnt);
 
-   tx_rings = devm_kcalloc(>pdev->dev, vsi->alloc_txq,
+   tx_rings = devm_kcalloc(>pdev->dev, vsi->num_txq,
sizeof(*tx_rings), GFP_KERNEL);
if (!tx_rings) {
err = -ENOMEM;
goto done;
}
 
-   for (i = 0; i < vsi->alloc_txq; i++) {
+   ice_for_each_txq(vsi, i) {
/* clone ring and setup updated count */
tx_rings[i] = *vsi->tx_rings[i];
tx_rings[i].count = new_tx_cnt;
@@ -2667,14 +2667,14 @@ process_rx:
netdev_info(netdev, "Changing Rx descriptor count from %d to %d\n",
vsi->rx_rings[0]->count, new_rx_cnt);
 
-   rx_rings = devm_kcalloc(>pdev->dev, vsi->alloc_rxq,
+   rx_rings = devm_kcalloc(>pdev->dev, vsi->num_rxq,
sizeof(*rx_rings), GFP_KERNEL);
if (!rx_rings) {
err = -ENOMEM;
goto done;
}
 
-   for (i = 0; i < vsi->alloc_rxq; i++) {
+   ice_for_each_rxq(vsi, i) {
/* clone ring and setup updated count */
rx_rings[i] = *vsi->rx_rings[i];
rx_rings[i].count = new_rx_cnt;
@@ -2712,7 +2712,7 @@ process_link:
ice_down(vsi);
 
if (tx_rings) {
-   for (i = 0; i < vsi->alloc_txq; i++) {
+   ice_for_each_txq(vsi, i) {
ice_free_tx_ring(vsi->tx_rings[i]);
*vsi->tx_rings[i] = tx_rings[i];
}
@@ -2720,7 +2720,7 @@ process_link:
}
 
if (rx_rings) {
-   for (i = 0; i < vsi->alloc_rxq; i++) {
+   ice_for_each_rxq(vsi, i) {
ice_free_rx_ring(vsi->rx_rings[i]);
/* copy the real tail offset */
rx_rings[i].tail = vsi->rx_rings[i]->tail;
@@ -2744,7 +2744,7 @@ process_link:
 free_tx:
/* error cleanup if the Rx allocations failed after getting Tx */
if (tx_rings) {
-   for (i = 0; i < vsi->alloc_txq; i++)
+   ice_for_each_txq(vsi, i)
ice_free_tx_ring(_rings[i]);
devm_kfree(>pdev->dev, tx_rings);
}
-- 
2.25.1



[PATCH v3 3/6] Bluetooth: Interleave with allowlist scan

2020-09-17 Thread Howard Chung
This patch implements interleaving between the allowlist scan and the
no-filter scan. It is used to save power when at least one monitor is
registered and there is at least one pending connection or one device
to be scanned for.

The durations of the allowlist scan and the no-filter scan are
controlled by the MGMT command Set Default System Configuration. The
default values are arbitrary for now.
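
For illustration only (not part of the patch): a minimal user-space
sketch of the alternation described above. The enum and struct names
are hypothetical, the toggle rule is an assumption about what the
delayed work does on each run, and the 300/500 values are simply the
defaults this patch assigns in hci_alloc_dev(); their unit is not
stated here.

```c
#include <stdint.h>

/* Hypothetical model of the scan states; not the kernel definitions. */
enum scan_state {
	SCAN_NONE,
	SCAN_NO_FILTER,
	SCAN_ALLOWLIST
};

struct interleave_cfg {
	uint16_t allowlist_duration;	/* 300 by default in the patch */
	uint16_t no_filter_duration;	/* 500 by default in the patch */
};

/* Assumed toggle rule: each run switches to the other scan type. */
static enum scan_state next_scan_state(enum scan_state cur)
{
	switch (cur) {
	case SCAN_NO_FILTER:
		return SCAN_ALLOWLIST;
	case SCAN_ALLOWLIST:
		return SCAN_NO_FILTER;
	default:
		return SCAN_NONE;	/* interleaving is not running */
	}
}

/* How long the given scan type should run before the next toggle. */
static uint16_t scan_duration(const struct interleave_cfg *cfg,
			      enum scan_state s)
{
	if (s == SCAN_ALLOWLIST)
		return cfg->allowlist_duration;
	if (s == SCAN_NO_FILTER)
		return cfg->no_filter_duration;
	return 0;
}
```

A caller would re-arm its timer with scan_duration() for the state
returned by next_scan_state().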

Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
Reviewed-by: Manish Mandlik 
---

(no changes since v2)

Changes in v2:
- remove 'case 0x001c' in mgmt_config.c

 include/net/bluetooth/hci_core.h |  10 +++
 net/bluetooth/hci_core.c |   4 +
 net/bluetooth/hci_request.c  | 137 +--
 net/bluetooth/mgmt_config.c  |  12 +++
 4 files changed, 155 insertions(+), 8 deletions(-)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index 9873e1c8cd163..179350f869fdb 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -361,6 +361,8 @@ struct hci_dev {
__u8ssp_debug_mode;
__u8hw_error_code;
__u32   clock;
+   __u16   advmon_allowlist_duration;
+   __u16   advmon_no_filter_duration;
 
__u16   devid_source;
__u16   devid_vendor;
@@ -542,6 +544,14 @@ struct hci_dev {
struct delayed_work rpa_expired;
bdaddr_trpa;
 
+   enum {
+   ADV_MONITOR_SCAN_NONE,
+   ADV_MONITOR_SCAN_NO_FILTER,
+   ADV_MONITOR_SCAN_ALLOWLIST
+   } adv_monitor_scan_state;
+
+   struct delayed_work interleave_adv_monitor_scan;
+
 #if IS_ENABLED(CONFIG_BT_LEDS)
struct led_trigger  *power_led;
 #endif
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index f30a1f5950e15..6c8850149265a 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3592,6 +3592,10 @@ struct hci_dev *hci_alloc_dev(void)
hdev->cur_adv_instance = 0x00;
hdev->adv_instance_timeout = 0;
 
+   /* The default values will be chosen in the future */
+   hdev->advmon_allowlist_duration = 300;
+   hdev->advmon_no_filter_duration = 500;
+
hdev->sniff_max_interval = 800;
hdev->sniff_min_interval = 80;
 
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index d2b06f5c93804..89443b48d90ce 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -378,6 +378,57 @@ void __hci_req_write_fast_connectable(struct hci_request *req, bool enable)
hci_req_add(req, HCI_OP_WRITE_PAGE_SCAN_TYPE, 1, );
 }
 
+static void start_interleave_scan(struct hci_dev *hdev)
+{
+   hdev->adv_monitor_scan_state = ADV_MONITOR_SCAN_NO_FILTER;
+   queue_delayed_work(hdev->req_workqueue,
+  >interleave_adv_monitor_scan, 0);
+}
+
+static bool is_interleave_scanning(struct hci_dev *hdev)
+{
+   return hdev->adv_monitor_scan_state != ADV_MONITOR_SCAN_NONE;
+}
+
+static void cancel_interleave_scan(struct hci_dev *hdev)
+{
+   bt_dev_dbg(hdev, "%s cancelling interleave scan", hdev->name);
+
+   cancel_delayed_work_sync(>interleave_adv_monitor_scan);
+
+   hdev->adv_monitor_scan_state = ADV_MONITOR_SCAN_NONE;
+}
+
+/* Return true if interleave_scan is running after exiting this function,
+ * otherwise, return false
+ */
+static bool update_adv_monitor_scan_state(struct hci_dev *hdev)
+{
+   if (!hci_is_adv_monitoring(hdev) ||
+   (list_empty(>pend_le_conns) &&
+list_empty(>pend_le_reports))) {
+   if (is_interleave_scanning(hdev)) {
+   /* If the interleave condition no longer holds, cancel
+* the existed interleave scan.
+*/
+   cancel_interleave_scan(hdev);
+   }
+   return false;
+   }
+
+   if (!is_interleave_scanning(hdev)) {
+   /* If there is at least one ADV monitors and one pending LE
+* connection or one device to be scanned for, we should
+* alternate between allowlist scan and one without any filters
+* to save power.
+*/
+   start_interleave_scan(hdev);
+   bt_dev_dbg(hdev, "%s starting interleave scan", hdev->name);
+   }
+
+   return true;
+}
+
 /* This function controls the background scanning based on hdev->pend_le_conns
  * list. If there are pending LE connection we start the background scanning,
  * otherwise we stop it.
@@ -449,9 +500,11 @@ static void __hci_update_background_scan(struct hci_request *req)
if (hci_dev_test_flag(hdev, HCI_LE_SCAN))
hci_req_add_le_scan_disable(req, false);
 
-   hci_req_add_le_passive_scan(req);
-
-   BT_DBG("%s starting background scanning", hdev->name);
+   

[PATCH AUTOSEL 5.4 039/330] mfd: mfd-core: Protect against NULL call-back function pointer

2020-09-17 Thread Sasha Levin
From: Lee Jones 

[ Upstream commit b195e101580db390f50b0d587b7f66f241d2bc88 ]

If a child device calls mfd_cell_{en,dis}able() without an appropriate
call-back being set, we are likely to encounter a panic.  Avoid this
by adding suitable checking.
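
For illustration only (not part of the patch): a minimal user-space
sketch of the guard, assuming a simplified cell structure. The struct
layout and the sample_enable() callback are hypothetical; only the
"return 0 early when no call-back is set" behaviour comes from the
diff below.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct mfd_cell. */
struct cell {
	int (*enable)(struct cell *c);
	int usage_count;
	int hw_on;
};

/* Sample call-back used only to exercise the sketch. */
static int sample_enable(struct cell *c)
{
	c->hw_on = 1;
	return 0;
}

static int cell_enable(struct cell *c)
{
	/* Without this check a NULL call-back would be dereferenced. */
	if (!c->enable)
		return 0;

	/* Only call the hook if the cell was not previously enabled. */
	if (++c->usage_count == 1)
		return c->enable(c);

	return 0;
}
```

Note that the early return happens before the usage count is touched,
which matches the placement of the check in the diff.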

Signed-off-by: Lee Jones 
Reviewed-by: Daniel Thompson 
Reviewed-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 drivers/mfd/mfd-core.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
index 23276a80e3b48..96d02b6f06fd8 100644
--- a/drivers/mfd/mfd-core.c
+++ b/drivers/mfd/mfd-core.c
@@ -28,6 +28,11 @@ int mfd_cell_enable(struct platform_device *pdev)
const struct mfd_cell *cell = mfd_get_cell(pdev);
int err = 0;
 
+   if (!cell->enable) {
+   dev_dbg(>dev, "No .enable() call-back registered\n");
+   return 0;
+   }
+
/* only call enable hook if the cell wasn't previously enabled */
if (atomic_inc_return(cell->usage_count) == 1)
err = cell->enable(pdev);
@@ -45,6 +50,11 @@ int mfd_cell_disable(struct platform_device *pdev)
const struct mfd_cell *cell = mfd_get_cell(pdev);
int err = 0;
 
+   if (!cell->disable) {
+   dev_dbg(>dev, "No .disable() call-back registered\n");
+   return 0;
+   }
+
/* only disable if no other clients are using it */
if (atomic_dec_return(cell->usage_count) == 0)
err = cell->disable(pdev);
-- 
2.25.1



[PATCH AUTOSEL 5.4 046/330] dmaengine: mediatek: hsdma_probe: fixed a memory leak when devm_request_irq fails

2020-09-17 Thread Sasha Levin
From: Satendra Singh Thakur 

[ Upstream commit 1ff95243257fad07290dcbc5f7a6ad79d6e703e2 ]

When devm_request_irq fails, the probe function currently calls only
dma_async_device_unregister. This doesn't free the resources
allocated by of_dma_controller_register, so call
of_dma_controller_free on that error path as well.
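
For illustration only (not part of the patch): a minimal user-space
sketch of the goto-based unwind order. The probe_model() function and
its failure flags are hypothetical; the label names and the order of
the two teardown calls follow the diff below.

```c
#include <string.h>

/* Records which teardown steps ran, in order. */
static char teardown_log[64];

static void undo(const char *step)
{
	strcat(teardown_log, step);
	strcat(teardown_log, ";");
}

/* Hypothetical model: step 1 (DMA device registration) always succeeds. */
static int probe_model(int of_register_fails, int irq_fails)
{
	teardown_log[0] = '\0';

	/* Step 2: register the OF DMA controller. */
	if (of_register_fails)
		goto err_unregister;

	/* Step 3: request the IRQ; on failure, step 2 must be undone too. */
	if (irq_fails)
		goto err_free;

	return 0;

err_free:
	undo("of_dma_controller_free");
err_unregister:
	undo("dma_async_device_unregister");
	return -1;
}
```

Jumping to the later label (err_unregister) from the IRQ failure path
would skip the first undo step, which is the leak being fixed.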

Signed-off-by: Satendra Singh Thakur 
Link: https://lore.kernel.org/r/20191109113523.6067-1-sst2...@gmail.com
Signed-off-by: Vinod Koul 
Signed-off-by: Sasha Levin 
---
 drivers/dma/mediatek/mtk-hsdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/mediatek/mtk-hsdma.c b/drivers/dma/mediatek/mtk-hsdma.c
index 1a2028e1c29e9..4c58da7421432 100644
--- a/drivers/dma/mediatek/mtk-hsdma.c
+++ b/drivers/dma/mediatek/mtk-hsdma.c
@@ -997,7 +997,7 @@ static int mtk_hsdma_probe(struct platform_device *pdev)
if (err) {
dev_err(>dev,
"request_irq failed with err %d\n", err);
-   goto err_unregister;
+   goto err_free;
}
 
platform_set_drvdata(pdev, hsdma);
@@ -1006,6 +1006,8 @@ static int mtk_hsdma_probe(struct platform_device *pdev)
 
return 0;
 
+err_free:
+   of_dma_controller_free(pdev->dev.of_node);
 err_unregister:
dma_async_device_unregister(dd);
 
-- 
2.25.1



[PATCH v3 4/6] Bluetooth: Handle system suspend resume case

2020-09-17 Thread Howard Chung
This patch adds code to handle the system suspension during interleave
scan. The interleave scan will be canceled when the system is going to
sleep, and will be restarted after waking up.

Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
Reviewed-by: Manish Mandlik 
Reviewed-by: Abhishek Pandit-Subedi 
Reviewed-by: Miao-chen Chou 
---

(no changes since v1)

 net/bluetooth/hci_request.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index 89443b48d90ce..d9082019b6386 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -1081,6 +1081,9 @@ void hci_req_add_le_passive_scan(struct hci_request *req)
filter_policy |= 0x02;
 
if (hdev->suspended) {
+   /* Block suspend notifier on response */
+   set_bit(SUSPEND_SCAN_ENABLE, hdev->suspend_tasks);
+
window = hdev->le_scan_window_suspend;
interval = hdev->le_scan_int_suspend;
} else if (hci_is_le_conn_scanning(hdev)) {
@@ -1167,10 +1170,8 @@ static void hci_req_config_le_suspend_scan(struct hci_request *req)
hci_req_add_le_scan_disable(req, false);
 
/* Configure params and enable scanning */
-   hci_req_add_le_passive_scan(req);
+   __hci_update_background_scan(req);
 
-   /* Block suspend notifier on response */
-   set_bit(SUSPEND_SCAN_ENABLE, req->hdev->suspend_tasks);
 }
 
 static void cancel_adv_timeout(struct hci_dev *hdev)
@@ -1282,8 +1283,10 @@ void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next)
hci_req_add(, HCI_OP_WRITE_SCAN_ENABLE, 1, _scan);
 
/* Disable LE passive scan if enabled */
-   if (hci_dev_test_flag(hdev, HCI_LE_SCAN))
+   if (hci_dev_test_flag(hdev, HCI_LE_SCAN)) {
+   cancel_interleave_scan(hdev);
hci_req_add_le_scan_disable(, false);
+   }
 
/* Mark task needing completion */
set_bit(SUSPEND_SCAN_DISABLE, hdev->suspend_tasks);
-- 
2.28.0.681.g6f77f65b4e-goog



[PATCH AUTOSEL 5.4 040/330] drm/amdgpu/powerplay/smu7: fix AVFS handling with custom powerplay table

2020-09-17 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 901245624c7812b6c95d67177bae850e783b5212 ]

When a custom powerplay table is provided, we need to update
the OD VDDC flag to avoid AVFS being enabled when it shouldn't be.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205393
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index e6da53e9c3f46..edd6d4912edeb 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3986,6 +3986,13 @@ static int smu7_set_power_state_tasks(struct pp_hwmgr *hwmgr, const void *input)
"Failed to populate and upload SCLK MCLK DPM levels!",
result = tmp_result);
 
+   /*
+* If a custom pp table is loaded, set DPMTABLE_OD_UPDATE_VDDC flag.
+* That effectively disables AVFS feature.
+*/
+   if (hwmgr->hardcode_pp_table != NULL)
+   data->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_VDDC;
+
tmp_result = smu7_update_avfs(hwmgr);
PP_ASSERT_WITH_CODE((0 == tmp_result),
"Failed to update avfs voltages!",
-- 
2.25.1



[PATCH v3 1/6] Bluetooth: Update Adv monitor count upon removal

2020-09-17 Thread Howard Chung
From: Miao-chen Chou 

This fixes the count of Adv monitor upon monitor removal.

The following test was performed.
- Start two btmgmt consoles, issue a btmgmt advmon-remove command on one
console and observe a MGMT_EV_ADV_MONITOR_REMOVED event on the other.

Signed-off-by: Miao-chen Chou 
Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
---

Changes in v3:
- Remove 'Bluez' prefix

Changes in v2:
- delete 'case 0x001c' in mgmt_config.c

 net/bluetooth/hci_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 8a2645a833013..f30a1f5950e15 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3061,6 +3061,7 @@ static int free_adv_monitor(int id, void *ptr, void *data)
 
idr_remove(>adv_monitors_idr, monitor->handle);
hci_free_adv_monitor(monitor);
+   hdev->adv_monitors_cnt--;
 
return 0;
 }
@@ -3077,6 +3078,7 @@ int hci_remove_adv_monitor(struct hci_dev *hdev, u16 handle)
 
idr_remove(>adv_monitors_idr, monitor->handle);
hci_free_adv_monitor(monitor);
+   hdev->adv_monitors_cnt--;
} else {
/* Remove all monitors if handle is 0. */
idr_for_each(>adv_monitors_idr, _adv_monitor, hdev);
-- 
2.28.0.681.g6f77f65b4e-goog



[PATCH v3 6/6] Bluetooth: Add toggle to switch off interleave scan

2020-09-17 Thread Howard Chung
This patch adds a configurable parameter to switch off the interleave
scan feature.

Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
---

(no changes since v1)

 include/net/bluetooth/hci_core.h | 1 +
 net/bluetooth/hci_core.c | 1 +
 net/bluetooth/hci_request.c  | 3 ++-
 net/bluetooth/mgmt_config.c  | 6 ++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index 179350f869fdb..c3253f1cac0c2 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -363,6 +363,7 @@ struct hci_dev {
__u32   clock;
__u16   advmon_allowlist_duration;
__u16   advmon_no_filter_duration;
+   __u16   enable_advmon_interleave_scan;
 
__u16   devid_source;
__u16   devid_vendor;
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 6c8850149265a..4608715860cce 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3595,6 +3595,7 @@ struct hci_dev *hci_alloc_dev(void)
/* The default values will be chosen in the future */
hdev->advmon_allowlist_duration = 300;
hdev->advmon_no_filter_duration = 500;
+   hdev->enable_advmon_interleave_scan = 0x0001;   /* Default to enable */
 
hdev->sniff_max_interval = 800;
hdev->sniff_min_interval = 80;
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index 1fcf6736811e4..bb38e1dead68f 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -500,7 +500,8 @@ static void __hci_update_background_scan(struct hci_request *req)
if (hci_dev_test_flag(hdev, HCI_LE_SCAN))
hci_req_add_le_scan_disable(req, false);
 
-   if (!update_adv_monitor_scan_state(hdev)) {
+   if (!hdev->enable_advmon_interleave_scan ||
+   !update_adv_monitor_scan_state(hdev)) {
hci_req_add_le_passive_scan(req);
bt_dev_dbg(hdev, "%s starting background scanning",
   hdev->name);
diff --git a/net/bluetooth/mgmt_config.c b/net/bluetooth/mgmt_config.c
index 1802f7023158c..b4198c33a1b72 100644
--- a/net/bluetooth/mgmt_config.c
+++ b/net/bluetooth/mgmt_config.c
@@ -69,6 +69,7 @@ int read_def_system_config(struct sock *sk, struct hci_dev *hdev, void *data,
def_le_autoconnect_timeout),
HDEV_PARAM_U16(0x001d, advmon_allowlist_duration),
HDEV_PARAM_U16(0x001e, advmon_no_filter_duration),
+   HDEV_PARAM_U16(0x001f, enable_advmon_interleave_scan),
};
struct mgmt_rp_read_def_system_config *rp = (void *)params;
 
@@ -142,6 +143,7 @@ int set_def_system_config(struct sock *sk, struct hci_dev *hdev, void *data,
case 0x001b:
case 0x001d:
case 0x001e:
+   case 0x001f:
if (len != sizeof(u16)) {
bt_dev_warn(hdev, "invalid length %d, exp %zu for type %d",
len, sizeof(u16), type);
@@ -263,6 +265,10 @@ int set_def_system_config(struct sock *sk, struct hci_dev *hdev, void *data,
hdev->advmon_no_filter_duration =
TLV_GET_LE16(buffer);
break;
+   case 0x0001f:
+   hdev->enable_advmon_interleave_scan =
+   TLV_GET_LE16(buffer);
+   break;
default:
bt_dev_warn(hdev, "unsupported parameter %u", type);
break;
-- 
2.28.0.681.g6f77f65b4e-goog



[PATCH AUTOSEL 5.4 048/330] RDMA/qedr: Fix potential use after free

2020-09-17 Thread Sasha Levin
From: Pan Bian 

[ Upstream commit 960657b732e1ce21b07be5ab48a7ad3913d72ba4 ]

Move the release operation after error log to avoid possible use after
free.

Link: https://lore.kernel.org/r/1573021434-18768-1-git-send-email-bianpan2...@163.com
Signed-off-by: Pan Bian 
Acked-by: Michal Kalderon 
Reviewed-by: Jason Gunthorpe 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Sasha Levin 
---
 drivers/infiniband/hw/qedr/qedr_iw_cm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/qedr/qedr_iw_cm.c b/drivers/infiniband/hw/qedr/qedr_iw_cm.c
index a7a926b7b5628..6dea49e11f5f0 100644
--- a/drivers/infiniband/hw/qedr/qedr_iw_cm.c
+++ b/drivers/infiniband/hw/qedr/qedr_iw_cm.c
@@ -490,10 +490,10 @@ qedr_addr6_resolve(struct qedr_dev *dev,
 
if ((!dst) || dst->error) {
if (dst) {
-   dst_release(dst);
DP_ERR(dev,
   "ip6_route_output returned dst->error = %d\n",
   dst->error);
+   dst_release(dst);
}
return -EINVAL;
}
-- 
2.25.1



[PATCH v3 2/6] Bluetooth: Set scan parameters for ADV Monitor

2020-09-17 Thread Howard Chung
Set scan parameters when there is at least one Advertisement monitor.

Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
Reviewed-by: Manish Mandlik 
Reviewed-by: Abhishek Pandit-Subedi 
Reviewed-by: Miao-chen Chou 
---

(no changes since v1)

 net/bluetooth/hci_request.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index 413e3a5aabf54..d2b06f5c93804 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -1027,6 +1027,9 @@ void hci_req_add_le_passive_scan(struct hci_request *req)
} else if (hci_is_le_conn_scanning(hdev)) {
window = hdev->le_scan_window_connect;
interval = hdev->le_scan_int_connect;
+   } else if (hci_is_adv_monitoring(hdev)) {
+   window = hdev->le_scan_window_adv_monitor;
+   interval = hdev->le_scan_int_adv_monitor;
} else {
window = hdev->le_scan_window;
interval = hdev->le_scan_interval;
-- 
2.28.0.681.g6f77f65b4e-goog



[PATCH v3 5/6] Bluetooth: Handle active scan case

2020-09-17 Thread Howard Chung
This patch adds code to handle an active scan during interleave
scan. The interleave scan is canceled when a user starts an active
scan, and it is restarted after the active scan stops.

Signed-off-by: Howard Chung 
Reviewed-by: Alain Michaud 
Reviewed-by: Manish Mandlik 
---

(no changes since v1)

 net/bluetooth/hci_request.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index d9082019b6386..1fcf6736811e4 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -3085,8 +3085,10 @@ static int active_scan(struct hci_request *req, unsigned long opt)
 * running. Thus, we should temporarily stop it in order to set the
 * discovery scanning parameters.
 */
-   if (hci_dev_test_flag(hdev, HCI_LE_SCAN))
+   if (hci_dev_test_flag(hdev, HCI_LE_SCAN)) {
hci_req_add_le_scan_disable(req, false);
+   cancel_interleave_scan(hdev);
+   }
 
/* All active scans will be done with either a resolvable private
 * address (when privacy feature has been enabled) or non-resolvable
-- 
2.28.0.681.g6f77f65b4e-goog



[PATCH AUTOSEL 5.4 052/330] xfs: fix attr leaf header freemap.size underflow

2020-09-17 Thread Sasha Levin
From: Brian Foster 

[ Upstream commit 2a2b5932db67586bacc560cc065d62faece5b996 ]

The leaf format xattr addition helper xfs_attr3_leaf_add_work()
adjusts the block freemap in a couple places. The first update drops
the size of the freemap that the caller had already selected to
place the xattr name/value data. Before the function returns, it
also checks whether the entries array has encroached on a freemap
range by virtue of the new entry addition. This is necessary because
the entries array grows from the start of the block (but end of the
block header) towards the end of the block while the name/value data
grows from the end of the block in the opposite direction. If the
associated freemap is already empty, however, size is zero and the
subtraction underflows the field and causes corruption.

This is reproduced rarely by generic/070. The observed behavior is
that a smaller sized freemap is aligned to the end of the entries
list, several subsequent xattr additions land in larger freemaps and
the entries list expands into the smaller freemap until it is fully
consumed and then underflows. Note that it is not otherwise a
corruption for the entries array to consume an empty freemap because
the nameval list (i.e. the firstused pointer in the xattr header)
starts beyond the end of the corrupted freemap.

Update the freemap size modification to account for the fact that
the freemap entry can be empty and thus stale.
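
For illustration only (not part of the patch): a minimal user-space
sketch of the underflow and of the clamped update. ENTRY_SIZE is an
assumed stand-in for sizeof(xfs_attr_leaf_entry_t), and the function
names are hypothetical; the clamping itself mirrors the min_t() call
in the diff below.

```c
#include <stdint.h>

/* Assumed stand-in for sizeof(xfs_attr_leaf_entry_t). */
#define ENTRY_SIZE 8u

/* Old behaviour: an empty freemap (size 0) wraps around. */
static uint16_t shrink_unclamped(uint16_t size)
{
	return (uint16_t)(size - ENTRY_SIZE);
}

/* New behaviour: never subtract more than what is left. */
static uint16_t shrink_clamped(uint16_t size)
{
	uint16_t step = size < ENTRY_SIZE ? size : (uint16_t)ENTRY_SIZE;

	return (uint16_t)(size - step);
}
```

With a 16-bit size field, shrink_unclamped(0) yields 65528, which is
the kind of bogus freemap size that causes the corruption described
above; shrink_clamped(0) stays at 0.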

Signed-off-by: Brian Foster 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
Signed-off-by: Sasha Levin 
---
 fs/xfs/libxfs/xfs_attr_leaf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index b133209f3aa6a..f1535549d1ced 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -1451,7 +1451,9 @@ xfs_attr3_leaf_add_work(
for (i = 0; i < XFS_ATTR_LEAF_MAPSIZE; i++) {
if (ichdr->freemap[i].base == tmp) {
ichdr->freemap[i].base += sizeof(xfs_attr_leaf_entry_t);
-   ichdr->freemap[i].size -= sizeof(xfs_attr_leaf_entry_t);
+   ichdr->freemap[i].size -=
+   min_t(uint16_t, ichdr->freemap[i].size,
+   sizeof(xfs_attr_leaf_entry_t));
}
}
ichdr->usedbytes += xfs_attr_leaf_entsize(leaf, args->index);
-- 
2.25.1



[PATCH AUTOSEL 5.4 050/330] PCI: Avoid double hpmemsize MMIO window assignment

2020-09-17 Thread Sasha Levin
From: Nicholas Johnson 

[ Upstream commit c13704f5685deb7d6eb21e293233e0901ed77377 ]

Previously, the kernel sometimes assigned more MMIO or MMIO_PREF space than
desired.  For example, if the user requested 128M of space with
"pci=realloc,hpmemsize=128M", we sometimes assigned 256M:

  pci :06:01.0: BAR 14: assigned [mem 0x9010-0xa00f] = 256M
  pci :06:04.0: BAR 14: assigned [mem 0xa020-0xb01f] = 256M

With this patch applied:

  pci :06:01.0: BAR 14: assigned [mem 0x9010-0x980f] = 128M
  pci :06:04.0: BAR 14: assigned [mem 0x9820-0xa01f] = 128M

This happened when in the first pass, the MMIO_PREF succeeded but the MMIO
failed. In the next pass, because MMIO_PREF was already assigned, the
attempt to assign MMIO_PREF returned an error code instead of success
(nothing more to do, already allocated). Hence, the size which was actually
allocated, but thought to have failed, was placed in the MMIO window.

The bug resulted in the MMIO_PREF being added to the MMIO window, which
meant doubling if MMIO_PREF size = MMIO size. With a large MMIO_PREF, the
MMIO window would likely fail to be assigned altogether due to lack of
32-bit address space.

Change find_free_bus_resource() to do the following:

  - Return first unassigned resource of the correct type.
  - If there is none, return first assigned resource of the correct type.
  - If none of the above, return NULL.

Returning an assigned resource of the correct type allows the caller to
distinguish between already assigned and no resource of the correct type.
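
For illustration only (not part of the patch): a minimal user-space
sketch of the lookup order listed above. The struct and function names
are hypothetical, and the sketch omits the kernel's skip of the root
ioport/iomem resources; "assigned" is modelled as a non-NULL parent,
as in the kernel comment.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct resource. */
struct res {
	unsigned long flags;
	struct res *parent;	/* non-NULL once the resource is assigned */
};

static struct res *find_res_of_type(struct res **list, size_t n,
				    unsigned long type_mask,
				    unsigned long type)
{
	struct res *assigned = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct res *r = list[i];

		if (!r || (r->flags & type_mask) != type)
			continue;
		if (!r->parent)
			return r;	/* first unassigned match wins */
		if (!assigned)
			assigned = r;	/* remember first assigned match */
	}

	/* Either the first assigned match or NULL if no type match. */
	return assigned;
}
```

A caller can then tell "already assigned" (non-NULL result with a
parent) apart from "no resource of this type" (NULL).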

Add checks in pbus_size_io() and pbus_size_mem() to return success if
resource returned from find_free_bus_resource() is already allocated.

This avoids pbus_size_io() and pbus_size_mem() returning error code to
__pci_bus_size_bridges() when a resource has been successfully assigned in
a previous pass. This fixes the existing behaviour where space for a
resource could be reserved multiple times in different parent bridge
windows.

Link: https://lore.kernel.org/lkml/20190531171216.20532-2-log...@deltatee.com/T/#u
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203243
Link: https://lore.kernel.org/r/ps2p216mb075563aa6ad242aa666edc6a80...@ps2p216mb0755.korp216.prod.outlook.com
Reported-by: Kit Chow 
Reported-by: Nicholas Johnson 
Signed-off-by: Nicholas Johnson 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Mika Westerberg 
Reviewed-by: Logan Gunthorpe 
Signed-off-by: Sasha Levin 
---
 drivers/pci/setup-bus.c | 38 +++---
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 5356630e0e483..44f4866d95d8c 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -752,24 +752,32 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
 }
 
 /*
- * Helper function for sizing routines: find first available bus resource
- * of a given type.  Note: we intentionally skip the bus resources which
- * have already been assigned (that is, have non-NULL parent resource).
+ * Helper function for sizing routines.  Assigned resources have non-NULL
+ * parent resource.
+ *
+ * Return first unassigned resource of the correct type.  If there is none,
+ * return first assigned resource of the correct type.  If none of the
+ * above, return NULL.
+ *
+ * Returning an assigned resource of the correct type allows the caller to
+ * distinguish between already assigned and no resource of the correct type.
  */
-static struct resource *find_free_bus_resource(struct pci_bus *bus,
-  unsigned long type_mask,
-  unsigned long type)
+static struct resource *find_bus_resource_of_type(struct pci_bus *bus,
+ unsigned long type_mask,
+ unsigned long type)
 {
+   struct resource *r, *r_assigned = NULL;
int i;
-   struct resource *r;
 
pci_bus_for_each_resource(bus, r, i) {
if (r == _resource || r == _resource)
continue;
if (r && (r->flags & type_mask) == type && !r->parent)
return r;
+   if (r && (r->flags & type_mask) == type && !r_assigned)
+   r_assigned = r;
}
-   return NULL;
+   return r_assigned;
 }
 
 static resource_size_t calculate_iosize(resource_size_t size,
@@ -866,8 +874,8 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
 struct list_head *realloc_head)
 {
struct pci_dev *dev;
-   struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
-   IORESOURCE_IO);
+   struct resource *b_res = find_bus_resource_of_type(bus, IORESOURCE_IO,
+  IORESOURCE_IO);

[PATCH AUTOSEL 5.4 057/330] scsi: pm80xx: Cleanup command when a reset times out

2020-09-17 Thread Sasha Levin
From: peter chang 

[ Upstream commit 51c1c5f6ed64c2b65a8cf89dac136273d25ca540 ]

Fix this so that, if the driver properly sent the abort, it tries to
remove the command from the firmware's list of outstanding commands
regardless of the abort status. This means that the task gets freed 'now'
rather than possibly getting freed later when the scsi layer thinks it's
leaked but still valid.

Link: https://lore.kernel.org/r/20191114100910.6153-10-deepak.u...@microchip.com
Acked-by: Jack Wang 
Signed-off-by: peter chang 
Signed-off-by: Deepak Ukey 
Signed-off-by: Viswas G 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/pm8001/pm8001_sas.c | 50 +++-
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index 7e48154e11c36..7912ed64d3b9c 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -1202,8 +1202,8 @@ int pm8001_abort_task(struct sas_task *task)
pm8001_dev = dev->lldd_dev;
pm8001_ha = pm8001_find_ha_by_dev(dev);
phy_id = pm8001_dev->attached_phy;
-   rc = pm8001_find_tag(task, );
-   if (rc == 0) {
+   ret = pm8001_find_tag(task, );
+   if (ret == 0) {
pm8001_printk("no tag for task:%p\n", task);
return TMF_RESP_FUNC_FAILED;
}
@@ -1241,26 +1241,50 @@ int pm8001_abort_task(struct sas_task *task)
 
/* 2. Send Phy Control Hard Reset */
reinit_completion();
+   phy->port_reset_status = PORT_RESET_TMO;
phy->reset_success = false;
phy->enable_completion = 
phy->reset_completion = _reset;
ret = PM8001_CHIP_DISP->phy_ctl_req(pm8001_ha, phy_id,
PHY_HARD_RESET);
-   if (ret)
-   goto out;
-   PM8001_MSG_DBG(pm8001_ha,
-   pm8001_printk("Waiting for local phy ctl\n"));
-   wait_for_completion();
-   if (!phy->reset_success)
+   if (ret) {
+   phy->enable_completion = NULL;
+   phy->reset_completion = NULL;
goto out;
+   }
 
-   /* 3. Wait for Port Reset complete / Port reset TMO */
+   /* In the case of the reset timeout/fail we still
+* abort the command at the firmware. The assumption
+* here is that the drive is off doing something so
+* that it's not processing requests, and we want to
+* avoid getting a completion for this and either
+* leaking the task in libsas or losing the race and
+* getting a double free.
+*/
PM8001_MSG_DBG(pm8001_ha,
+   pm8001_printk("Waiting for local phy ctl\n"));
+   ret = wait_for_completion_timeout(,
+   PM8001_TASK_TIMEOUT * HZ);
+   if (!ret || !phy->reset_success) {
+   phy->enable_completion = NULL;
+   phy->reset_completion = NULL;
+   } else {
+   /* 3. Wait for Port Reset complete or
+* Port reset TMO
+*/
+   PM8001_MSG_DBG(pm8001_ha,
pm8001_printk("Waiting for Port reset\n"));
-   wait_for_completion(_reset);
-   if (phy->port_reset_status) {
-   pm8001_dev_gone_notify(dev);
-   goto out;
+   ret = wait_for_completion_timeout(
+   _reset,
+   PM8001_TASK_TIMEOUT * HZ);
+   if (!ret)
+   phy->reset_completion = NULL;
+   WARN_ON(phy->port_reset_status ==
+   PORT_RESET_TMO);
+   if (phy->port_reset_status == PORT_RESET_TMO) {
+   pm8001_dev_gone_notify(dev);
+   goto out;
+   }
}
 
/*
-- 
2.25.1



[PATCH AUTOSEL 5.4 060/330] debugfs: Fix !DEBUG_FS debugfs_create_automount

2020-09-17 Thread Sasha Levin
From: Kusanagi Kouichi 

[ Upstream commit 4250b047039d324e0ff65267c8beb5bad5052a86 ]

If DEBUG_FS=n, compile fails with the following error:

kernel/trace/trace.c: In function 'tracing_init_dentry':
kernel/trace/trace.c:8658:9: error: passing argument 3 of 'debugfs_create_automount' from incompatible pointer type [-Werror=incompatible-pointer-types]
 8658 | trace_automount, NULL);
  | ^~~
  | |
  | struct vfsmount * (*)(struct dentry *, void *)
In file included from kernel/trace/trace.c:24:
./include/linux/debugfs.h:206:25: note: expected 'struct vfsmount * (*)(void *)' but argument is of type 'struct vfsmount * (*)(struct dentry *, void *)'
  206 |  struct vfsmount *(*f)(void *),
  |  ~~~^~
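
For illustration only (not part of the patch): a minimal user-space
sketch of why the mismatch above arises and how sharing one typedef
avoids it. MODEL_FEATURE_ENABLED, automount_fn and create_automount()
are hypothetical names standing in for CONFIG_DEBUG_FS,
debugfs_automount_t and debugfs_create_automount().

```c
#include <stddef.h>

struct dentry;
struct vfsmount;

/* Declared once, outside the feature switch, so both variants agree. */
typedef struct vfsmount *(*automount_fn)(struct dentry *, void *);

#ifdef MODEL_FEATURE_ENABLED
/* Real implementation would live elsewhere. */
struct dentry *create_automount(const char *name, automount_fn f,
				void *data);
#else
/* Stub for the disabled case; it must take the same call-back type. */
static inline struct dentry *create_automount(const char *name,
					      automount_fn f, void *data)
{
	(void)name;
	(void)f;
	(void)data;
	return NULL;	/* the kernel stub returns ERR_PTR(-ENODEV) instead */
}
#endif

/* A call-back with the two-argument signature that callers pass. */
static struct vfsmount *sample_automount(struct dentry *d, void *data)
{
	(void)d;
	(void)data;
	return NULL;
}
```

If the stub spelled out a different pointer type by hand, a call such
as create_automount("tracing", sample_automount, NULL) would compile
only when the feature is enabled.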

Signed-off-by: Kusanagi Kouichi 
Link: https://lore.kernel.org/r/20191121102021787.mlmy.25002.ppp.dion.ne...@dmta0003.auone-net.jp
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 include/linux/debugfs.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index 58424eb3b3291..798f0b9b43aee 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -54,6 +54,8 @@ static const struct file_operations __fops = {
\
.llseek  = no_llseek,   \
 }
 
+typedef struct vfsmount *(*debugfs_automount_t)(struct dentry *, void *);
+
 #if defined(CONFIG_DEBUG_FS)
 
 struct dentry *debugfs_lookup(const char *name, struct dentry *parent);
@@ -75,7 +77,6 @@ struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);
 struct dentry *debugfs_create_symlink(const char *name, struct dentry *parent,
  const char *dest);
 
-typedef struct vfsmount *(*debugfs_automount_t)(struct dentry *, void *);
 struct dentry *debugfs_create_automount(const char *name,
struct dentry *parent,
debugfs_automount_t f,
@@ -203,7 +204,7 @@ static inline struct dentry *debugfs_create_symlink(const char *name,
 
 static inline struct dentry *debugfs_create_automount(const char *name,
struct dentry *parent,
-   struct vfsmount *(*f)(void *),
+   debugfs_automount_t f,
void *data)
 {
return ERR_PTR(-ENODEV);
-- 
2.25.1



[PATCH AUTOSEL 5.4 054/330] ubi: Fix producing anchor PEBs

2020-09-17 Thread Sasha Levin
From: Sascha Hauer 

[ Upstream commit f9c34bb529975fe9f85b870a80c53a83a3c5a182 ]

When a new fastmap is about to be written UBI must make sure it has a
free block for a fastmap anchor available. For this ubi_update_fastmap()
calls ubi_ensure_anchor_pebs(). This stopped working with 2e8f08deabbc
("ubi: Fix races around ubi_refill_pools()"): with this commit the wear
leveling code is blocked and can no longer produce free PEBs. UBI then
more often than not falls back to writing the new fastmap anchor to the
same block it was already on, which means the same erase block gets
erased during each fastmap write and wears out quite fast.

As the locking prevents us from producing the anchor PEB when we
actually need it, this patch changes the strategy for creating the
anchor PEB. We no longer create it on demand right before we want to
write a fastmap, but instead we create an anchor PEB right after we have
written a fastmap. This gives us enough time to produce a new anchor PEB
before it is needed. To make sure we have an anchor PEB for the very
first fastmap write we call ubi_ensure_anchor_pebs() during
initialisation as well.
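
The strategy change described above can be sketched in user-space C. This
is only a toy model and not the UBI code: struct toy_ubi, free_low_peb and
both toy_* functions are made up for illustration, and only the field names
fm_anchor and fm_do_produce_anchor are borrowed from the patch. It shows the
anchor being consumed at write time and the next one being reserved (or
requested from wear leveling) immediately afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; field names mirror the patch but the types do not. */
struct toy_ubi {
	int *fm_anchor;            /* anchor reserved for the next fastmap write */
	int *free_low_peb;         /* hypothetical: a free low-numbered PEB, or NULL */
	int  fm_do_produce_anchor; /* request for wear leveling to produce one */
};

/* Reserve an anchor now, or ask wear leveling to produce one later. */
static void toy_ensure_anchor(struct toy_ubi *u)
{
	assert(u != NULL);
	if (u->fm_anchor)
		return;                      /* we already have an anchor */
	if (u->free_low_peb) {
		u->fm_anchor = u->free_low_peb;
		u->free_low_peb = NULL;
		return;
	}
	u->fm_do_produce_anchor = 1;         /* no luck, request production */
}

/* Consume the reserved anchor, then start on the next one right away. */
static int *toy_update_fastmap(struct toy_ubi *u)
{
	int *used = u->fm_anchor;

	u->fm_anchor = NULL;
	toy_ensure_anchor(u);
	return used;
}
```

In this model the writer never waits for an anchor: if none was reserved in
time, toy_update_fastmap() simply returns NULL and the request flag stays
set for the background worker.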

Fixes: 2e8f08deabbc ("ubi: Fix races around ubi_refill_pools()")
Signed-off-by: Sascha Hauer 
Signed-off-by: Richard Weinberger 
Signed-off-by: Sasha Levin 
---
 drivers/mtd/ubi/fastmap-wl.c | 31 ++-
 drivers/mtd/ubi/fastmap.c| 14 +-
 drivers/mtd/ubi/ubi.h|  6 --
 drivers/mtd/ubi/wl.c | 32 ++--
 drivers/mtd/ubi/wl.h |  1 -
 5 files changed, 41 insertions(+), 43 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index c44c8470247e1..426820ab9afe1 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -57,18 +57,6 @@ static void return_unused_pool_pebs(struct ubi_device *ubi,
}
 }
 
-static int anchor_pebs_available(struct rb_root *root)
-{
-   struct rb_node *p;
-   struct ubi_wl_entry *e;
-
-   ubi_rb_for_each_entry(p, e, root, u.rb)
-   if (e->pnum < UBI_FM_MAX_START)
-   return 1;
-
-   return 0;
-}
-
 /**
  * ubi_wl_get_fm_peb - find a physical erase block with a given maximal number.
  * @ubi: UBI device description object
@@ -277,8 +265,26 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
 int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
 {
struct ubi_work *wrk;
+   struct ubi_wl_entry *anchor;
 
spin_lock(&ubi->wl_lock);
+
+   /* Do we already have an anchor? */
+   if (ubi->fm_anchor) {
+   spin_unlock(&ubi->wl_lock);
+   return 0;
+   }
+
+   /* See if we can find an anchor PEB on the list of free PEBs */
+   anchor = ubi_wl_get_fm_peb(ubi, 1);
+   if (anchor) {
+   ubi->fm_anchor = anchor;
+   spin_unlock(&ubi->wl_lock);
+   return 0;
+   }
+
+   /* No luck, trigger wear leveling to produce a new anchor PEB */
+   ubi->fm_do_produce_anchor = 1;
if (ubi->wl_scheduled) {
spin_unlock(&ubi->wl_lock);
return 0;
@@ -294,7 +300,6 @@ int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
return -ENOMEM;
}
 
-   wrk->anchor = 1;
wrk->func = &wear_leveling_worker;
__schedule_ubi_work(ubi, wrk);
return 0;
diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 604772fc4a965..53f448e7433a9 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -1543,14 +1543,6 @@ int ubi_update_fastmap(struct ubi_device *ubi)
return 0;
}
 
-   ret = ubi_ensure_anchor_pebs(ubi);
-   if (ret) {
-   up_write(&ubi->fm_eba_sem);
-   up_write(&ubi->work_sem);
-   up_write(&ubi->fm_protect);
-   return ret;
-   }
-
new_fm = kzalloc(sizeof(*new_fm), GFP_KERNEL);
if (!new_fm) {
up_write(&ubi->fm_eba_sem);
@@ -1621,7 +1613,8 @@ int ubi_update_fastmap(struct ubi_device *ubi)
}
 
spin_lock(&ubi->wl_lock);
-   tmp_e = ubi_wl_get_fm_peb(ubi, 1);
+   tmp_e = ubi->fm_anchor;
+   ubi->fm_anchor = NULL;
spin_unlock(&ubi->wl_lock);
 
if (old_fm) {
@@ -1673,6 +1666,9 @@ out_unlock:
up_write(&ubi->work_sem);
up_write(&ubi->fm_protect);
kfree(old_fm);
+
+   ubi_ensure_anchor_pebs(ubi);
+
return ret;
 
 err:
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 721b6aa7936cf..a173eb707bddb 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -491,6 +491,8 @@ struct ubi_debug_info {
  * @fm_work: fastmap work queue
  * @fm_work_scheduled: non-zero if fastmap work was scheduled
  * @fast_attach: non-zero if UBI was attached by fastmap
+ * @fm_anchor: The next anchor PEB to use for fastmap
+ * @fm_do_produce_anchor: If true produce an anchor PEB in wl
  *
  * @used: RB-tree of used physical eraseblocks
  * 

[PATCH AUTOSEL 5.4 049/330] RDMA/i40iw: Fix potential use after free

2020-09-17 Thread Sasha Levin
From: Pan Bian 

[ Upstream commit da046d5f895fca18d63b15ac8faebd5bf784e23a ]

Release variable dst after logging dst->error to avoid possible use after
free.
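
A minimal user-space sketch of the ordering rule, assuming a hypothetical
refcounted object (struct toy_dst and the toy_* helpers are invented here
and are not the kernel dst API): everything the log message needs is read
before the reference is dropped, because the release may free the object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a dst entry. */
struct toy_dst {
	int refcnt;
	int error;
};

static void toy_dst_release(struct toy_dst *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0)
		free(d);	/* d must not be touched after this point */
}

/* Copy what the log message needs, then drop the reference. */
static int toy_report_and_release(struct toy_dst *d)
{
	int err = d->error;	/* last access to *d */

	toy_dst_release(d);
	return err;
}
```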

Link: https://lore.kernel.org/r/1573022651-37171-1-git-send-email-bianpan2...@163.com
Signed-off-by: Pan Bian 
Reviewed-by: Jason Gunthorpe 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Sasha Levin 
---
 drivers/infiniband/hw/i40iw/i40iw_cm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index b1df93b69df44..fa7a5ff498c73 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -2074,9 +2074,9 @@ static int i40iw_addr_resolve_neigh_ipv6(struct i40iw_device *iwdev,
dst = i40iw_get_dst_ipv6(&src_addr, &dst_addr);
if (!dst || dst->error) {
if (dst) {
-   dst_release(dst);
i40iw_pr_err("ip6_route_output returned dst->error = %d\n",
 dst->error);
+   dst_release(dst);
}
return rc;
}
-- 
2.25.1



[PATCH AUTOSEL 5.4 055/330] mmc: core: Fix size overflow for mmc partitions

2020-09-17 Thread Sasha Levin
From: Bradley Bolen 

[ Upstream commit f3d7c2292d104519195fdb11192daec13229c219 ]

With large eMMC cards, it is possible to create general purpose
partitions that are bigger than 4GB.  The size member of the mmc_part
struct is only an unsigned int which overflows for gp partitions larger
than 4GB.  Change this to a u64 to handle the overflow.
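
The overflow is easy to reproduce outside the kernel. The helpers below are
hypothetical (they are not functions from mmc.c); they only mirror the
"<< 19" conversion from 512 KiB units to bytes that the diff performs, once
with a 32-bit result and once with a 64-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size in 512 KiB units to bytes, 32-bit result. */
static uint32_t toy_gp_bytes_u32(uint32_t units)
{
	return units << 19;	/* wraps modulo 2^32 */
}

/* Same conversion with a 64-bit result, as after the patch. */
static uint64_t toy_gp_bytes_u64(uint64_t units)
{
	assert(units <= (UINT64_MAX >> 19));
	return units << 19;
}
```

With 8192 units (exactly 4 GiB) the 32-bit variant yields 0, while the
64-bit variant yields 4294967296.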

Signed-off-by: Bradley Bolen 
Signed-off-by: Ulf Hansson 
Signed-off-by: Sasha Levin 
---
 drivers/mmc/core/mmc.c   | 9 -
 include/linux/mmc/card.h | 2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index b7159e243323b..de14b5845f525 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -297,7 +297,7 @@ static void mmc_manage_enhanced_area(struct mmc_card *card, u8 *ext_csd)
}
 }
 
-static void mmc_part_add(struct mmc_card *card, unsigned int size,
+static void mmc_part_add(struct mmc_card *card, u64 size,
 unsigned int part_cfg, char *name, int idx, bool ro,
 int area_type)
 {
@@ -313,7 +313,7 @@ static void mmc_manage_gp_partitions(struct mmc_card *card, u8 *ext_csd)
 {
int idx;
u8 hc_erase_grp_sz, hc_wp_grp_sz;
-   unsigned int part_size;
+   u64 part_size;
 
/*
 * General purpose partition feature support --
@@ -343,8 +343,7 @@ static void mmc_manage_gp_partitions(struct mmc_card *card, u8 *ext_csd)
(ext_csd[EXT_CSD_GP_SIZE_MULT + idx * 3 + 1]
<< 8) +
ext_csd[EXT_CSD_GP_SIZE_MULT + idx * 3];
-   part_size *= (size_t)(hc_erase_grp_sz *
-   hc_wp_grp_sz);
+   part_size *= (hc_erase_grp_sz * hc_wp_grp_sz);
mmc_part_add(card, part_size << 19,
EXT_CSD_PART_CONFIG_ACC_GP0 + idx,
"gp%d", idx, false,
@@ -362,7 +361,7 @@ static void mmc_manage_gp_partitions(struct mmc_card *card, u8 *ext_csd)
 static int mmc_decode_ext_csd(struct mmc_card *card, u8 *ext_csd)
 {
int err = 0, idx;
-   unsigned int part_size;
+   u64 part_size;
struct device_node *np;
bool broken_hpi = false;
 
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index e459b38ef33cc..cf3780a6ccc4b 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -226,7 +226,7 @@ struct mmc_queue_req;
  * MMC Physical partitions
  */
 struct mmc_part {
-   unsigned int size;   /* partition size (in bytes) */
+   u64 size;   /* partition size (in bytes) */
unsigned int part_cfg;   /* partition type */
char name[MAX_MMC_PART_NAME_LEN];
bool force_ro;   /* to make boot parts RO by default */
-- 
2.25.1



[PATCH AUTOSEL 5.4 066/330] xtensa: fix system_call interaction with ptrace

2020-09-17 Thread Sasha Levin
From: Max Filippov 

[ Upstream commit 02ce94c229251555ac726ecfebe3458ef5905fa9 ]

Don't overwrite return value if system call was cancelled at entry by
ptrace. Return status code from do_syscall_trace_enter so that
pt_regs::syscall doesn't need to be changed to skip syscall.
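
The control flow can be modelled in plain C. This is a rough sketch under
stated assumptions, not the xtensa code: struct toy_regs, TOY_NO_SYSCALL and
the tracer callback are invented stand-ins for pt_regs, NO_SYSCALL and the
tracehook call, and the do_syscall_trace_leave() call from the diff is left
out. The point is that the entry hook returns a status, so a call cancelled
by the tracer is skipped without its injected return value being
overwritten.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NO_SYSCALL (-1L)	/* stand-in for NO_SYSCALL */

struct toy_regs {
	long syscall;	/* syscall number, or TOY_NO_SYSCALL */
	long retval;	/* stand-in for areg[2] */
};

/* Tracer hook: may edit regs; a nonzero return aborts the syscall. */
typedef int (*toy_tracer_t)(struct toy_regs *);

/* Returns 1 if the caller should dispatch the syscall, 0 to skip it. */
static int toy_trace_enter(struct toy_regs *regs, toy_tracer_t tracer)
{
	assert(regs != NULL);
	if (regs->syscall == TOY_NO_SYSCALL)
		regs->retval = -ENOSYS;

	if (tracer && tracer(regs)) {
		regs->retval = -ENOSYS;
		regs->syscall = TOY_NO_SYSCALL;
		return 0;
	}

	/* Tracer cancelled the call: keep whatever retval it stored. */
	if (regs->syscall == TOY_NO_SYSCALL)
		return 0;

	return 1;
}

/* Example tracer that cancels the call and injects its own result. */
static int toy_cancel_with_42(struct toy_regs *regs)
{
	regs->syscall = TOY_NO_SYSCALL;
	regs->retval = 42;
	return 0;
}
```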

Signed-off-by: Max Filippov 
Signed-off-by: Sasha Levin 
---
 arch/xtensa/kernel/entry.S  |  4 ++--
 arch/xtensa/kernel/ptrace.c | 18 --
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/xtensa/kernel/entry.S b/arch/xtensa/kernel/entry.S
index 59671603c9c62..1f07876ea2ed7 100644
--- a/arch/xtensa/kernel/entry.S
+++ b/arch/xtensa/kernel/entry.S
@@ -1897,6 +1897,7 @@ ENTRY(system_call)
 
mov a6, a2
call4   do_syscall_trace_enter
+   beqz a6, .Lsyscall_exit
l32i a7, a2, PT_SYSCALL
 
 1:
@@ -1911,8 +1912,6 @@ ENTRY(system_call)
 
addx4   a4, a7, a4
l32i a4, a4, 0
-   movi a5, sys_ni_syscall;
-   beq a4, a5, 1f
 
/* Load args: arg0 - arg5 are passed via regs. */
 
@@ -1932,6 +1931,7 @@ ENTRY(system_call)
 
s32i a6, a2, PT_AREG2
bnez a3, 1f
+.Lsyscall_exit:
abi_ret(4)
 
 1:
diff --git a/arch/xtensa/kernel/ptrace.c b/arch/xtensa/kernel/ptrace.c
index b964f0b2d8864..145742d70a9f2 100644
--- a/arch/xtensa/kernel/ptrace.c
+++ b/arch/xtensa/kernel/ptrace.c
@@ -542,14 +542,28 @@ long arch_ptrace(struct task_struct *child, long request,
return ret;
 }
 
-void do_syscall_trace_enter(struct pt_regs *regs)
+void do_syscall_trace_leave(struct pt_regs *regs);
+int do_syscall_trace_enter(struct pt_regs *regs)
 {
+   if (regs->syscall == NO_SYSCALL)
+   regs->areg[2] = -ENOSYS;
+
if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-   tracehook_report_syscall_entry(regs))
+   tracehook_report_syscall_entry(regs)) {
+   regs->areg[2] = -ENOSYS;
regs->syscall = NO_SYSCALL;
+   return 0;
+   }
+
+   if (regs->syscall == NO_SYSCALL) {
+   do_syscall_trace_leave(regs);
+   return 0;
+   }
 
if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
trace_sys_enter(regs, syscall_get_nr(current, regs));
+
+   return 1;
 }
 
 void do_syscall_trace_leave(struct pt_regs *regs)
-- 
2.25.1



[PATCH AUTOSEL 5.4 056/330] gfs2: clean up iopen glock mess in gfs2_create_inode

2020-09-17 Thread Sasha Levin
From: Bob Peterson 

[ Upstream commit 2c47c1be51fbded1f7baa2ceaed90f97932f79be ]

Before this patch, gfs2_create_inode had a use-after-free for the
iopen glock in some error paths because it did this:

	gfs2_glock_put(io_gl);
fail_gunlock2:
	if (io_gl)
		clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);

In some cases, the io_gl was used for create and only had one
reference, so the glock might be freed before the clear_bit().
This patch tries to straighten it out by only jumping to the
error paths where iopen is properly set, and moving the
gfs2_glock_put after the clear_bit.

Signed-off-by: Bob Peterson 
Signed-off-by: Andreas Gruenbacher 
Signed-off-by: Sasha Levin 
---
 fs/gfs2/inode.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 8466166f22e3d..988bb7b17ed8f 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -712,7 +712,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
 
error = gfs2_trans_begin(sdp, blocks, 0);
if (error)
-   goto fail_gunlock2;
+   goto fail_free_inode;
 
if (blocks > 1) {
ip->i_eattr = ip->i_no_addr + 1;
@@ -723,7 +723,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
 
error = gfs2_glock_get(sdp, ip->i_no_addr, &gfs2_iopen_glops, CREATE, &io_gl);
if (error)
-   goto fail_gunlock2;
+   goto fail_free_inode;
 
BUG_ON(test_and_set_bit(GLF_INODE_CREATING, &io_gl->gl_flags));
 
@@ -732,7 +732,6 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
goto fail_gunlock2;
 
glock_set_object(ip->i_iopen_gh.gh_gl, ip);
-   gfs2_glock_put(io_gl);
gfs2_set_iop(inode);
insert_inode_hash(inode);
 
@@ -765,6 +764,8 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
 
mark_inode_dirty(inode);
d_instantiate(dentry, inode);
+   /* After instantiate, errors should result in evict which will destroy
+* both inode and iopen glocks properly. */
if (file) {
file->f_mode |= FMODE_CREATED;
error = finish_open(file, dentry, gfs2_open_common);
@@ -772,15 +773,15 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
gfs2_glock_dq_uninit(ghs);
gfs2_glock_dq_uninit(ghs + 1);
clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);
+   gfs2_glock_put(io_gl);
return error;
 
 fail_gunlock3:
glock_clear_object(io_gl, ip);
gfs2_glock_dq_uninit(&ip->i_iopen_gh);
-   gfs2_glock_put(io_gl);
 fail_gunlock2:
-   if (io_gl)
-   clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);
+   clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);
+   gfs2_glock_put(io_gl);
 fail_free_inode:
if (ip->i_gl) {
glock_clear_object(ip->i_gl, ip);
-- 
2.25.1



[PATCH AUTOSEL 5.4 058/330] mt76: do not use devm API for led classdev

2020-09-17 Thread Sasha Levin
From: Felix Fietkau 

[ Upstream commit 36f7e2b2bb1de86f0072cd49ca93d82b9e8fd894 ]

With the devm API, the unregister happens after the device cleanup is done,
after which the struct mt76_dev which contains the led_cdev has already been
freed. This leads to a use-after-free bug that can crash the system.

Signed-off-by: Felix Fietkau 
Signed-off-by: Sasha Levin 
---
 drivers/net/wireless/mediatek/mt76/mac80211.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
index 1a2c143b34d01..7be5806a1c398 100644
--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
+++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
@@ -105,7 +105,15 @@ static int mt76_led_init(struct mt76_dev *dev)
dev->led_al = of_property_read_bool(np, "led-active-low");
}
 
-   return devm_led_classdev_register(dev->dev, &dev->led_cdev);
+   return led_classdev_register(dev->dev, &dev->led_cdev);
+}
+
+static void mt76_led_cleanup(struct mt76_dev *dev)
+{
+   if (!dev->led_cdev.brightness_set && !dev->led_cdev.blink_set)
+   return;
+
+   led_classdev_unregister(&dev->led_cdev);
 }
 
 static void mt76_init_stream_cap(struct mt76_dev *dev,
@@ -360,6 +368,7 @@ void mt76_unregister_device(struct mt76_dev *dev)
 {
struct ieee80211_hw *hw = dev->hw;
 
+   mt76_led_cleanup(dev);
mt76_tx_status_check(dev, NULL, true);
ieee80211_unregister_hw(hw);
 }
-- 
2.25.1



[PATCH AUTOSEL 5.4 077/330] ALSA: hda: enable regmap internal locking

2020-09-17 Thread Sasha Levin
From: Kai Vehmanen 

[ Upstream commit 8e85def5723eccea30ebf22645673692ab8cb3e2 ]

This reverts commit 42ec336f1f9d ("ALSA: hda: Disable regmap
internal locking").

Without regmap locking, there is a race between snd_hda_codec_amp_init()
and PM callbacks issuing regcache_sync(). This was caught by the
following kernel warning trace:

<4> [358.080081] WARNING: CPU: 2 PID: 4157 at drivers/base/regmap/regcache.c:498 regcache_cache_only+0xf5/0x130
[...]
<4> [358.080148] Call Trace:
<4> [358.080158]  snd_hda_codec_amp_init+0x4e/0x100 [snd_hda_codec]
<4> [358.080169]  snd_hda_codec_amp_init_stereo+0x40/0x80 [snd_hda_codec]

Suggested-by: Takashi Iwai 
BugLink: https://gitlab.freedesktop.org/drm/intel/issues/592
Signed-off-by: Kai Vehmanen 
Link: https://lore.kernel.org/r/20200108180856.5194-1-kai.vehma...@linux.intel.com
Signed-off-by: Takashi Iwai 
Signed-off-by: Sasha Levin 
---
 sound/hda/hdac_regmap.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c
index 2596a881186fa..49780399c2849 100644
--- a/sound/hda/hdac_regmap.c
+++ b/sound/hda/hdac_regmap.c
@@ -363,7 +363,6 @@ static const struct regmap_config hda_regmap_cfg = {
.reg_write = hda_reg_write,
.use_single_read = true,
.use_single_write = true,
-   .disable_locking = true,
 };
 
 /**
-- 
2.25.1



[PATCH AUTOSEL 5.4 059/330] mt76: add missing locking around ampdu action

2020-09-17 Thread Sasha Levin
From: Felix Fietkau 

[ Upstream commit 1a817fa73c3b27a593aadf0029de24db1bbc1a3e ]

This is needed primarily to avoid races in dealing with rx aggregation
related data structures.

Signed-off-by: Felix Fietkau 
Signed-off-by: Sasha Levin 
---
 drivers/net/wireless/mediatek/mt76/mt7603/main.c  | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7615/main.c  | 2 ++
 drivers/net/wireless/mediatek/mt76/mt76x02_util.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/main.c b/drivers/net/wireless/mediatek/mt76/mt7603/main.c
index 25d5b1608bc91..0a5695c3d9241 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7603/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7603/main.c
@@ -561,6 +561,7 @@ mt7603_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 
mtxq = (struct mt76_txq *)txq->drv_priv;
 
+   mutex_lock(&dev->mt76.mutex);
switch (action) {
case IEEE80211_AMPDU_RX_START:
mt76_rx_aggr_start(&dev->mt76, &msta->wcid, tid, ssn,
@@ -590,6 +591,7 @@ mt7603_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
break;
}
+   mutex_unlock(&dev->mt76.mutex);
 
return 0;
 }
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/main.c b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
index 87c748715b5d7..38183aef0eb92 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
@@ -455,6 +455,7 @@ mt7615_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 
mtxq = (struct mt76_txq *)txq->drv_priv;
 
+   mutex_lock(&dev->mt76.mutex);
switch (action) {
case IEEE80211_AMPDU_RX_START:
mt76_rx_aggr_start(&dev->mt76, &msta->wcid, tid, ssn,
@@ -485,6 +486,7 @@ mt7615_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
break;
}
+   mutex_unlock(&dev->mt76.mutex);
 
return 0;
 }
diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_util.c b/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
index aec73a0295e86..de0d6f21c621c 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
+++ b/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
@@ -371,6 +371,7 @@ int mt76x02_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 
mtxq = (struct mt76_txq *)txq->drv_priv;
 
+   mutex_lock(&dev->mt76.mutex);
switch (action) {
case IEEE80211_AMPDU_RX_START:
mt76_rx_aggr_start(&dev->mt76, &msta->wcid, tid,
@@ -400,6 +401,7 @@ int mt76x02_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
break;
}
+   mutex_unlock(&dev->mt76.mutex);
 
return 0;
 }
-- 
2.25.1



Re: [PATCH] kprobes: Do not disarm disabled ftrace kprobe

2020-09-17 Thread Steven Rostedt
On Fri, 18 Sep 2020 11:01:22 +0900
Masami Hiramatsu  wrote:

> Hi Steve,
> 
> Ah, this seems to fix same issue which I sent.
> 
> https://lkml.kernel.org/r/159888672694.1411785.5987998076694782591.stgit@devnote2
> 
> Could you confirm it?

Ah, OK. I'm going through my backlog (which was created by Linux
Plumbers, and then me going on vacation for 10 days) and I'm only at
Aug 19th :-p

If that patch fixes the issue, I'll drop mine in favor of yours.

Thanks, and sorry for the noise.

-- Steve


[PATCH AUTOSEL 5.4 063/330] CIFS: Properly process SMB3 lease breaks

2020-09-17 Thread Sasha Levin
From: Pavel Shilovsky 

[ Upstream commit 9bd4540836684013aaad6070a65d6fcdd9006625 ]

Currently we don't assume that a server may break a lease
from RWH to RW, which causes us to set a wrong lease state
on a file and thus to mistakenly flush data and byte-range
locks and purge cached data on the client. This leads to
performance degradation because subsequent IOs go directly
to the server.

Fix this by propagating new lease state and epoch values
to the oplock break handler through the cifsFileInfo structure
and removing the use of cifsInodeInfo flags for that. This
avoids some races between several lease/oplock breaks
using those flags in parallel.

Signed-off-by: Pavel Shilovsky 
Signed-off-by: Steve French 
Signed-off-by: Sasha Levin 
---
 fs/cifs/cifsglob.h |  9 ++---
 fs/cifs/file.c | 10 +++---
 fs/cifs/misc.c | 17 +++--
 fs/cifs/smb1ops.c  |  8 +++-
 fs/cifs/smb2misc.c | 32 +++-
 fs/cifs/smb2ops.c  | 44 ++--
 fs/cifs/smb2pdu.h  |  2 +-
 7 files changed, 57 insertions(+), 65 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index f9cbdfc1591b1..b16c994414ab0 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -268,8 +268,9 @@ struct smb_version_operations {
int (*check_message)(char *, unsigned int, struct TCP_Server_Info *);
bool (*is_oplock_break)(char *, struct TCP_Server_Info *);
int (*handle_cancelled_mid)(char *, struct TCP_Server_Info *);
-   void (*downgrade_oplock)(struct TCP_Server_Info *,
-   struct cifsInodeInfo *, bool);
+   void (*downgrade_oplock)(struct TCP_Server_Info *server,
+struct cifsInodeInfo *cinode, __u32 oplock,
+unsigned int epoch, bool *purge_cache);
/* process transaction2 response */
bool (*check_trans2)(struct mid_q_entry *, struct TCP_Server_Info *,
 char *, int);
@@ -1261,6 +1262,8 @@ struct cifsFileInfo {
unsigned int f_flags;
bool invalidHandle:1;   /* file closed via session abend */
bool oplock_break_cancelled:1;
+   unsigned int oplock_epoch; /* epoch from the lease break */
+   __u32 oplock_level; /* oplock/lease level from the lease break */
int count;
spinlock_t file_info_lock; /* protects four flag/count fields above */
struct mutex fh_mutex; /* prevents reopen race after dead ses*/
@@ -1408,7 +1411,7 @@ struct cifsInodeInfo {
unsigned int epoch; /* used to track lease state changes */
 #define CIFS_INODE_PENDING_OPLOCK_BREAK   (0) /* oplock break in progress */
 #define CIFS_INODE_PENDING_WRITERS   (1) /* Writes in progress */
-#define CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2 (2) /* Downgrade oplock to L2 */
+#define CIFS_INODE_FLAG_UNUSED   (2) /* Unused flag */
 #define CIFS_INO_DELETE_PENDING  (3) /* delete pending on server */
 #define CIFS_INO_INVALID_MAPPING (4) /* pagecache is invalid */
 #define CIFS_INO_LOCK (5) /* lock bit for synchronization */
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4959dbe740f71..14ae341755d47 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -4675,12 +4675,13 @@ void cifs_oplock_break(struct work_struct *work)
struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
struct TCP_Server_Info *server = tcon->ses->server;
int rc = 0;
+   bool purge_cache = false;
 
wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
TASK_UNINTERRUPTIBLE);
 
-   server->ops->downgrade_oplock(server, cinode,
-   test_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2, &cinode->flags));
+   server->ops->downgrade_oplock(server, cinode, cfile->oplock_level,
+ cfile->oplock_epoch, &purge_cache);
 
if (!CIFS_CACHE_WRITE(cinode) && CIFS_CACHE_READ(cinode) &&
cifs_has_mand_locks(cinode)) {
@@ -4695,18 +4696,21 @@ void cifs_oplock_break(struct work_struct *work)
else
break_lease(inode, O_WRONLY);
rc = filemap_fdatawrite(inode->i_mapping);
-   if (!CIFS_CACHE_READ(cinode)) {
+   if (!CIFS_CACHE_READ(cinode) || purge_cache) {
rc = filemap_fdatawait(inode->i_mapping);
mapping_set_error(inode->i_mapping, rc);
cifs_zap_mapping(inode);
}
cifs_dbg(FYI, "Oplock flush inode %p rc %d\n", inode, rc);
+   if (CIFS_CACHE_WRITE(cinode))
+   goto oplock_break_ack;
}
 
rc = cifs_push_locks(cfile);
if (rc)
cifs_dbg(VFS, "Push locks rc = %d\n", rc);
 
+oplock_break_ack:
/*
 * releasing stale oplock after recent reconnect of smb session using

[PATCH AUTOSEL 5.4 082/330] ipv6_route_seq_next should increase position index

2020-09-17 Thread Sasha Levin
From: Vasily Averin 

[ Upstream commit 4fc427e0515811250647d44de38d87d7b0e0790f ]

If the seq_file .next function does not change the position index,
a read after some lseek can generate unexpected output.
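
The contract can be illustrated with a small user-space iterator. This is
a sketch only: struct toy_iter and toy_next() are hypothetical and merely
imitate the shape of a seq_file .next callback over an array. The position
index is advanced on every call, including the one that reaches the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array-backed iterator imitating a ->next() callback. */
struct toy_iter {
	const int *items;
	size_t count;
};

/* Returns the next element, or NULL at the end; *pos always advances. */
static const int *toy_next(const struct toy_iter *it, long long *pos)
{
	assert(*pos >= 0);
	++*pos;		/* advance unconditionally, even at the end */
	if ((size_t)*pos >= it->count)
		return NULL;
	return &it->items[*pos];
}
```

If the increment were skipped on the final call, a caller resuming from the
stored position would ask for the last element again instead of seeing the
end of the sequence.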

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 net/ipv6/ip6_fib.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 7a0c877ca306c..7662de1bd7fd2 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -2474,14 +2474,13 @@ static void *ipv6_route_seq_next(struct seq_file *seq, void *v, loff_t *pos)
struct net *net = seq_file_net(seq);
struct ipv6_route_iter *iter = seq->private;
 
+   ++(*pos);
if (!v)
goto iter_table;
 
n = rcu_dereference_bh(((struct fib6_info *)v)->fib6_next);
-   if (n) {
-   ++*pos;
+   if (n)
return n;
-   }
 
 iter_table:
ipv6_route_check_sernum(iter);
@@ -2489,8 +2488,6 @@ iter_table:
r = fib6_walk_continue(&iter->w);
spin_unlock_bh(&iter->tbl->tb6_lock);
if (r > 0) {
-   if (v)
-   ++*pos;
return iter->w.leaf;
} else if (r < 0) {
fib6_walker_unlink(net, &iter->w);
-- 
2.25.1


