Re: Possible null pointer dereference in rcar-dmac.ko

2017-08-21 Thread Kuninori Morimoto

Hi Laurent

> I don't think this fully fixes the problem, as the rcar_dmac_isr_error() IRQ 
> handler is still registered before all this. Furthermore, at least some of 
> the 
> initialization at the end of rcar_dmac_chan_probe() has to be moved before 
> the 
> rcar_dmac_isr_channel() IRQ handler registration.
> 
> Let's not commit a quick hack but fix the problem correctly; we should ensure 
> that all the initialization needed by IRQ handlers is performed before they 
> get registered.
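The ordering rule Laurent describes can be sketched outside the kernel: any data a handler dereferences must be initialized before the handler becomes callable, because an interrupt can fire the moment registration returns. The names below (register_irq, chan_ctx, probe_ok) are illustrative stand-ins, not the actual devm_request_irq() API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: "registering" makes the handler immediately callable, just as
 * an IRQ can fire as soon as devm_request_irq() returns. */
struct chan_ctx { const char *dev_name; };

struct irq_slot {
        bool (*handler)(struct chan_ctx *);
        struct chan_ctx *ctx;
};

static void register_irq(struct irq_slot *slot,
                         bool (*handler)(struct chan_ctx *),
                         struct chan_ctx *ctx)
{
        slot->handler = handler;
        slot->ctx = ctx;
}

/* The handler dereferences ctx->dev_name, so it must be set beforehand. */
static bool chan_isr(struct chan_ctx *ctx)
{
        return ctx->dev_name != NULL;
}

/* Correct probe order: initialize everything the handler needs, then
 * register.  An IRQ firing right after registration finds valid data. */
static bool probe_ok(struct irq_slot *slot, struct chan_ctx *ctx)
{
        ctx->dev_name = "rcar-dmac";       /* init first */
        register_irq(slot, chan_isr, ctx); /* register last */
        return slot->handler(slot->ctx);   /* simulate an immediate IRQ */
}
```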

Yeah, indeed.
We need a v2 patch.

Best regards
---
Kuninori Morimoto


Re: Do we really need d_weak_revalidate???

2017-08-21 Thread NeilBrown
On Fri, Aug 18 2017, Ian Kent wrote:

> On 18/08/17 13:24, NeilBrown wrote:
>> On Thu, Aug 17 2017, Ian Kent wrote:
>> 
>>> On 16/08/17 19:34, Jeff Layton wrote:
 On Wed, 2017-08-16 at 12:43 +1000, NeilBrown wrote:
> On Mon, Aug 14 2017, Jeff Layton wrote:
>
>> On Mon, 2017-08-14 at 09:36 +1000, NeilBrown wrote:
>>> On Fri, Aug 11 2017, Jeff Layton wrote:
>>>
 On Fri, 2017-08-11 at 05:55 +, Trond Myklebust wrote:
> On Fri, 2017-08-11 at 14:31 +1000, NeilBrown wrote:
>> Funny story.  4.5 years ago we discarded the FS_REVAL_DOT superblock
>> flag and introduced the d_weak_revalidate dentry operation instead.
>> We duly removed the flag from NFS superblocks and NFSv4 superblocks,
>> and added the new dentry operation to NFS dentries  but not to
>> NFSv4
>> dentries.
>>
>> And nobody noticed.
>>
>> Until today.
>>
>> A customer reports a situation where mount(,MS_REMOUNT,..) on an
>> NFS
>> filesystem hangs because the network has been deconfigured.  This
>> makes
>> perfect sense and I suggested a code change to fix the problem.
>> However when a colleague was trying to reproduce the problem to
>> validate
>> the fix, he couldn't.  Then nor could I.
>>
>> The problem is trivially reproducible with NFSv3, and not at all with
>> NFSv4.  The reason is the missing d_weak_revalidate.
>>
>> We could simply add d_weak_revalidate for NFSv4, but given that it
>> has been missing for 4.5 years, and the only time anyone noticed was
>> when the omission resulted in a better user experience, I do wonder
>> if
>> we need to.  Can we just discard d_weak_revalidate?  What purpose
>> does
>> it serve?  I couldn't find one.
>>
>> Thanks,
>> NeilBrown
>>
>> For reference, see
>> Commit: ecf3d1f1aa74 ("vfs: kill FS_REVAL_DOT by adding a
>> d_weak_revalidate dentry op")
>>
>>
>>
>> To reproduce the problem at home, on a system that uses systemd:
>> 1/ place (or find) a filesystem image in a file on an NFS filesystem.
>> 2/ mount the nfs filesystem with "noac" - choose v3 or v4
>> 3/ loop-mount the filesystem image read-only somewhere
>> 4/ reboot
>>
>> If you choose v4, the reboot will succeed, possibly after a 90-second
>> timeout.
>> If you choose v3, the reboot will hang indefinitely in systemd-
>> shutdown while
>> remounting the nfs filesystem read-only.
>>
>> If you don't use "noac" it can still hang, but only if something
>> slows
>> down the reboot enough that attributes have timed out by the time
>> that
>> systemd-shutdown runs.  This happens for our customer.
>>
>> If the loop-mounted filesystem is not read-only, you get other
>> problems.
>>
>> We really want systemd to figure out that the loop-mount needs to be
>> unmounted first.  I have ideas concerning that, but it is messy.  But
>> that isn't the only bug here.
>
> The main purpose of d_weak_revalidate() was to catch the issues that
> arise when someone changes the contents of the current working
> directory or its parent on the server. Since '.' and '..' are treated
> specially in the lookup code, they would not be revalidated without
> special treatment. That leads to issues when looking up files as
> ./ or ../, since the client won't detect that its
> dcache is stale until it tries to use the cached dentry+inode.
>
> The one thing that has changed since its introduction is, I believe,
> the ESTALE handling in the VFS layer. That might fix a lot of the
> dcache lookup bugs that were previously handled by 
> d_weak_revalidate().
> I haven't done an audit to figure out if it actually can handle all of
> them.
>

 It may also be related to 8033426e6bdb2690d302872ac1e1fadaec1a5581:

 vfs: allow umount to handle mountpoints without revalidating them
>>>
>>> You say in the comment for that commit:
>>>
>>>  but there
>>> are cases where we do want to revalidate the root of the fs.
>>>
>>> Do you happen to remember what those cases are?
>>>
>>
>> Not exactly, but I _think_ I might have been assuming that we needed to
>> ensure that the inode attrs on the root were up to date after the
>> pathwalk.
>>
>> I think that was probably wrong. d_revalidate is really intended to
>> ensure that the dentry in question still points to the same inode. In
>> the case of the root of the mount 

Re: [PATCH v14 4/5] mm: support reporting free page blocks

2017-08-21 Thread Michal Hocko
On Fri 18-08-17 20:23:05, Michael S. Tsirkin wrote:
> On Thu, Aug 17, 2017 at 11:26:55AM +0800, Wei Wang wrote:
[...]
> > +void walk_free_mem_block(void *opaque1,
> > +unsigned int min_order,
> > +void (*visit)(void *opaque2,
> 
> You can just avoid opaque2 completely I think, then opaque1 can
> be renamed opaque.
> 
> > +  unsigned long pfn,
> > +  unsigned long nr_pages))
> > +{
> > +   struct zone *zone;
> > +   struct page *page;
> > +   struct list_head *list;
> > +   unsigned int order;
> > +   enum migratetype mt;
> > +   unsigned long pfn, flags;
> > +
> > +   for_each_populated_zone(zone) {
> > +   for (order = MAX_ORDER - 1;
> > +order < MAX_ORDER && order >= min_order; order--) {
> > +   for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> > > +   spin_lock_irqsave(&zone->lock, flags);
> > > +   list = &zone->free_area[order].free_list[mt];
> > +   list_for_each_entry(page, list, lru) {
> > +   pfn = page_to_pfn(page);
> > +   visit(opaque1, pfn, 1 << order);
> 
> My only concern here is inability of callback to
> 1. break out of list
> 2. remove page from the list

As I've said before this has to be a read only API. You cannot simply
fiddle with the page allocator internals under its feet.

> So I would make the callback bool, and I would use
> list_for_each_entry_safe.

If a bool would tell to break out of the loop then I agree. This sounds
useful.
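The suggestion above can be sketched in plain C: a callback that returns bool lets the caller stop the walk without exposing allocator internals. The names here (walk_blocks, free_block) are illustrative stand-ins, not the actual mm/page_alloc.c API.

```c
#include <stdbool.h>
#include <stddef.h>

/* A simplified free list: each block covers nr_pages starting at pfn. */
struct free_block {
        unsigned long pfn;
        unsigned long nr_pages;
        struct free_block *next;
};

/* Walk the list, stopping as soon as the callback returns false.
 * Returns the number of blocks visited. */
static unsigned int walk_blocks(struct free_block *head, void *opaque,
                                bool (*visit)(void *opaque, unsigned long pfn,
                                              unsigned long nr_pages))
{
        unsigned int visited = 0;

        for (struct free_block *b = head; b; b = b->next) {
                visited++;
                if (!visit(opaque, b->pfn, b->nr_pages))
                        break;
        }
        return visited;
}

/* Example callback: stop once a fixed budget of pages has been seen. */
struct budget { unsigned long remaining; };

static bool take_until_budget(void *opaque, unsigned long pfn,
                              unsigned long nr_pages)
{
        struct budget *bg = opaque;

        (void)pfn;
        if (nr_pages >= bg->remaining) {
                bg->remaining = 0;
                return false;   /* budget met: break out of the walk */
        }
        bg->remaining -= nr_pages;
        return true;
}
```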
-- 
Michal Hocko
SUSE Labs


Re: [PATCH v2 00/20] Speculative page faults

2017-08-21 Thread Anshuman Khandual
On 08/18/2017 03:34 AM, Laurent Dufour wrote:
> This is a port on kernel 4.13 of the work done by Peter Zijlstra to
> handle page fault without holding the mm semaphore [1].
> 
> The idea is to try to handle user space page faults without holding the
> mmap_sem. This should allow better concurrency for massively threaded
> processes since the page fault handler will not wait for other threads' memory
> layout changes to be done, assuming that this change is done in another part
> of the process's memory space. This type of page fault is named a speculative
> page fault. If the speculative page fault fails because a concurrent change is
> detected or because the underlying PMD or PTE tables are not yet allocated, it
> aborts and a classic page fault is then tried.
> 
> The speculative page fault (SPF) has to look for the VMA matching the fault
> address without holding the mmap_sem, so the VMA list is now managed using
> SRCU allowing lockless walking. The only impact would be the deferred file
> dereferencing in the case of a file mapping, since the file pointer is
> released once the SRCU cleaning is done.  This patch relies on the change
> done recently by Paul McKenney in SRCU which now runs a callback per CPU
> instead of per SRCU structure [1].
> 
> The VMA's attributes checked during the speculative page fault processing
> have to be protected against parallel changes. This is done by using a per
> VMA sequence lock. This sequence lock allows the speculative page fault
> handler to fast check for parallel changes in progress and to abort the
> speculative page fault in that case.
> 
> Once the VMA is found, the speculative page fault handler would check for
> the VMA's attributes to verify that the page fault has to be handled
> correctly or not. Thus the VMA is protected through a sequence lock which
> allows fast detection of concurrent VMA changes. If such a change is
> detected, the speculative page fault is aborted and a *classic* page fault
> is tried.  VMA sequence locks are added when VMA attributes which are
> checked during the page fault are modified.
> 
> When the PTE is fetched, the VMA is checked to see if it has been changed,
> so once the page table is locked, the VMA is valid, so any other changes
> leading to touching this PTE will need to lock the page table, so no
> parallel change is possible at this time.
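The per-VMA sequence lock described above can be reduced to a minimal userspace sketch: the writer bumps a counter to an odd value while modifying VMA attributes and back to even when done; the speculative reader samples the counter, reads the attributes, then rechecks. The names (vma_seq, spf_read_begin, spf_read_retry) are illustrative, not the actual kernel API.

```c
#include <stdbool.h>

/* Minimal model of a seqcount protecting VMA attributes. */
struct vma_seq {
        unsigned seq;                     /* odd while a write is in flight */
        unsigned long vm_start, vm_end;   /* the protected attributes */
};

static void vma_write_begin(struct vma_seq *v) { v->seq++; /* now odd  */ }
static void vma_write_end(struct vma_seq *v)   { v->seq++; /* even again */ }

/* Speculative reader: sample the counter before reading the attributes. */
static unsigned spf_read_begin(const struct vma_seq *v) { return v->seq; }

/* Returns true if a concurrent modification was (or may have been) in
 * progress, in which case the speculative fault must be aborted and a
 * classic page fault tried instead. */
static bool spf_read_retry(const struct vma_seq *v, unsigned start)
{
        return (start & 1) || v->seq != start;
}
```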
> 
> Compared to the Peter's initial work, this series introduces a spin_trylock
> when dealing with speculative page fault. This is required to avoid dead
> lock when handling a page fault while a TLB invalidate is requested by an
> other CPU holding the PTE. Another change due to a lock dependency issue
> with mapping->i_mmap_rwsem.
> 
> In addition some VMA field values which are used once the PTE is unlocked
> at the end the page fault path are saved into the vm_fault structure to
> used the values matching the VMA at the time the PTE was locked.
> 
> This series builds on top of v4.13-rc5 and is functional on x86 and
> PowerPC.
> 
> Tests have been made using a large commercial in-memory database on a
> PowerPC system with 752 CPUs using RFC v5. The results are very encouraging
> since the loading of the 2TB database was faster by 14% with the
> speculative page fault.
> 

You specifically mention loading because most of the page faults will
happen at that time and then the working set will settle down with
very few page faults thereafter? That means unless there is
another wave of page faults we won't notice a performance improvement
during the runtime.

> Using the ebizzy test [3], which spawns a lot of threads, the results are good
> when running on both a large and a small system. When using kernbench, the

The performance improvements are greater as there is a lot of creation
and destruction of anon mappings which generates constant flow of page
faults to be handled.

> results are quite similar, which is expected as not so many multi-threaded
> processes are involved. But there is no performance degradation either,
> which is good.

If we compile with 'make -j N' there would be a lot of threads, but I
guess the problem is that SPF does not support handling file mappings IIUC,
which limits the performance improvement for some workloads.

> 
> --
> Benchmarks results
> 
> Note these test have been made on top of 4.13-rc3 with the following patch
> from Paul McKenney applied: 
>  "srcu: Provide ordering for CPU not involved in grace period" [5]

Is this patch an improvement for SRCU, which we are using for walking VMAs?

> 
> Ebizzy:
> ---
> The test counts the number of records per second it can manage; higher is
> better. I ran it like this: 'ebizzy -mTRp'. To get consistent
> results I repeated the test 100 times and measured the average result, mean
> deviation, max and min.
> 
> - 16 CPUs x86 VM
> Records/s         4.13-rc5    4.13-rc5-spf
> Average           11350.29    21760.36
> Mean deviation      396.56      881.40
> Max               13773 

Re: [PATCH 1/2] pwm: tiehrpwm: fix runtime pm imbalance at unbind

2017-08-21 Thread Thierry Reding
On Thu, Jul 20, 2017 at 12:48:16PM +0200, Johan Hovold wrote:
> Remove unbalanced RPM put at driver unbind which resulted in a negative
> usage count.
> 
> Fixes: 19891b20e7c2 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
> Signed-off-by: Johan Hovold 
> ---
>  drivers/pwm/pwm-tiehrpwm.c | 1 -
>  1 file changed, 1 deletion(-)

Both patches applied to for-4.14/drivers, thanks.

Thierry


signature.asc
Description: PGP signature


Re: [PATCH v14 4/5] mm: support reporting free page blocks

2017-08-21 Thread Wei Wang

On 08/18/2017 09:46 PM, Michal Hocko wrote:

On Thu 17-08-17 11:26:55, Wei Wang wrote:

This patch adds support to walk through the free page blocks in the
system and report them via a callback function. Some page blocks may
leave the free list after zone->lock is released, so it is the caller's
responsibility to either detect or prevent the use of such pages.

This could use more details, to be honest, especially the usecase you are
going to use this for. This will help us to understand the motivation
in the future when the current user might be gone and new ones largely
diverge into a different usage. This wouldn't be the first time I have seen
something like that.


OK, I will add more details here about how it's used to accelerate live 
migration.



Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michal Hocko 
Cc: Michael S. Tsirkin 
---
  include/linux/mm.h |  6 ++
  mm/page_alloc.c| 44 
  2 files changed, 50 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5..cd29b9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1835,6 +1835,12 @@ extern void free_area_init_node(int nid, unsigned long * 
zones_size,
unsigned long zone_start_pfn, unsigned long *zholes_size);
  extern void free_initmem(void);
  
+extern void walk_free_mem_block(void *opaque1,

+   unsigned int min_order,
+   void (*visit)(void *opaque2,
+ unsigned long pfn,
+ unsigned long nr_pages));
+
  /*
   * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
   * into the buddy system. The freed pages will be poisoned with pattern
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d00f74..a721a35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4762,6 +4762,50 @@ void show_free_areas(unsigned int filter, nodemask_t 
*nodemask)
show_swap_cache_info();
  }
  
+/**

+ * walk_free_mem_block - Walk through the free page blocks in the system
+ * @opaque1: the context passed from the caller
+ * @min_order: the minimum order of free lists to check
+ * @visit: the callback function given by the caller

The original suggestion for using visit was motivated by a visit design
pattern but I can see how this can be confusing. Maybe a more explicit
name would be better. What about report_free_range?



I'm afraid that name would be too long to fit in nicely.
How about simply naming it "report"?





+ *
+ * The function is used to walk through the free page blocks in the system,
+ * and each free page block is reported to the caller via the @visit callback.
+ * Please note:
+ * 1) The function is used to report hints of free pages, so the caller should
+ * not use those reported pages after the callback returns.
+ * 2) The callback is invoked with the zone->lock being held, so it should not
+ * block and should finish as soon as possible.

I think that the explicit note about zone->lock is not really needed. This
can change in future and I would even bet that somebody might rely on
the lock being held for some purpose and silently get broken with the
change. Instead I would much rather see something like the following:
"
Please note that there are no locking guarantees for the callback


Just a little confused with this one:

The callback is invoked within zone->lock, why would we claim it "no
locking guarantees for the callback"?


and
that the reported pfn range might be freed or disappear after the
callback returns so the caller has to be very careful how it is used.

The callback itself must not sleep or perform any operations which would
require any memory allocations directly (not even GFP_NOWAIT/GFP_ATOMIC)
or via any lock dependency. It is generally advisable to implement
the callback as simple as possible and defer any heavy lifting to a
different context.

There is no guarantee that each free range will be reported only once
during one walk_free_mem_block invocation.

pfn_to_page on the given range is strongly discouraged and if there is
an absolute need for that make sure to contact MM people to discuss
potential problems.

The function itself might sleep so it cannot be called from atomic
contexts.

In general low orders tend to be very volatile and so it makes more
sense to query larger ones for various optimizations which like
ballooning etc... This will reduce the overhead as well.
"


I think it looks quite comprehensive. Thanks.
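The constraints Michal lists (no sleeping, no allocation, hints may vanish, heavy lifting deferred) suggest a callback that only copies pfn ranges into a buffer the caller preallocated before starting the walk. The names below (free_hint, report_free_range, hint_buf) are illustrative, not the proposed kernel API.

```c
#include <stddef.h>

/* A recorded hint: nr_pages free pages starting at pfn.  These are only
 * hints; the pages may be gone by the time the caller looks at them. */
struct free_hint {
        unsigned long pfn, nr_pages;
};

/* Fixed-size buffer allocated by the caller *before* the walk, so the
 * callback never allocates (not even GFP_ATOMIC) or sleeps. */
struct hint_buf {
        struct free_hint *hints;
        size_t capacity, used, dropped;
};

static void report_free_range(void *opaque, unsigned long pfn,
                              unsigned long nr_pages)
{
        struct hint_buf *buf = opaque;

        if (buf->used < buf->capacity)
                buf->hints[buf->used++] =
                        (struct free_hint){ pfn, nr_pages };
        else
                buf->dropped++;   /* hints only, so losing some is fine */
}
```

The caller would then process `buf->hints` in its own context, where sleeping and allocation are allowed.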


Best,
Wei


[PATCH] drm: mxsfb: constify drm_simple_display_pipe_funcs

2017-08-21 Thread Arvind Yadav
drm_simple_display_pipe_funcs are not supposed to change at runtime.
All functions working with drm_simple_display_pipe_funcs provided
by  work with const
drm_simple_display_pipe_funcs. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 drivers/gpu/drm/mxsfb/mxsfb_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.c 
b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
index d1b9c34..13e7ad8 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_drv.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
@@ -130,7 +130,7 @@ static int mxsfb_pipe_prepare_fb(struct 
drm_simple_display_pipe *pipe,
return drm_fb_cma_prepare_fb(&pipe->plane, plane_state);
 }
 
-static struct drm_simple_display_pipe_funcs mxsfb_funcs = {
+static const struct drm_simple_display_pipe_funcs mxsfb_funcs = {
.enable = mxsfb_pipe_enable,
.disable= mxsfb_pipe_disable,
.update = mxsfb_pipe_update,
-- 
1.9.1



Re: [PATCH] pwm: Kconfig: Enable pwm-tiecap to be built for Keystone

2017-08-21 Thread Thierry Reding
On Wed, Aug 02, 2017 at 11:43:44AM +0530, Vignesh R wrote:
> 66AK2G SoC has ECAP subsystem that is used as pwm-backlight provider for
> display. Hence, enable pwm-tiecap driver to be built for Keystone
> architecture.
> 
> Signed-off-by: Vignesh R 
> ---
>  drivers/pwm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to for-4.14/drivers, thanks.

Thierry


signature.asc
Description: PGP signature


Re: [PATCH 1/3] soc: qcom: smem: Support global partition

2017-08-21 Thread Arun Kumar Neelakantam



On 8/18/2017 6:45 AM, Chris Lew wrote:

@@ -782,7 +855,10 @@ static int qcom_smem_probe(struct platform_device *pdev)
}
  
  	version = qcom_smem_get_sbl_version(smem);

-   if (version >> 16 != SMEM_EXPECTED_VERSION) {
+   switch (version >> 16) {
+   case SMEM_GLOBAL_PART_VERSION:
+   case SMEM_GLOBAL_HEAP_VERSION:

break statement is needed for supported versions

+   default:
dev_err(&pdev->dev, "Unsupported SMEM version 0x%x\n", version);
return -EINVAL;
}
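With the break statement Arun asks for, the version check would read something like the sketch below. The version constants are stubbed with made-up values purely so the snippet compiles; the real definitions live in the qcom smem driver.

```c
/* Stub values for illustration only; not the driver's real constants. */
#define SMEM_GLOBAL_PART_VERSION  0xC
#define SMEM_GLOBAL_HEAP_VERSION  0xB
#define EINVAL 22

static int check_smem_version(unsigned int version)
{
        switch (version >> 16) {
        case SMEM_GLOBAL_PART_VERSION:
        case SMEM_GLOBAL_HEAP_VERSION:
                break;          /* supported: continue with probe */
        default:
                return -EINVAL; /* anything else is rejected */
        }
        return 0;
}
```

Without the `break`, the supported cases would fall through into `default` and every version would be rejected.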




Re: [PATCH v1 2/6] fs: use on-stack-bio if backing device has BDI_CAP_SYNC capability

2017-08-21 Thread Minchan Kim
Hi Jens,

On Wed, Aug 16, 2017 at 09:56:12AM -0600, Jens Axboe wrote:
> On 08/15/2017 10:48 PM, Minchan Kim wrote:
> > Hi Jens,
> > 
> > On Mon, Aug 14, 2017 at 10:17:09AM -0600, Jens Axboe wrote:
> >> On 08/14/2017 09:38 AM, Jens Axboe wrote:
> >>> On 08/14/2017 09:31 AM, Minchan Kim wrote:
> > Secondly, generally you don't have slow devices and fast devices
> > intermingled when running workloads. That's the rare case.
> 
>  Not true. zram is a really popular swap device for embedded systems where
>  low-cost products have really poor, slow NAND compared to
>  lz4/lzo [de]compression.
> >>>
> >>> I guess that's true for some cases. But as I said earlier, the recycling
> >>> really doesn't care about this at all. They can happily coexist, and not
> >>> step on each others toes.
> >>
> >> Dusted it off, result is here against -rc5:
> >>
> >> http://git.kernel.dk/cgit/linux-block/log/?h=cpu-alloc-cache
> >>
> >> I'd like to split the amount of units we cache and the amount of units
> >> we free, right now they are both CPU_ALLOC_CACHE_SIZE. This means that
> >> once we hit that count, we free all of them, and then store the one we
> >> were asked to free. That always keeps 1 local, but maybe it'd make more
> >> sense to cache just free CPU_ALLOC_CACHE_SIZE/2 (or something like that)
> >> so that we retain more than 1 per cpu in case an app preempts when
> >> sleeping for IO and the new task on that CPU then issues IO as well.
> >> Probably minor.
> >>
> >> Ran a quick test on nullb0 with 32 sync readers. The test was O_DIRECT
> >> on the block device, so I disabled the __blkdev_direct_IO_simple()
> >> bypass. With the above branch, we get ~18.0M IOPS, and without we get
> >> ~14M IOPS. Both ran with iostats disabled, to avoid any interference
> >> from that.
> > 
> > Looks promising.
> > If recycling bio works well enough, I think we don't need to introduce
> > new split in the path for on-stack bio.
> > I will test your version on zram-swap!
> 
> Thanks, let me know how it goes. It's quite possible that we'll need
> a few further tweaks, but at least the basis should be there.

Sorry for my late reply.

I just finished the swap-in testing with zram-swap, which is critical
for latency.

For the testing, I made a memcg and put $NR_CPU (mine is 12) processes
in there, and each process consumes 1G so the total is 12G, while my system
has 16GB memory, so there was no global reclaim.
Then, 'echo 1 > /mnt/memcg/group/force.empty' swaps all pages out; the
programs then wait for my signal to swap in, and I send the signal
to every process to swap in every page and measure the elapsed time
for the swap-in.

The value is the average time in usec elapsed swapping in 1G of pages for
each process; I repeated it 10 times and the stddev is very stable.

swapin:
base (with rw_page)   1100806.73 (100.00%)
no-rw_page            1146856.95 (104.18%)
Jens's pcp            1146910.00 (104.19%)
onstack-bio           1114872.18 (101.28%)

In my test, there is no difference between dynamic bio allocation
(i.e., no-rw_page) and the pcp approach, but onstack-bio is much faster,
so it's almost the same as rw_page.

The swapout test measures the elapsed time for
'echo 1 > /mnt/memcg/test_group/force.empty', so the unit is seconds.

swapout:
base (with rw_page)   7.72 (100.00%)
no-rw_page            8.36 (108.29%)
Jens's pcp            8.31 (107.64%)
onstack-bio           8.19 (106.09%)

rw_page's swapout is 6% or more faster than the others.

I tried pmbench with no memcg to see the performance under global reclaim.
Also, I executed a background IO job which reads data from a HDD.
The value is the average time in usec elapsed for a page access, so smaller is
better.

base (with rw_page)   14.42 (100.00%)
no-rw_page            15.66 (108.60%)
Jens's pcp            15.81 (109.64%)
onstack-bio           15.42 (106.93%)

It's similar to the swapout test in memcg.
6% or more is not trivial, so I doubt we can remove rw_page
at this moment. :(

I will look into the details with perf.
If you have further optimizations or suggestions, feel free to
share them. I am happy to test.

Thanks.


Re: [PATCH v14 4/5] mm: support reporting free page blocks

2017-08-21 Thread Michal Hocko
On Mon 21-08-17 14:12:47, Wei Wang wrote:
> On 08/18/2017 09:46 PM, Michal Hocko wrote:
[...]
> >>+/**
> >>+ * walk_free_mem_block - Walk through the free page blocks in the system
> >>+ * @opaque1: the context passed from the caller
> >>+ * @min_order: the minimum order of free lists to check
> >>+ * @visit: the callback function given by the caller
> >The original suggestion for using visit was motivated by a visit design
> >pattern but I can see how this can be confusing. Maybe a more explicit
> >name would be better. What about report_free_range?
> 
> 
> I'm afraid that name would be too long to fit in nicely.
> How about simply naming it "report"?

I do not have a strong opinion on this. I wouldn't be afraid of using
slightly longer name here for the clarity sake, though.
 
> >>+ *
> >>+ * The function is used to walk through the free page blocks in the system,
> >>+ * and each free page block is reported to the caller via the @visit 
> >>callback.
> >>+ * Please note:
> >>+ * 1) The function is used to report hints of free pages, so the caller 
> >>should
> >>+ * not use those reported pages after the callback returns.
> >>+ * 2) The callback is invoked with the zone->lock being held, so it should 
> >>not
> >>+ * block and should finish as soon as possible.
> >I think that the explicit note about zone->lock is not really needed. This
> >can change in future and I would even bet that somebody might rely on
> >the lock being held for some purpose and silently get broken with the
> >change. Instead I would much rather see something like the following:
> >"
> >Please note that there are no locking guarantees for the callback
> 
> Just a little confused with this one:
> 
> The callback is invoked within zone->lock, why would we claim it "no
> locking guarantees for the callback"?

Because we definitely do not want anybody to rely on that fact and
(ab)use it. This might change in future and it would be better to be
clear about that.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/4] pwm: pwm-tiecap: Add TI 66AK2G SoC specific compatible

2017-08-21 Thread Thierry Reding
On Mon, Aug 07, 2017 at 05:19:40PM +0530, Vignesh R wrote:
> Add a new compatible string "ti,k2g-ecap" to support PWM ECAP IP of
> TI 66AK2G SoC.
> 
> Signed-off-by: Vignesh R 
> ---
>  Documentation/devicetree/bindings/pwm/pwm-tiecap.txt | 1 +
>  1 file changed, 1 insertion(+)

Applied to for-4.14/drivers, thanks.

Thierry


signature.asc
Description: PGP signature


[PATCH v2] rcar-dmac: initialize all data before registering IRQ handler

2017-08-21 Thread Kuninori Morimoto

From: Kuninori Morimoto 

Anton Volkov noticed that engine->dev is NULL before
of_dma_controller_register() in probe.
Thus there might be a NULL pointer dereference in
rcar_dmac_chan_start_xfer() while accessing chan->chan.device->dev, which
is equal to (&dmac->engine)->dev.
For the same reason, similar issues can happen if we don't
initialize all necessary data before registering the IRQ handlers.
To make the code safer, this patch initializes all necessary data
before registering the IRQ handlers.

Reported-by: Anton Volkov 
Signed-off-by: Kuninori Morimoto 
---
v1 -> v2

 - care devm_request_threaded_irq(xxx) on rcar_dmac_chan_probe()
 - care devm_request_irq(xx) on rcar_dmac_probe()

 drivers/dma/sh/rcar-dmac.c | 85 +++---
 1 file changed, 43 insertions(+), 42 deletions(-)

diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index ffcadca..2b2c7db 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -1690,6 +1690,15 @@ static int rcar_dmac_chan_probe(struct rcar_dmac *dmac,
if (!irqname)
return -ENOMEM;
 
+   /*
+* Initialize the DMA engine channel and add it to the DMA engine
+* channels list.
+*/
+   chan->device = &dmac->engine;
+   dma_cookie_init(chan);
+
+   list_add_tail(&chan->device_node, &dmac->engine.channels);
+
ret = devm_request_threaded_irq(dmac->dev, rchan->irq,
rcar_dmac_isr_channel,
rcar_dmac_isr_channel_thread, 0,
@@ -1700,15 +1709,6 @@ static int rcar_dmac_chan_probe(struct rcar_dmac *dmac,
return ret;
}
 
-   /*
-* Initialize the DMA engine channel and add it to the DMA engine
-* channels list.
-*/
-   chan->device = &dmac->engine;
-   dma_cookie_init(chan);
-
-   list_add_tail(&chan->device_node, &dmac->engine.channels);
-
return 0;
 }
 
@@ -1794,14 +1794,6 @@ static int rcar_dmac_probe(struct platform_device *pdev)
if (!irqname)
return -ENOMEM;
 
-   ret = devm_request_irq(&pdev->dev, irq, rcar_dmac_isr_error, 0,
-  irqname, dmac);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to request IRQ %u (%d)\n",
-   irq, ret);
-   return ret;
-   }
-
/* Enable runtime PM and initialize the device. */
pm_runtime_enable(&pdev->dev);
ret = pm_runtime_get_sync(&pdev->dev);
@@ -1818,8 +1810,32 @@ static int rcar_dmac_probe(struct platform_device *pdev)
goto error;
}
 
-   /* Initialize the channels. */
-   INIT_LIST_HEAD(&dmac->engine.channels);
+   /* Initialize engine */
+   engine = &dmac->engine;
+
+   dma_cap_set(DMA_MEMCPY, engine->cap_mask);
+   dma_cap_set(DMA_SLAVE, engine->cap_mask);
+
+   engine->dev = >dev;
+   engine->copy_align  = ilog2(RCAR_DMAC_MEMCPY_XFER_SIZE);
+
+   engine->src_addr_widths = widths;
+   engine->dst_addr_widths = widths;
+   engine->directions  = BIT(DMA_MEM_TO_DEV) | BIT(DMA_DEV_TO_MEM);
+   engine->residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
+
+   engine->device_alloc_chan_resources = 
rcar_dmac_alloc_chan_resources;
+   engine->device_free_chan_resources  = rcar_dmac_free_chan_resources;
+   engine->device_prep_dma_memcpy  = rcar_dmac_prep_dma_memcpy;
+   engine->device_prep_slave_sg= rcar_dmac_prep_slave_sg;
+   engine->device_prep_dma_cyclic  = rcar_dmac_prep_dma_cyclic;
+   engine->device_config   = rcar_dmac_device_config;
+   engine->device_terminate_all= rcar_dmac_chan_terminate_all;
+   engine->device_tx_status= rcar_dmac_tx_status;
+   engine->device_issue_pending= rcar_dmac_issue_pending;
+   engine->device_synchronize  = rcar_dmac_device_synchronize;
+
+   INIT_LIST_HEAD(&engine->channels);
 
for (i = 0; i < dmac->n_channels; ++i) {
ret = rcar_dmac_chan_probe(dmac, &dmac->channels[i],
@@ -1828,6 +1844,14 @@ static int rcar_dmac_probe(struct platform_device *pdev)
goto error;
}
 
+   ret = devm_request_irq(&pdev->dev, irq, rcar_dmac_isr_error, 0,
+  irqname, dmac);
+   if (ret) {
+   dev_err(&pdev->dev, "failed to request IRQ %u (%d)\n",
+   irq, ret);
+   return ret;
+   }
+
/* Register the DMAC as a DMA provider for DT. */
ret = of_dma_controller_register(pdev->dev.of_node, rcar_dmac_of_xlate,
 NULL);
@@ -1839,29 +1863,6 @@ static int rcar_dmac_probe(struct platform_device *pdev)
 *
 * Default transfer size of 32 bytes requires 32-byte alignment.
 */
-   engine = 

Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

2017-08-21 Thread Sinan Kaya
On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
>> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> ...
>> if (pci_bus_crs_pending(id))
>>  return pci_bus_wait_crs(dev->bus, dev->devfn, &l, 6);
>>
>>> I think that makes sense.  We'd want to check for CRS SV being
>>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
>>> pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
>>> in the root port's pci_dev, and check it with something like what
>>> pcie_root_rcb_set() does?
>>>
>>
>> You can observe CRS under the following conditions
>>
>> 1. root port <-> endpoint 
>> 2. bridge <-> endpoint 
>> 3. root port<->bridge
>>
>> I was relying on the fact that we are reading 0x001 as an indication that
>> this device detected CRS. Maybe, this is too indirect.
>>
>> If we also want to capture the capability, I think the right thing is to
>> check the parent capability.
>>
>> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
>> {
>>  if (device type(bridge) == root port)
>>  return read(root_crs_register_reg);
>>
>>  if (device type(bridge) == switch)
>>  return read(switch_crs_register);
> 
> I don't understand this part.  AFAIK, CRS SV is only a feature of root
> ports.  The capability and enable bits are in the Root Capabilities
> and Root Control registers.
> 

No question about it.

> It's certainly true that a device below a switch can respond with a
> CRS completion, but the switch is not the requester, and my
> understanding is that it would not take any action on the completion
> other than passing it upstream.
> 

I saw some bridge references in the spec for CRS. I was going to do
some research for it. You answered my question. I was curious how this
would impact the behavior.

"Bridge Configuration Retry Enable – When Set, this bit enables PCI Express
to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
response to Configuration Requests that target devices below the bridge. 
Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
further details."

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


[PATCH RFC v3 3/9] KVM: remember position in kvm->vcpus array

2017-08-21 Thread Radim Krčmář
Signed-off-by: Radim Krčmář 
---
 include/linux/kvm_host.h | 11 +++
 virt/kvm/kvm_main.c  |  5 -
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6882538eda32..a8ff956616d2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -220,7 +220,8 @@ struct kvm_vcpu {
struct preempt_notifier preempt_notifier;
 #endif
int cpu;
-   int vcpu_id;
+   int vcpu_id; /* id given by userspace at creation */
+   int vcpus_idx; /* index in kvm->vcpus array */
int srcu_idx;
int mode;
unsigned long requests;
@@ -516,13 +517,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu_by_id(struct 
kvm *kvm, int id)
 
 static inline int kvm_vcpu_get_idx(struct kvm_vcpu *vcpu)
 {
-   struct kvm_vcpu *tmp;
-   int idx;
-
-   kvm_for_each_vcpu(idx, tmp, vcpu->kvm)
-   if (tmp == vcpu)
-   return idx;
-   BUG();
+   return vcpu->vcpus_idx;
 }
 
 #define kvm_for_each_memslot(memslot, slots)   \
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e17c40d986f3..caf8323f7df7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2498,7 +2498,10 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 
id)
goto unlock_vcpu_destroy;
}
 
-   BUG_ON(kvm->vcpus[atomic_read(&kvm->online_vcpus)]);
+   vcpu->vcpus_idx = atomic_read(&kvm->online_vcpus);
+
+   BUG_ON(kvm->vcpus[vcpu->vcpus_idx]);
+
 
/* Now it's all set up, let userspace reach it */
kvm_get_kvm(kvm);
-- 
2.13.3
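The win in this patch — replacing the kvm_for_each_vcpu() scan with an index cached at creation — can be illustrated with a standalone sketch (simplified stand-ins, not the real kvm structures):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 8

/* Simplified stand-ins for struct kvm_vcpu / struct kvm */
struct vcpu {
	int vcpu_id;	/* id given by userspace at creation */
	int vcpus_idx;	/* index in the vcpus array, cached at creation */
};

struct vm {
	struct vcpu *vcpus[MAX_VCPUS];
	int online_vcpus;
};

/* Record the index once, when the vcpu is added... */
static void vm_add_vcpu(struct vm *vm, struct vcpu *v, int id)
{
	v->vcpu_id = id;
	v->vcpus_idx = vm->online_vcpus;
	vm->vcpus[v->vcpus_idx] = v;
	vm->online_vcpus++;
}

/* ...so the lookup is O(1) instead of scanning the whole array. */
static int vcpu_get_idx(const struct vcpu *v)
{
	return v->vcpus_idx;
}
```

Note that vcpu_id stays a userspace-chosen value and need not match the array index, which is exactly why the two fields are kept separate.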



[PATCH 4/4] w1-masters: Improve a size determination in four functions

2017-08-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 21 Aug 2017 21:53:21 +0200

Replace the specification of data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.
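The benefit is easy to demonstrate standalone: sizeof(*ptr) follows the pointer's type, so it cannot go stale if the structure is renamed or the pointer retyped. struct ds2482_like below is a hypothetical stand-in, not the driver's actual structure:

```c
#include <assert.h>
#include <stdlib.h>

struct ds2482_like {	/* hypothetical stand-in for a driver's private data */
	int channel;
	char name[32];
};

static struct ds2482_like *example_alloc(void)
{
	struct ds2482_like *data;

	/* sizeof(*data) is derived from 'data' itself; repeating the
	 * type name here would be a second place to keep in sync. */
	data = calloc(1, sizeof(*data));
	return data;
}
```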

Signed-off-by: Markus Elfring 
---
 drivers/w1/masters/ds2482.c  | 3 ++-
 drivers/w1/masters/ds2490.c  | 2 +-
 drivers/w1/masters/mxc_w1.c  | 3 +--
 drivers/w1/masters/w1-gpio.c | 3 +--
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/w1/masters/ds2482.c b/drivers/w1/masters/ds2482.c
index d49681cd29af..7c3e25108285 100644
--- a/drivers/w1/masters/ds2482.c
+++ b/drivers/w1/masters/ds2482.c
@@ -451,7 +451,8 @@ static int ds2482_probe(struct i2c_client *client,
 I2C_FUNC_SMBUS_BYTE))
return -ENODEV;
 
-   if (!(data = kzalloc(sizeof(struct ds2482_data), GFP_KERNEL))) {
+   data = kzalloc(sizeof(*data), GFP_KERNEL);
+   if (!data) {
err = -ENOMEM;
goto exit;
}
diff --git a/drivers/w1/masters/ds2490.c b/drivers/w1/masters/ds2490.c
index c0ee6ca9ce93..1e5b81490ffe 100644
--- a/drivers/w1/masters/ds2490.c
+++ b/drivers/w1/masters/ds2490.c
@@ -994,5 +994,5 @@ static int ds_probe(struct usb_interface *intf,
struct ds_device *dev;
int i, err, alt;
 
-   dev = kzalloc(sizeof(struct ds_device), GFP_KERNEL);
+   dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
diff --git a/drivers/w1/masters/mxc_w1.c b/drivers/w1/masters/mxc_w1.c
index 74f2e6e6202a..40a34942d07f 100644
--- a/drivers/w1/masters/mxc_w1.c
+++ b/drivers/w1/masters/mxc_w1.c
@@ -103,6 +103,5 @@ static int mxc_w1_probe(struct platform_device *pdev)
unsigned int clkdiv;
int err;
 
-   mdev = devm_kzalloc(&pdev->dev, sizeof(struct mxc_w1_device),
-   GFP_KERNEL);
+   mdev = devm_kzalloc(&pdev->dev, sizeof(*mdev), GFP_KERNEL);
if (!mdev)
diff --git a/drivers/w1/masters/w1-gpio.c b/drivers/w1/masters/w1-gpio.c
index 6e8b18bf9fb1..a92eb1407f0f 100644
--- a/drivers/w1/masters/w1-gpio.c
+++ b/drivers/w1/masters/w1-gpio.c
@@ -128,6 +128,5 @@ static int w1_gpio_probe(struct platform_device *pdev)
return -ENXIO;
}
 
-   master = devm_kzalloc(&pdev->dev, sizeof(struct w1_bus_master),
-   GFP_KERNEL);
+   master = devm_kzalloc(&pdev->dev, sizeof(*master), GFP_KERNEL);
if (!master)
-- 
2.14.0



Re: [PATCH v3 net-next] bpf/verifier: track liveness for pruning

2017-08-21 Thread Alexei Starovoitov

On 8/21/17 2:00 PM, Daniel Borkmann wrote:

On 08/21/2017 10:44 PM, Edward Cree wrote:

On 21/08/17 21:27, Daniel Borkmann wrote:

On 08/21/2017 08:36 PM, Edward Cree wrote:

On 19/08/17 00:37, Alexei Starovoitov wrote:

[...]

I'm tempted to just rip out env->varlen_map_value_access and always
check
   the whole thing, because honestly I don't know what it was meant
to do
   originally or how it can ever do any useful pruning.  While
drastic, it
   does cause your test case to pass.


Original intention from 484611357c19 ("bpf: allow access into map
value arrays") was that it wouldn't potentially make pruning worse
if PTR_TO_MAP_VALUE_ADJ was not used, meaning that we wouldn't need
to take reg state's min_value and max_value into account for state
checking; this was basically due to min_value / max_value is being
adjusted/tracked on every alu/jmp ops for involved regs (e.g.
adjust_reg_min_max_vals() and others that mangle them) even if we
have the case that no actual dynamic map access is used throughout
the program. To give an example on net tree, the bpf_lxc.o prog's
section increases from 36,386 to 68,226 when
env->varlen_map_value_access
is always true, so it does have an effect. Did you do some checks
on this on net-next?

I tested with the cilium progs and saw no change in insn count.  I
  suspect that for the normal case I already killed this optimisation
  when I did my unification patch, it was previously about ignoring
  min/max values on all regs (including scalars), whereas on net-next
  it only ignores them on map_value pointers; in practice this is
  useless because we tend to still have the offset scalar sitting in
  a register somewhere.  (Come to think of it, this may have been
  behind a large chunk of the #insn increase that my patches caused.)


Yeah, this would seem plausible.


Since we use umax_value in find_good_pkt_pointers() now (to check
  against MAX_PACKET_OFF and ensure our reg->range is really ok), we
  can't just stop caring about all min/max values just because we
  haven't done any variable map accesses.
I don't see a way around this.


Agree, was thinking the same. If there's not really a regression in
terms of complexity, then lets kill the flag.


+1

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2489e67b65f6..908d13b2a2aa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3582,7 +3582,7 @@ static int do_check(struct bpf_verifier_env *env)
init_reg_state(regs);
state->parent = NULL;
insn_idx = 0;
-   env->varlen_map_value_access = false;
+   env->varlen_map_value_access = true;

makes _zero_ difference on cilium*.o tests, so let's just kill
that workaround.



Re: [PATCH v2] mm/hugetlb.c: make huge_pte_offset() consistent and document behaviour

2017-08-21 Thread Mike Kravetz
On 08/21/2017 11:07 AM, Catalin Marinas wrote:
> On Fri, Aug 18, 2017 at 02:29:18PM -0700, Mike Kravetz wrote:
>> On 08/18/2017 07:54 AM, Punit Agrawal wrote:
>>> When walking the page tables to resolve an address that points to
>>> !p*d_present() entry, huge_pte_offset() returns inconsistent values
>>> depending on the level of page table (PUD or PMD).
>>>
>>> It returns NULL in the case of a PUD entry while in the case of a PMD
>>> entry, it returns a pointer to the page table entry.
>>>
>>> A similar inconsistency exists when handling swap entries - returns NULL
>>> for a PUD entry while a pointer to the pte_t is returned for the PMD entry.
>>>
>>> Update huge_pte_offset() to make the behaviour consistent - return a
>>> pointer to the pte_t for hugepage or swap entries. Only return NULL in
>>> instances where we have a p*d_none() entry and the size parameter
>>> doesn't match the hugepage size at this level of the page table.
>>>
>>> Document the behaviour to clarify the expected behaviour of this function.
>>> This is to set clear semantics for architecture specific implementations
>>> of huge_pte_offset().
>>>
>>> Signed-off-by: Punit Agrawal 
>>> Cc: Catalin Marinas 
>>> Cc: Naoya Horiguchi 
>>> Cc: Steve Capper 
>>> Cc: Will Deacon 
>>> Cc: Kirill A. Shutemov 
>>> Cc: Michal Hocko 
>>> Cc: Mike Kravetz 
>>> ---
>>>
>>> Hi Andrew,
>>>
>>> From discussions on the arm64 implementation of huge_pte_offset()[0]
>>> we realised that there is benefit from returning a pte_t* in the case
>>> of p*d_none().
>>>
>>> The fault handling code in hugetlb_fault() can handle p*d_none()
>>> entries and saves an extra round trip to huge_pte_alloc(). Other
>>> callers of huge_pte_offset() should be ok as well.
>>
>> Yes, this change would eliminate that call to huge_pte_alloc() in
>> hugetlb_fault().  However, huge_pte_offset() is now returning a pointer
>> to a p*d_none() pte in some instances where it would have previously
>> returned NULL.  Correct?
> 
> Yes (whether it was previously the right thing to return is a different
> matter; that's what we are trying to clarify in the generic code so that
> we can have similar semantics on arm64).
> 
>> I went through the callers, and like you am fairly confident that they
>> can handle this situation.  But, returning  p*d_none() instead of NULL
>> does change the execution path in several routines such as
>> copy_hugetlb_page_range, __unmap_hugepage_range hugetlb_change_protection,
>> and follow_hugetlb_page.  If huge_pte_alloc() returns NULL to these
>> routines, they do a quick continue, exit, etc.  If they are returned
>> a pointer, they typically lock the page table(s) and then check for
>> p*d_none() before continuing, exiting, etc.  So, it appears that these
>> routines could potentially slow down a bit with this change (in the specific
>> case of p*d_none).
> 
> Arguably (well, my interpretation), it should return a NULL only if the
> entry is a table entry, potentially pointing to a next level (pmd). In
> the pud case, this means that sz < PUD_SIZE.
> 
> If the pud is a last level huge page entry (either present or !present),
> huge_pte_offset() should return the pointer to it and never NULL. If the
> entry is a swap or migration one (pte_present() == false) with the
> current code we don't even enter the corresponding checks in
> copy_hugetlb_page_range().
> 
> I also assume that the ptl __unmap_hugepage_range() is taken to avoid
> some race when the entry is a huge page (present or not). If such race
> doesn't exist, we could as well check the huge_pte_none() outside the
> locked region (which is what the current huge_pte_offset() does with
> !pud_present()).
> 
> IMHO, while the current generic huge_pte_offset() avoids some code paths
> in the functions you mentioned, the results are not always correct
> (missing swap/migration entries or potentially racy).

Thanks Catalin,

The more I look at this code and think about it, the more I like it.  As
Michal previously mentioned, changes in this area can break things in subtle
ways.  That is why I was cautious and asked for more people to look at it.
My primary concerns with these changes in this area were:
- Any potential changes in behavior.  I think this has been sufficiently
  explored.  While there may be small differences in behavior (for the
  better), this change should not introduce any bugs/breakage.
- Other arch specific implementations are not aligned with the new
  behavior.  Again, this should not cause any issues.  Punit (and I) have
  looked at the arch specific implementations for issues and found none.
  In addition, since we are not changing any of the 'calling code', no
  issues should be introduced for arch specific implementations.

I like the new semantics and did not find any issues.

Reviewed-by: Mike 

Re: [PATCH v4 6/9] ASoC: rockchip: Parse dai links from dts

2017-08-21 Thread Matthias Kaehlcke
El Fri, Aug 18, 2017 at 11:11:44AM +0800 Jeffy Chen ha dit:

> Refactor rockchip_sound_probe, parse dai links from dts instead of
> hard coding them.
> 
> Signed-off-by: Jeffy Chen 
> ---
> 
> Changes in v4: None
> Changes in v3:
> Use compatible to match audio codecs
> -- Suggested-by Matthias Kaehlcke 
> 
> Changes in v2:
> Let rockchip,codec-names be a required property, because we plan to
> add more supported codecs to the fixed dai link list in the driver.
> 
>  sound/soc/rockchip/rk3399_gru_sound.c | 139 
> ++
>  1 file changed, 91 insertions(+), 48 deletions(-)

Looks good to me, though I'm by no means an audio expert. On kevin the
codecs are enumerated at boot.

FWIW:

Reviewed-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 

> diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
> b/sound/soc/rockchip/rk3399_gru_sound.c
> index 9b7e28703bfb..d532336871d7 100644
> --- a/sound/soc/rockchip/rk3399_gru_sound.c
> +++ b/sound/soc/rockchip/rk3399_gru_sound.c
> @@ -235,14 +235,42 @@ static const struct snd_soc_ops 
> rockchip_sound_da7219_ops = {
>   .hw_params = rockchip_sound_da7219_hw_params,
>  };
>  
> +static struct snd_soc_card rockchip_sound_card = {
> + .name = "rk3399-gru-sound",
> + .owner = THIS_MODULE,
> + .dapm_widgets = rockchip_dapm_widgets,
> + .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
> + .dapm_routes = rockchip_dapm_routes,
> + .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
> + .controls = rockchip_controls,
> + .num_controls = ARRAY_SIZE(rockchip_controls),
> +};
> +
>  enum {
> + DAILINK_DA7219,
>   DAILINK_MAX98357A,
>   DAILINK_RT5514,
> - DAILINK_DA7219,
>   DAILINK_RT5514_DSP,
>  };
>  
> -static struct snd_soc_dai_link rockchip_dailinks[] = {
> +static const char * const dailink_compat[] = {
> + [DAILINK_DA7219] = "dlg,da7219",
> + [DAILINK_MAX98357A] = "maxim,max98357a",
> + [DAILINK_RT5514] = "realtek,rt5514-i2c",
> + [DAILINK_RT5514_DSP] = "realtek,rt5514-spi",
> +};
> +
> +static const struct snd_soc_dai_link rockchip_dais[] = {
> + [DAILINK_DA7219] = {
> + .name = "DA7219",
> + .stream_name = "DA7219 PCM",
> + .codec_dai_name = "da7219-hifi",
> + .init = rockchip_sound_da7219_init,
> + .ops = &rockchip_sound_da7219_ops,
> + /* set da7219 as slave */
> + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
> + SND_SOC_DAIFMT_CBS_CFS,
> + },
>   [DAILINK_MAX98357A] = {
>   .name = "MAX98357A",
>   .stream_name = "MAX98357A PCM",
> @@ -261,16 +289,6 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
>   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
>   SND_SOC_DAIFMT_CBS_CFS,
>   },
> - [DAILINK_DA7219] = {
> - .name = "DA7219",
> - .stream_name = "DA7219 PCM",
> - .codec_dai_name = "da7219-hifi",
> - .init = rockchip_sound_da7219_init,
> - .ops = &rockchip_sound_da7219_ops,
> - /* set da7219 as slave */
> - .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
> - SND_SOC_DAIFMT_CBS_CFS,
> - },
>   /* RT5514 DSP for voice wakeup via spi bus */
>   [DAILINK_RT5514_DSP] = {
>   .name = "RT5514 DSP",
> @@ -279,53 +297,78 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
>   },
>  };
>  
> -static struct snd_soc_card rockchip_sound_card = {
> - .name = "rk3399-gru-sound",
> - .owner = THIS_MODULE,
> - .dai_link = rockchip_dailinks,
> - .num_links =  ARRAY_SIZE(rockchip_dailinks),
> - .dapm_widgets = rockchip_dapm_widgets,
> - .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
> - .dapm_routes = rockchip_dapm_routes,
> - .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
> - .controls = rockchip_controls,
> - .num_controls = ARRAY_SIZE(rockchip_controls),
> -};
> -
> -static int rockchip_sound_probe(struct platform_device *pdev)
> +static int rockchip_sound_codec_node_match(struct device_node *np_codec)
>  {
> - struct snd_soc_card *card = &rockchip_sound_card;
> - struct device_node *cpu_node;
> - int i, ret;
> + int i;
>  
> - cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0);
> - if (!cpu_node) {
> - dev_err(&pdev->dev, "Property 'rockchip,cpu' missing or 
> invalid\n");
> - return -EINVAL;
> + for (i = 0; i < ARRAY_SIZE(dailink_compat); i++) {
> + if (of_device_is_compatible(np_codec, dailink_compat[i]))
> + return i;
>   }
> + return -1;
> +}
>  
> - for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) {
> - rockchip_dailinks[i].platform_of_node = cpu_node;
> - 

[PATCH net-next,2/4] hv_netvsc: Clean up unused parameter from netvsc_get_rss_hash_opts()

2017-08-21 Thread Haiyang Zhang
From: Haiyang Zhang 

The parameter "nvdev" is not in use.

Signed-off-by: Haiyang Zhang 
---
 drivers/net/hyperv/netvsc_drv.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 4677d21..d8612b1 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1228,8 +1228,7 @@ static void netvsc_get_strings(struct net_device *dev, 
u32 stringset, u8 *data)
 }
 
 static int
-netvsc_get_rss_hash_opts(struct netvsc_device *nvdev,
-struct ethtool_rxnfc *info)
+netvsc_get_rss_hash_opts(struct ethtool_rxnfc *info)
 {
info->data = RXH_IP_SRC | RXH_IP_DST;
 
@@ -1267,7 +1266,7 @@ static void netvsc_get_strings(struct net_device *dev, 
u32 stringset, u8 *data)
return 0;
 
case ETHTOOL_GRXFH:
-   return netvsc_get_rss_hash_opts(nvdev, info);
+   return netvsc_get_rss_hash_opts(info);
}
return -EOPNOTSUPP;
 }
-- 
1.7.1



[PATCH net-next,1/4] hv_netvsc: Clean up unused parameter from netvsc_get_hash()

2017-08-21 Thread Haiyang Zhang
From: Haiyang Zhang 

The parameter "sk" is not in use.

Signed-off-by: Haiyang Zhang 
---
 drivers/net/hyperv/netvsc_drv.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index b33f050..4677d21 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -193,7 +193,7 @@ static int netvsc_close(struct net_device *net)
 /* Azure hosts don't support non-TCP port numbers in hashing yet. We compute
  * hash for non-TCP traffic with only IP numbers.
  */
-static inline u32 netvsc_get_hash(struct sk_buff *skb, struct sock *sk)
+static inline u32 netvsc_get_hash(struct sk_buff *skb)
 {
struct flow_keys flow;
u32 hash;
@@ -227,7 +227,7 @@ static inline int netvsc_get_tx_queue(struct net_device 
*ndev,
struct sock *sk = skb->sk;
int q_idx;
 
-   q_idx = ndc->tx_send_table[netvsc_get_hash(skb, sk) &
+   q_idx = ndc->tx_send_table[netvsc_get_hash(skb) &
   (VRSS_SEND_TAB_SIZE - 1)];
 
/* If queue index changed record the new value */
-- 
1.7.1



Re: [PATCH v2] PM / AVS: rockchip-io: add io selectors and supplies for RV1108

2017-08-21 Thread Heiko Stuebner
Am Montag, 21. August 2017, 18:58:33 CEST schrieb David Wu:
> This adds the necessary data for handling io voltage domains on the RV1108.
> 
> Signed-off-by: David Wu 

Reviewed-by: Heiko Stuebner 


Re: [PATCH v3 1/5] ACPI / blacklist: add acpi_match_platform_list()

2017-08-21 Thread Rafael J. Wysocki
On Tue, Aug 22, 2017 at 12:21 AM, Kani, Toshimitsu  wrote:
> On Mon, 2017-08-21 at 23:49 +0200, Rafael J. Wysocki wrote:
>> On Mon, Aug 21, 2017 at 11:06 PM, Kani, Toshimitsu wrote:
>> > On Mon, 2017-08-21 at 22:31 +0200, Rafael J. Wysocki wrote:
>> > > On Mon, Aug 21, 2017 at 7:36 PM, Borislav Petkov 
>> > > wrote:
>> > > > On Mon, Aug 21, 2017 at 05:23:37PM +, Kani, Toshimitsu
>> > > > wrote:
>> > > > > > > 'data' here is private to the caller.  So, I do not think
>> > > > > > > we need to define the bits.  Shall I change the name to
>> > > > > > > 'driver_data' to make it more explicit?
>> > > > > >
>> > > > > > You changed it to 'data'. It was a u32-used-as-boolean
>> > > > > > is_critical_error before.
>> > > > > >
>> > > > > > So you can just as well make it into flags and people can
>> > > > > > extend those flags if needed. A flag bit should be enough
>> > > > > > in most cases anyway. If they really need driver_data, then
>> > > > > > they can add a void *member.
>> > > > >
>> > > > > Hmm.. In patch 2, intel_pstate_platform_pwr_mgmt_exists()
>> > > > > uses this field for PSS and PCC, which are enum values.  I
>> > > > > think we should allow drivers to set any values here.  I
>> > > > > agree that it may need to be void * if we also allow drivers
>> > > > > to set a pointer here.
>> > > >
>> > > > Let's see what Rafael prefers.
>> > >
>> > > I would retain the is_critical_error field and use that for
>> > > printing the recoverable / non-recoverable message.  This is kind
>> > > of orthogonal to whether or not any extra data is needed and that
>> > > can be an additional field.  In that case unsigned long should be
>> > > sufficient to accommodate a pointer if need be.
>> >
>> > Yes, we will retain the field.  The question is whether this field
>> > should be retained as a driver's private data or ACPI-managed
>> > flags.
>>
>> Thanks for the clarification.
>>
>> > My patch implements the former, which lets the callers to define
>> > the data values.  For instance, acpi_blacklisted() uses this field
>> > as is_critical_error value, and
>> > intel_pstate_platform_pwr_mgmt_exists() uses it as oem_pwr_table
>> > value.
>> >
>> > Boris suggested the latter, which lets ACPI to define the flags,
>> > which are then used by the callers.  For instance, he suggested
>> > ACPI to define bit0 as is_critical_error.
>> >
>> > #define ACPI_PLAT_IS_CRITICAL_ERROR BIT(0)
>>
>> So my point is that we can have both the ACPI-managed flags and the
>> the caller-defined data at the same time as separate items.
>>
>> That would allow of maximum flexibility IMO.
>
> I agree in general.  Driver private data allows flexibility to drivers
> when the values are driver-private.  ACPI-managed flags allows ACPI to
> control the interfaces based on the flags.
>
> Since we do not have use-case of the latter case yet, i.e.
> acpi_match_platform_list() does not need to check the flags, I'd
> suggest that we keep 'data' as driver-private.  We can add 'flags' as a
> separate member to the structure when we find the latter use-case.

OK

Thanks,
Rafael


Re: [PATCH v1 1/2] clk: rockchip: add rk3228 sclk_sdio_src ID

2017-08-21 Thread Heiko Stuebner
Am Freitag, 18. August 2017, 11:49:24 CEST schrieb Elaine Zhang:
> This patch exports sdio src clock for dts reference.
> 
> Signed-off-by: Elaine Zhang 

applied for 4.14


Thanks
Heiko


Re: [PATCH v2] perf tools: Add ARM Statistical Profiling Extensions (SPE) support

2017-08-21 Thread Kim Phillips
On Fri, 18 Aug 2017 18:36:09 +0100
Mark Rutland  wrote:

> Hi Kim,

Hi Mark,

> On Thu, Aug 17, 2017 at 10:11:50PM -0500, Kim Phillips wrote:
> > Hi Mark, I've tried to proceed as much as possible without your
> > response, so if you still have comments to my above comments, please
> > comment in-line above, otherwise review the v2 patch below?
> 
> Apologies again for the late response, and thanks for the updated patch!

Thanks for your prompt response this time around.

> > .
> > . ... ARM SPE data: size 536432 bytes
> > .  :  4a 01   B COND
> > .  0002:  b1 00 00 00 00 00 00 00 80  TGT 0 el0 ns=1
> > .  000b:  42 42   RETIRED 
> > NOT-TAKEN
> > .  000d:  b0 20 41 c0 ad ff ff 00 80  PC 
> > adc04120 el0 ns=1
> > .  0016:  98 00 00LAT 0 TOT
> > .  0019:  71 80 3e f7 46 e9 01 00 00  TS 
> > 2101429616256
> > .  0022:  49 01   ST
> > .  0024:  b2 50 bd ba 73 00 80 ff ff  VA 
> > 800073babd50
> > .  002d:  b3 50 bd ba f3 00 00 00 80  PA f3babd50 
> > ns=1
> > .  0036:  9a 00 00LAT 0 XLAT
> > .  0039:  42 16   RETIRED 
> > L1D-ACCESS TLB-ACCESS
> > .  003b:  b0 8c b4 1e 08 00 00 ff ff  PC 
> > ff081eb48c el3 ns=1
> > .  0044:  98 00 00LAT 0 TOT
> > .  0047:  71 cc 44 f7 46 e9 01 00 00  TS 
> > 2101429617868
> > .  0050:  48 00   INSN-OTHER
> > .  0052:  42 02   RETIRED
> > .  0054:  b0 58 54 1f 08 00 00 ff ff  PC 
> > ff081f5458 el3 ns=1
> > .  005d:  98 00 00LAT 0 TOT
> > .  0060:  71 cc 44 f7 46 e9 01 00 00  TS 
> > 2101429617868
> 
> So FWIW, I think this is a good example of why that padding I requested
> last time round matters.
> 
> For the first PC packet, I had to count the number of characters to see
> that it was a TTBR0 address, which is made much clearer with leading
> padding, as adc04120. With the addresses padded, the EL and NS
> fields would also be aligned, making it *much* easier to scan by eye.

See my response in my prior email.

> > - multiple SPE clusters/domains support pending potential driver changes?
> 
> As covered in my other reply, I don't believe that the driver is going
> to change in this regard. Userspace will need to handle multiple SPE
> instances.
> 
> I'll ignore that in the code below for now.

Please let's continue the discussion in one place, and again in this
case, in the last email.

> > - CPU mask / new record behaviour bisected to commit e3ba76deef23064 "perf
> >   tools: Force uncore events to system wide monitoring".  Waiting to hear 
> > back
> >   on why driver can't do system wide monitoring, even across PPIs, by e.g.,
> >   sharing the SPE interrupts in one handler (SPE's don't differ in this 
> > record
> >   regard).
> 
> Could you elaborate on this? I don't follow the interrupt handler
> comments.

Would it be possible for the driver to request the IRQs with
IRQF_SHARED, in order to be able to operate across the multiple PPIs?

> > +static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused)
> > +{
> > +   u64 ts;
> > +
> > +   asm volatile ("isb; mrs %0, cntvct_el0" : "=r" (ts));
> > +
> > +   return ts;
> > +}
> 
> As covered in my other reply, please don't use the counter for this.
> 
> It sounds like we need a simple/generic function to get a nonce, that
> we could share with the ETM code.

I've switched to using clock_gettime(CLOCK_MONOTONIC_RAW, ...).  The
ETM code uses two rand() calls, which, according to some minor
benchmarking on Juno, is almost twice as slow as clock_gettime. It's
three lines still, so I'll update the ETM code in-place independently
of this patch, and after the gettime implementation is reviewed.

> > +int arm_spe_get_packet(const unsigned char *buf, size_t len,
> > +  struct arm_spe_pkt *packet)
> > +{
> > +   int ret;
> > +
> > +   ret = arm_spe_do_get_packet(buf, len, packet);
> > +   if (ret > 0 && packet->type == ARM_SPE_PAD) {
> > +   while (ret < 16 && len > (size_t)ret && !buf[ret])
> > +   ret += 1;
> > +   }
> > +   return ret;
> > +}
> 
> What's this doing? Skipping padding? What's the significance of 16?

I'll repeat the relevant part of the v2 changelog here:

- do_get_packet fixed to handle excessive, successive PADding from a new source
  of raw SPE data, so instead of:

.  11ae:  00  PAD
.  11af:  00   

Re: [PATCH V2] spmi: pmic-arb: Enforce the ownership check optionally

2017-08-21 Thread Stephen Boyd
On 08/18/2017 08:28 AM, Kiran Gunda wrote:
> The peripheral ownership check is not necessary on single master
> platforms. Hence, enforce the peripheral ownership check optionally.
>
> Signed-off-by: Kiran Gunda 
> Tested-by: Shawn Guo 
> ---

This sounds like a band-aid. Isn't the gpio driver going to keep probing
all the pins that are not supposed to be accessed due to security
constraints? What exactly is failing in the gpio case?

Also, I thought we were getting rid of the ownership checks? Or at
least, putting them behind some debug kernel feature check or something?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v5 00/16] Add QCOM QPIC NAND support

2017-08-21 Thread Boris Brezillon
Le Thu, 17 Aug 2017 17:37:38 +0530,
Abhishek Sahu  a écrit :

> * v5:
> 
> 1. Removed the patches already applied to linux-next and rebased the
>remaining patches on [3]
> 2. Addressed the review comments in v4 and Added Archit Reviewed
>by tag.
> 
> [3] http://git.infradead.org/l2-mtd.git/shortlog/refs/heads/nand/next
> 
> * v4:
> 
> 1. Added Acked-by from Rob for DT documentation patches
> 2. Removed ipq8074 compatible string from ipq4019 DT example.
> 2. Used the BIT macro for NAND_CMD_VLD bits and consistent names
>as suggested by Boris in v3.
> 
> * v3:
> 
> 1. Removed the patches already applied to linux-next and
>rebased the remaining patches on [1]
> 2. Reordered the patches and put the BAM DMA changes [2]
>dependent patches and compatible string patches in last
> 3. Removed the register offsets array and used the dev_cmd offsets
> 4. Changed some macro names to small letters for better code readability
> 5. Changed the compatible string to SoC specific
> 6. Did minor code changes for adding comment, error handling, structure names
> 7. Combined raw write (patch#18) and passing flag parameter (patch#22) patch
>into one
> 8. Made separate patch for compatible string and NAND properties
> 9. Made separate patch for BAM command descriptor and data descriptors 
> handling
> 10. Changed commit message for some of the patches
> 11. Addressed review comments given in v2
> 12. Added Reviewed-by of Archit for some of the patches from v2
> 13. All the MTD tests are working fine for IPQ8064 AP148, IPQ4019 DK04 and
> IPQ8074 HK01 boards for v3 patches
> 
> [1] http://git.infradead.org/l2-mtd.git/shortlog/refs/heads/nand/next
> [2] http://www.spinics.net/lists/dmaengine/msg13662.html
> 
> * v2:
> 
> 1. Addressed the review comments given in v1
> 2. Removed the DMA coherent buffer for register read and used
>streaming DMA API’s
> 3. Reorganized the NAND read and write functions
> 4. Separated patch for driver and documentation changes
> 5. Changed the compatible string for EBI2
> 
> * v1:
> 
> http://www.spinics.net/lists/devicetree/msg183706.html
> 
> 
> Abhishek Sahu (16):
>   mtd: nand: qcom: DMA mapping support for register read buffer
>   mtd: nand: qcom: allocate BAM transaction
>   mtd: nand: qcom: add BAM DMA descriptor handling
>   mtd: nand: qcom: support for passing flags in DMA helper functions
>   mtd: nand: qcom: support for read location registers
>   mtd: nand: qcom: erased codeword detection configuration
>   mtd: nand: qcom: enable BAM or ADM mode
>   mtd: nand: qcom: QPIC data descriptors handling
>   mtd: nand: qcom: support for different DEV_CMD register offsets
>   mtd: nand: qcom: add command elements in BAM transaction
>   mtd: nand: qcom: support for command descriptor formation
>   dt-bindings: qcom_nandc: fix the ipq806x device tree example
>   dt-bindings: qcom_nandc: IPQ4019 QPIC NAND documentation
>   dt-bindings: qcom_nandc: IPQ8074 QPIC NAND documentation
>   mtd: nand: qcom: support for IPQ4019 QPIC NAND controller
>   mtd: nand: qcom: Support for IPQ8074 QPIC NAND controller

Applied everything to nand/next except patch 10 and 11. Let me know
when the dmaengine dependency is merged and I'll take the remaining
patches.

Thanks,

Boris

> 
>  .../devicetree/bindings/mtd/qcom_nandc.txt |  63 +-
>  drivers/mtd/nand/qcom_nandc.c  | 746 
> ++---
>  2 files changed, 722 insertions(+), 87 deletions(-)
> 



Re: [PATCH] PM / Hibernate: Feed the wathdog when creating snapshot

2017-08-21 Thread Andrew Morton
On Mon, 21 Aug 2017 23:08:18 +0800 Chen Yu  wrote:

> There is a problem that when counting the pages for creating
> the hibernation snapshot will take significant amount of
> time, especially on system with large memory. Since the counting
> job is performed with irq disabled, this might lead to NMI lockup.
> The following warning were found on a system with 1.5TB DRAM:
> 
> ...
> 
> It has taken nearly 20 seconds (2.10GHz CPU), thus the NMI lockup
> was triggered. In case the timeout of the NMI watchdog has been
> set to 1 second, a safe interval should be 6590003/20 ≈ 330k pages
> in theory. However there might also be some platforms running at a
> lower frequency, so feed the watchdog every 100k pages.
> 
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2531,9 +2532,12 @@ void drain_all_pages(struct zone *zone)
>  
>  #ifdef CONFIG_HIBERNATION
>  
> +/* Touch watchdog for every WD_INTERVAL_PAGE pages. */
> +#define WD_INTERVAL_PAGE (100*1024)
> +
>  void mark_free_pages(struct zone *zone)
>  {
> - unsigned long pfn, max_zone_pfn;
> + unsigned long pfn, max_zone_pfn, page_num = 0;
>   unsigned long flags;
>   unsigned int order, t;
>   struct page *page;
> @@ -2548,6 +2552,9 @@ void mark_free_pages(struct zone *zone)
>   if (pfn_valid(pfn)) {
>   page = pfn_to_page(pfn);
>  
> + if (!((page_num++) % WD_INTERVAL_PAGE))
> + touch_nmi_watchdog();
> +
>   if (page_zone(page) != zone)
>   continue;
>  
> @@ -2561,8 +2568,11 @@ void mark_free_pages(struct zone *zone)
>   unsigned long i;
>  
>   pfn = page_to_pfn(page);
> - for (i = 0; i < (1UL << order); i++)
> + for (i = 0; i < (1UL << order); i++) {
> + if (!((page_num++) % WD_INTERVAL_PAGE))
> + touch_nmi_watchdog();
>   swsusp_set_page_free(pfn_to_page(pfn + i));
> + }
>   }
>   }
>   spin_unlock_irqrestore(&zone->lock, flags);

hm, is it really worth all the WD_INTERVAL_PAGE stuff? 
touch_nmi_watchdog() is pretty efficient and calling it once-per-page
may not have a measurable effect.

And if we're really concerned about the performance impact it would be
better to make WD_INTERVAL_PAGE a power of 2 (128*1024?) to avoid the
modulus operation.



[PATCH RFC v3 0/9] KVM: allow dynamic kvm->vcpus array

2017-08-21 Thread Radim Krčmář
The only common part with v2 is [v3 5/9].

The crucial part of this series is adding a separate mechanism for
kvm_for_each_vcpu() [v3 8/9] and with that change, I think that the
dynamic array [v3 9/9] would be nicer if protected by RCU, like in v2:
The protection can be nicely hidden in kvm_get_vcpu().  I just had the
split done before implementing [v3 8/9] and presented it for
consideration.

Smoke tested on x86 only.


Radim Krčmář (9):
  KVM: s390: optimize detection of started vcpus
  KVM: arm/arm64: fix vcpu self-detection in vgic_v3_dispatch_sgi()
  KVM: remember position in kvm->vcpus array
  KVM: arm/arm64: use locking helpers in kvm_vgic_create()
  KVM: remove unused __KVM_HAVE_ARCH_VM_ALLOC
  KVM: rework kvm_vcpu_on_spin loop
  KVM: add kvm_free_vcpus and kvm_arch_free_vcpus
  KVM: implement kvm_for_each_vcpu with a list
  KVM: split kvm->vcpus into chunks

 arch/mips/kvm/mips.c|  19 ++
 arch/powerpc/kvm/book3s_32_mmu.c|   3 +-
 arch/powerpc/kvm/book3s_64_mmu.c|   3 +-
 arch/powerpc/kvm/book3s_hv.c|   7 +-
 arch/powerpc/kvm/book3s_pr.c|   5 +-
 arch/powerpc/kvm/book3s_xics.c  |   2 +-
 arch/powerpc/kvm/book3s_xics.h  |   3 +-
 arch/powerpc/kvm/book3s_xive.c  |  18 +++---
 arch/powerpc/kvm/book3s_xive.h  |   3 +-
 arch/powerpc/kvm/e500_emulate.c |   3 +-
 arch/powerpc/kvm/powerpc.c  |  16 ++---
 arch/s390/include/asm/kvm_host.h|   1 +
 arch/s390/kvm/interrupt.c   |   3 +-
 arch/s390/kvm/kvm-s390.c|  77 --
 arch/s390/kvm/kvm-s390.h|   6 +-
 arch/s390/kvm/sigp.c|   3 +-
 arch/x86/kvm/hyperv.c   |   3 +-
 arch/x86/kvm/i8254.c|   3 +-
 arch/x86/kvm/i8259.c|   7 +-
 arch/x86/kvm/ioapic.c   |   3 +-
 arch/x86/kvm/irq_comm.c |  10 +--
 arch/x86/kvm/lapic.c|   5 +-
 arch/x86/kvm/svm.c  |   3 +-
 arch/x86/kvm/vmx.c  |   5 +-
 arch/x86/kvm/x86.c  |  34 --
 include/linux/kvm_host.h|  81 ---
 virt/kvm/arm/arch_timer.c   |  10 ++-
 virt/kvm/arm/arm.c  |  25 
 virt/kvm/arm/pmu.c  |   3 +-
 virt/kvm/arm/psci.c |   7 +-
 virt/kvm/arm/vgic/vgic-init.c   |  31 -
 virt/kvm/arm/vgic/vgic-kvm-device.c |  30 +
 virt/kvm/arm/vgic/vgic-mmio-v2.c|   5 +-
 virt/kvm/arm/vgic/vgic-mmio-v3.c|  22 ---
 virt/kvm/arm/vgic/vgic.c|   3 +-
 virt/kvm/kvm_main.c | 124 +++-
 36 files changed, 278 insertions(+), 308 deletions(-)

-- 
2.13.3



[PATCH RFC v3 2/9] KVM: arm/arm64: fix vcpu self-detection in vgic_v3_dispatch_sgi()

2017-08-21 Thread Radim Krčmář
The index in kvm->vcpus array and vcpu->vcpu_id are very different
things.  Comparing struct kvm_vcpu pointers is a sure way to know.

Signed-off-by: Radim Krčmář 
---
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 408ef06638fc..9d4b69b766ec 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -797,7 +797,6 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
u16 target_cpus;
u64 mpidr;
int sgi, c;
-   int vcpu_id = vcpu->vcpu_id;
bool broadcast;
 
sgi = (reg & ICC_SGI1R_SGI_ID_MASK) >> ICC_SGI1R_SGI_ID_SHIFT;
@@ -821,7 +820,7 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
break;
 
/* Don't signal the calling VCPU */
-   if (broadcast && c == vcpu_id)
+   if (broadcast && c_vcpu == vcpu)
continue;
 
if (!broadcast) {
-- 
2.13.3



[GIT] Sparc

2017-08-21 Thread David Miller

Just a couple small fixes, two of which have to do with gcc-7:

1) Don't clobber kernel fixed registers in __multi4 libgcc helper.

2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
   from Thomas Petazzoni.

3) Adjust pmd_t initializer on sparc32 to make gcc happy.

4) If ATU isn't available, don't bark in the logs.  From Tushar Dave.

Please pull, thanks a lot.

The following changes since commit 26273939ace935dd7553b31d279eab30b40f7b9a:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-08-10 
10:30:29 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 

for you to fetch changes up to 2dc77533f1e495788d73ffa4bee4323b2646d2bb:

  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus() 
(2017-08-21 13:57:22 -0700)


David S. Miller (1):
  sparc64: Don't clibber fixed registers in __multi4.

Thomas Petazzoni (1):
  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()

Tushar Dave (1):
  sparc64: remove unnecessary log message

Zi Yan (1):
  mm: add pmd_t initializer __pmd() to work around a GCC bug.

 arch/sparc/include/asm/page_32.h |  2 ++
 arch/sparc/kernel/pci_sun4v.c|  2 --
 arch/sparc/kernel/pcic.c |  2 +-
 arch/sparc/lib/multi3.S  | 24 
 4 files changed, 15 insertions(+), 15 deletions(-)


Re: [PATCH V3 2/4] ARM: dts: rockchip: rk322x add iommu nodes

2017-08-21 Thread Heiko Stuebner
On Monday, 24 July 2017 at 10:32:08 CEST, Simon Xue wrote:
> Add VPU/VDEC/VOP/IEP iommu nodes
> 
> Signed-off-by: Simon Xue 

applied for 4.14


Thanks
Heiko


Re: [patch] fs, proc: unconditional cond_resched when reading smaps

2017-08-21 Thread Kirill A. Shutemov
On Mon, Aug 21, 2017 at 02:06:45PM -0700, David Rientjes wrote:
> If there are large numbers of hugepages to iterate while reading
> /proc/pid/smaps, the page walk never does cond_resched().  On archs
> without split pmd locks, there can be significant and observable
> contention on mm->page_table_lock which cause lengthy delays without
> rescheduling.
> 
> Always reschedule in smaps_pte_range() if necessary since the pagewalk
> iteration can be expensive.
> 
> Signed-off-by: David Rientjes 
> ---
>  fs/proc/task_mmu.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -599,11 +599,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long 
> addr, unsigned long end,
>   if (ptl) {
>   smaps_pmd_entry(pmd, addr, walk);
>   spin_unlock(ptl);
> - return 0;
> + goto out;
>   }
>  
>   if (pmd_trans_unstable(pmd))
> - return 0;
> + goto out;
>   /*
>* The mmap_sem held all the way back in m_start() is what
>* keeps khugepaged out of here and from collapsing things
> @@ -613,6 +613,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long 
> addr, unsigned long end,
>   for (; addr != end; pte++, addr += PAGE_SIZE)
>   smaps_pte_entry(pte, addr, walk);
>   pte_unmap_unlock(pte - 1, ptl);
> +out:
>   cond_resched();
>   return 0;
>  }

Maybe just call cond_resched() at the beginning of the function and don't
bother with gotos?

-- 
 Kirill A. Shutemov


Re: [PATCH] mt7601u: check memory allocation failure

2017-08-21 Thread Jakub Kicinski
On Mon, 21 Aug 2017 14:34:30 -0700, Jakub Kicinski wrote:
> On Mon, 21 Aug 2017 22:59:56 +0200, Christophe JAILLET wrote:
> > Check memory allocation failure and return -ENOMEM in such a case, as
> > already done a few lines below
> > 
> > Signed-off-by: Christophe JAILLET   
> 
> Acked-by: Jakub Kicinski 

Wait, I take that back.  This code is a bit weird.  We would return an
error, then mt7601u_dma_init() will call mt7601u_free_tx_queue() which
doesn't check for tx_q == NULL condition.

Looks like mt7601u_free_tx() has to check for dev->tx_q == NULL and
return early if that's the case.  Or mt7601u_alloc_tx() should really
clean things up on its own on failure.  Ugh.


Re: [PATCH net-next 03/11] net: dsa: debugfs: add tree

2017-08-21 Thread Florian Fainelli
On 08/14/2017 03:22 PM, Vivien Didelot wrote:
> This commit adds the boiler plate to create a DSA related debug
> filesystem entry as well as a "tree" file, containing the tree index.
> 
> # cat switch1/tree
> 0
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Florian Fainelli 
-- 
Florian


[PATCH v2] mt7601u: check memory allocation failure

2017-08-21 Thread Christophe JAILLET
Check memory allocation failure and return -ENOMEM in such a case, as
already done a few lines below.

As 'dev->tx_q' can be NULL, we also need to check for that in
'mt7601u_free_tx()', and return early.

Signed-off-by: Christophe JAILLET 
---
v2: avoid another NULL pointer dereference in 'mt7601u_free_tx()' if the
allocation had failed.
---
 drivers/net/wireless/mediatek/mt7601u/dma.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c 
b/drivers/net/wireless/mediatek/mt7601u/dma.c
index 660267b359e4..7f3e3983b781 100644
--- a/drivers/net/wireless/mediatek/mt7601u/dma.c
+++ b/drivers/net/wireless/mediatek/mt7601u/dma.c
@@ -457,6 +457,9 @@ static void mt7601u_free_tx(struct mt7601u_dev *dev)
 {
int i;
 
+   if (!dev->tx_q)
+   return;
+
for (i = 0; i < __MT_EP_OUT_MAX; i++)
	mt7601u_free_tx_queue(&dev->tx_q[i]);
 }
@@ -484,6 +487,8 @@ static int mt7601u_alloc_tx(struct mt7601u_dev *dev)
 
dev->tx_q = devm_kcalloc(dev->dev, __MT_EP_OUT_MAX,
 sizeof(*dev->tx_q), GFP_KERNEL);
+   if (!dev->tx_q)
+   return -ENOMEM;
 
for (i = 0; i < __MT_EP_OUT_MAX; i++)
if (mt7601u_alloc_tx_queue(dev, >tx_q[i]))
-- 
2.11.0



[PATCH] Staging: greybus: Make string array const

2017-08-21 Thread Eames Trinh
Added const to string array.

Signed-off-by: Eames Trinh 
---
 drivers/staging/greybus/audio_manager_module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/greybus/audio_manager_module.c 
b/drivers/staging/greybus/audio_manager_module.c
index adc16977452d..73a3e2decb3a 100644
--- a/drivers/staging/greybus/audio_manager_module.c
+++ b/drivers/staging/greybus/audio_manager_module.c
@@ -159,7 +159,7 @@ static void send_add_uevent(struct gb_audio_manager_module 
*module)
char ip_devices_string[64];
char op_devices_string[64];
 
-   char *envp[] = {
+   const char *envp[] = {
name_string,
vid_string,
pid_string,
-- 
2.11.0



Re: [PATCH 1/2] scsi: Move scsi_cmd->jiffies_at_alloc initialization to allocation time

2017-08-21 Thread Brian King
Scratch this one... Version 2 on the way with the corresponding changes
in scsi_init_request...

-Brian


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH v2] mt7601u: check memory allocation failure

2017-08-21 Thread Jakub Kicinski
On Tue, 22 Aug 2017 00:06:17 +0200, Christophe JAILLET wrote:
> Check memory allocation failure and return -ENOMEM in such a case, as
> already done a few lines below.
> 
> As 'dev->tx_q' can be NULL, we also need to check for that in
> 'mt7601u_free_tx()', and return early.
> 
> Signed-off-by: Christophe JAILLET 

Acked-by: Jakub Kicinski 


Re: kvm splat in mmu_spte_clear_track_bits

2017-08-21 Thread Adam Borowski
On Mon, Aug 21, 2017 at 09:58:34PM +0200, Radim Krčmář wrote:
> 2017-08-21 21:12+0200, Adam Borowski:
> > On Mon, Aug 21, 2017 at 09:26:57AM +0800, Wanpeng Li wrote:
> > > 2017-08-21 7:13 GMT+08:00 Adam Borowski :
> > > > I'm afraid I keep getting a quite reliable, but random, splat when 
> > > > running
> > > > KVM:
> > > 
> > > I reported something similar before. https://lkml.org/lkml/2017/6/29/64
> > 
> > Your problem seems to require OOM; I don't have any memory pressure at all:
> > running a single 2GB guest while there's nothing big on the host (bloatfox,
> > xfce, xorg, terminals + some minor junk); 8GB + (untouched) swap.  There's
> > no memory pressure inside the guest either -- none was Linux (I wanted to
> > test something on hurd, kfreebsd) and I doubt they even got to use all of
> > their frames.
> 
> I even tried hurd, but couldn't reproduce ...

Also happens with a win10 guest, and with multiple Linuxes.

> what is your qemu command
> line and the output of host's `grep . /sys/module/kvm*/parameters/*`?

qemu-system-x86_64 -enable-kvm -m 2048 -vga qxl -usbdevice tablet \
 -net bridge -net nic \
 -drive file="$DISK",cache=writeback,index=0,media=disk,discard=on

qemu-system-x86_64 -enable-kvm -m 2048 -vga qxl -usbdevice tablet \
 -net bridge -net nic \
 -drive 
file="$DISK",cache=unsafe,index=0,media=disk,discard=on,if=virtio,format=raw

/sys/module/kvm/parameters/halt_poll_ns:20
/sys/module/kvm/parameters/halt_poll_ns_grow:2
/sys/module/kvm/parameters/halt_poll_ns_shrink:0
/sys/module/kvm/parameters/ignore_msrs:N
/sys/module/kvm/parameters/kvmclock_periodic_sync:Y
/sys/module/kvm/parameters/lapic_timer_advance_ns:0
/sys/module/kvm/parameters/min_timer_period_us:500
/sys/module/kvm/parameters/tsc_tolerance_ppm:250
/sys/module/kvm/parameters/vector_hashing:Y
/sys/module/kvm_amd/parameters/avic:0
/sys/module/kvm_amd/parameters/nested:1
/sys/module/kvm_amd/parameters/npt:1
/sys/module/kvm_amd/parameters/vls:0

> > Also, it doesn't reproduce for me on 4.12.
> 
> Great info ... the most suspicious between v4.12 and v4.13-rc5 is the
> series with dcdca5fed5f6 ("x86: kvm: mmu: make spte mmio mask more
> explicit"), does reverting it help?
> 
> `git revert 
> ce00053b1cfca312c22e2a6465451f1862561eab~1..995f00a619584e65e53eff372d9b73b121a7bad5`

Alas, doesn't seem to help.

I've first installed a Debian stretch guest, the host survived both the
installation and subsequent fooling around.  But then I started a win10
guest which splatted as soon as the initial screen.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀-- Genghis Ht'rok'din
⠈⠳⣄ 


Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics

2017-08-21 Thread Florian Fainelli
On 08/04/2017 10:11 AM, Eric Dumazet wrote:
> On Fri, 2017-08-04 at 08:51 -0700, Florian Fainelli wrote:
>> On 08/03/2017 10:36 PM, Eric Dumazet wrote:
>>> On Thu, 2017-08-03 at 21:33 -0700, Florian Fainelli wrote:
 During testing with a background iperf pushing 1Gbit/sec worth of
 traffic and having both ifconfig and ethtool collect statistics, we
 could see quite frequent deadlocks. Convert the often accessed DSA slave
 network devices statistics to per-cpu 64-bit statistics to remove these
 deadlocks and provide fast efficient statistics updates.

>>>
>>> This seems to be a bug fix, it would be nice to get a proper tag like :
>>>
>>> Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
>>
>> Right, should have been added, thanks!
>>
>>>
>>> Problem here is that if multiple cpus can call dsa_switch_rcv() at the
>>> same time, then u64_stats_update_begin() contract is not respected.
>>
>> This is really where I struggled understanding what is wrong in the
>> non-per CPU version, my understanding is that we have:
>>
>> - writers for xmit executes in process context
>> - writers for receive executes from NAPI (from the DSA's master network
>> device through it's own NAPI doing netif_receive_skb -> netdev_uses_dsa
>> -> netif_receive_skb)
>>
>> readers should all execute in process context. The test scenario that
>> led to a deadlock involved running iperf in the background, having a
>> while loop with both ifconfig and ethtool reading stats, and somehow
>> when iperf exited, either reader would just be locked. So I guess this
>> leaves us with the two writers not being mutually excluded then, right?
> 
> You could add a debug version of u64_stats_update_begin()
> 
> doing 
> 
> int ret = atomic_inc((atomic_t *)syncp);
> 
> BUG_ON(ret & 1);
> 
> And u64_stats_update_end()
> 
> int ret = atomic_inc((atomic_t *)syncp);

so with your revised suggested patch:

static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
int ret = atomic_inc_return((atomic_t *)syncp);
BUG_ON(ret & 1);
#endif
#if 0
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
write_seqcount_begin(&syncp->seq);
#endif
#endif
}

static inline void u64_stats_update_end(struct u64_stats_sync *syncp)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
int ret = atomic_inc_return((atomic_t *)syncp);
BUG_ON(!(ret & 1));
#endif
#if 0
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
write_seqcount_end(&syncp->seq);
#endif
#endif
}

and this makes us choke pretty early in IRQ accounting, did I get your
suggestion right?

[0.015149] [ cut here ]
[0.020051] kernel BUG at ./include/linux/u64_stats_sync.h:82!
[0.026221] Internal error: Oops - BUG: 0 [#1] SMP ARM
[0.031661] Modules linked in:
[0.034970] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.13.0-rc5-01297-g7d3f0cd43fee-dirty #33
[0.043990] Hardware name: Broadcom STB (Flattened Device Tree)
[0.050237] task: c180a500 task.stack: c180
[0.055065] PC is at irqtime_account_delta+0xa4/0xa8
[0.060322] LR is at 0x1
[0.063057] pc : []lr : [<0001>]psr: 01d3
[0.069652] sp : c1801eec  ip : ee78b458  fp : c0e5ea48
[0.075212] r10: c18b4b40  r9 : f0803000  r8 : ee00a800
[0.080781] r7 : 0001  r6 : c180a500  r5 : c180  r4 : 
[0.087680] r3 :   r2 : ec8c  r1 : ee78b3c0  r0 : ee78b440
[0.094546] Flags: nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM
Segment user
[0.102314] Control: 30c5387d  Table: 3000  DAC: fffd
[0.108414] Process swapper/0 (pid: 0, stack limit = 0xc1800210)
[0.114791] Stack: (0xc1801eec to 0xc1802000)
[0.119431] 1ee0:ee78b440 c180
c180a500 0001 c02505c8
[0.128079] 1f00: 0004 ee00a800 e000  
c0227890 c17e6f20 c0278910
[0.136665] 1f20: c185724c c18079a0 f080200c c1801f58 f0802000
c0201494 c0e00c18 2053
[0.145303] 1f40:  c1801f8c  c180 c18b4b40
c020d238  001f
[0.153915] 1f60: 00040d00  efffc940  c18b4b40
c1807440  
[0.162571] 1f80: c18b4b40 c0e5ea48 0004 c1801fa8 c0322fb0
c0e00c18 2053 
[0.171226] 1fa0: c18b4b40    
c0e006c0  
[0.179890] 1fc0:  c1807448 c0e5ea48  
c18b4dd4 c180745c c0e5ea44
[0.188546] 1fe0: c180c0d0 7000 420f00f3  
8090  
[0.197165] [] (irqtime_account_delta) from []
(irqtime_account_irq+0xc0/0xc4)
[0.206664] [] (irqtime_account_irq) from []
(irq_exit+0x28/0x154)
[0.215012] [] (irq_exit) from []
(__handle_domain_irq+0x60/0xb4)
[0.223245] [] (__handle_domain_irq) from []
(gic_handle_irq+0x48/0x8c)
[0.232035] [] (gic_handle_irq) from []
(__irq_svc+0x58/0x74)
[0.239941] Exception stack(0xc1801f58 to 0xc1801fa0)
[

Re: [PATCH] ASoC: simple-scu-card: Parse off codec widgets

2017-08-21 Thread Kuninori Morimoto

Hi Daniel

> > > @@ -24,6 +24,7 @@ Optional subnode properties:
> > >  - simple-audio-card,convert-rate : platform specified sampling rate 
> > > convert
> > >  - simple-audio-card,convert-channels : platform specified converted 
> > > channel size (2 - 8 ch)
> > >  - simple-audio-card,prefix   : see routing
> > > +- simple-audio-card,widgets  : Please refer to widgets.txt.
> > >  - simple-audio-card,routing  : A list of the connections 
> > > between audio components.
> > >     Each entry is a pair of strings, the 
> > > first being the connection's sink,
> > >     the second being the connection's 
> > > source. Valid names for sources.
> > It can be "see simple-audio-card.txt" same as other properties.
> > Not a big deal though
> > 
> > Acked-by: Kuninori Morimoto 
> 
> Thanks for having a look. I don't have a strong preference, but given that
> the patch was already pushed and when you'll go to simple-audio-card.txt it
> will point you to widgets.txt we can leave it like that.

Thanks.
No problem for me, it is not a big deal :)

Best regards
---
Kuninori Morimoto


Re: [PATCH 06/15] mtd: make device_type const

2017-08-21 Thread Boris Brezillon
On Sat, 19 Aug 2017 13:52:17 +0530,
Bhumika Goyal  wrote:

> Make this const as it is only stored in the type field of a device
> structure, which is const.
> Done using Coccinelle.
> 

Applied to l2-mtd/master.

Thanks,

Boris

> Signed-off-by: Bhumika Goyal 
> ---
>  drivers/mtd/mtdcore.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> index f872a99..e7ea842 100644
> --- a/drivers/mtd/mtdcore.c
> +++ b/drivers/mtd/mtdcore.c
> @@ -340,7 +340,7 @@ static ssize_t mtd_bbtblocks_show(struct device *dev,
>  };
>  ATTRIBUTE_GROUPS(mtd);
>  
> -static struct device_type mtd_devtype = {
> +static const struct device_type mtd_devtype = {
>   .name   = "mtd",
>   .groups = mtd_groups,
>   .release= mtd_release,



Re: [PATCH] sched/fair: move definitions to fix !CONFIG_SMP

2017-08-21 Thread Josef Bacik
On Mon, Aug 21, 2017 at 04:03:05PM -0400, jo...@toxicpanda.com wrote:
> From: Josef Bacik 
> 
> The series of patches adding runnable_avg and subsequent supporting
> patches broke on !CONFIG_SMP.  Fix this by moving the definitions under
> the appropriate checks, and moving the !CONFIG_SMP definitions higher
> up.
> 
> Signed-off-by: Josef Bacik 

Sorry ignore this, it's still screwed up.  I'll send a new series, there's
multiple broken things here.  Thanks,

Josef


Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-21 Thread Martin K. Petersen

Brian,

>> Thanks for the detailed analysis. This is very helpful. Have you
>> considered to change the ipr driver such that it terminates REPORT
>> SUPPORTED OPERATION CODES commands with the appropriate check
>> condition code instead of DID_ERROR?
>
> Yes. That data is actually in the sense buffer, but since I'm also
> setting DID_ERROR, scsi_decide_disposition isn't using it. I've got a
> patch to do just as you suggest, to stop setting DID_ERROR when there
> is more detailed error data available, but it will need some
> additional testing before I submit, as it will impact much more than
> just this case.

I agree. In this case where a command is not supported, a check
condition would be a better way to signal the failure to the SCSI
midlayer.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v3 net-next] bpf/verifier: track liveness for pruning

2017-08-21 Thread Daniel Borkmann

On 08/21/2017 08:36 PM, Edward Cree wrote:

On 19/08/17 00:37, Alexei Starovoitov wrote:

[...]

I'm tempted to just rip out env->varlen_map_value_access and always check
  the whole thing, because honestly I don't know what it was meant to do
  originally or how it can ever do any useful pruning.  While drastic, it
  does cause your test case to pass.


Original intention from 484611357c19 ("bpf: allow access into map
value arrays") was that it wouldn't potentially make pruning worse
if PTR_TO_MAP_VALUE_ADJ was not used, meaning that we wouldn't need
to take reg state's min_value and max_value into account for state
checking; this was basically due to min_value / max_value is being
adjusted/tracked on every alu/jmp ops for involved regs (e.g.
adjust_reg_min_max_vals() and others that mangle them) even if we
have the case that no actual dynamic map access is used throughout
the program. To give an example on net tree, the bpf_lxc.o prog's
section increases from 36,386 to 68,226 when env->varlen_map_value_access
is always true, so it does have an effect. Did you do some checks
on this on net-next?


Re: [PATCH] XEN/xen-kbdfront: Enable auto repeat for xen keyboard front driver

2017-08-21 Thread Dmitry Torokhov
On Mon, Aug 21, 2017 at 12:30 PM, Boris Ostrovsky
 wrote:
>
> Adding maintainer (Dmitry).

I can't seem to find the original in my mailbox nor in patchwork. Can
you please resend?

>
>
> -boris
>
> On 08/21/2017 11:41 AM, Liang Yan wrote:
> > Long-pressed keys could not display correctly in the XEN vncviewer after
> > the tigervnc client changed the way it sends repeated keys, from "Down Up
> > Down Up ..." to "Down Down Down ...". By enabling the EV_REP bit here, the
> > XEN keyboard device triggers the input subsystem's default auto-repeat
> > processing, making auto-repeat keys work correctly.
> >
> > Signed-off-by: Liang Yan 
> > ---
> >  drivers/input/misc/xen-kbdfront.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/input/misc/xen-kbdfront.c
> > b/drivers/input/misc/xen-kbdfront.c
> > index fa130e7b734c..0dce9830e2f4 100644
> > --- a/drivers/input/misc/xen-kbdfront.c
> > +++ b/drivers/input/misc/xen-kbdfront.c
> > @@ -248,6 +248,7 @@ static int xenkbd_probe(struct xenbus_device *dev,
> >  kbd->id.product = 0x;
> >
> >  __set_bit(EV_KEY, kbd->evbit);
> > +__set_bit(EV_REP, kbd->evbit);
> >  for (i = KEY_ESC; i < KEY_UNKNOWN; i++)
> >  __set_bit(i, kbd->keybit);
> >  for (i = KEY_OK; i < KEY_MAX; i++)
> > --
> > 2.14.0
> >
>

Thanks.

-- 
Dmitry


Re: [PATCH v2 1/2] Input: Add driver for Cypress Generation 5 touchscreen

2017-08-21 Thread Maxime Ripard
On Fri, Aug 18, 2017 at 08:20:44AM +0200, Mylène Josserand wrote:
> This is the basic driver for the Cypress TrueTouch Gen5 touchscreen
> controllers. This driver supports only the I2C bus but it uses regmap
> so SPI support could be added later.
> The touchscreen can report some predefined zones that are handled as
> buttons (according to the hardware). That is why it handles both
> button and multitouch events.
> 
> Signed-off-by: Mylène Josserand 

Reviewed-by: Maxime Ripard 

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


signature.asc
Description: PGP signature


[PATCH] f2fs: issue discard commands if gc_urgent is set

2017-08-21 Thread Jaegeuk Kim
It's time to issue all the discard commands, if the user sets the idle time.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/segment.c | 6 +-
 fs/f2fs/sysfs.c   | 5 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 1387925a0d83..e3922f902c8c 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -21,6 +21,7 @@
 #include "f2fs.h"
 #include "segment.h"
 #include "node.h"
+#include "gc.h"
 #include "trace.h"
 #include 
 
@@ -1194,8 +1195,11 @@ static int issue_discard_thread(void *data)
if (kthread_should_stop())
return 0;
 
-   if (dcc->discard_wake)
+   if (dcc->discard_wake) {
dcc->discard_wake = 0;
+   if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
+   mark_discard_range_all(sbi);
+   }
 
sb_start_intwrite(sbi->sb);
 
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 4bcaa9059026..b9ad9041559f 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -178,8 +178,13 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
f2fs_reset_iostat(sbi);
if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) {
+   struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
+
sbi->gc_thread->gc_wake = 1;
	wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head);
+
+   dcc->discard_wake = 1;
+   wake_up_interruptible_all(&dcc->discard_wait_queue);
}
 
return count;
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



[PATCH 3/4] w1-masters: Delete an error message for a failed memory allocation in four functions

2017-08-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 21 Aug 2017 21:40:29 +0200

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/w1/masters/ds2490.c| 5 ++---
 drivers/w1/masters/matrox_w1.c | 7 +--
 drivers/w1/masters/omap_hdq.c  | 4 +---
 drivers/w1/masters/w1-gpio.c   | 4 +---
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/w1/masters/ds2490.c b/drivers/w1/masters/ds2490.c
index 46ccb2fc4f60..c0ee6ca9ce93 100644
--- a/drivers/w1/masters/ds2490.c
+++ b/drivers/w1/masters/ds2490.c
@@ -998,7 +998,6 @@ static int ds_probe(struct usb_interface *intf,
-   if (!dev) {
-   pr_info("Failed to allocate new DS9490R structure.\n");
+   if (!dev)
return -ENOMEM;
-   }
+
dev->udev = usb_get_dev(udev);
if (!dev->udev) {
err = -ENOMEM;
diff --git a/drivers/w1/masters/matrox_w1.c b/drivers/w1/masters/matrox_w1.c
index d83d7c99d81d..62be2f9cdb4e 100644
--- a/drivers/w1/masters/matrox_w1.c
+++ b/drivers/w1/masters/matrox_w1.c
@@ -140,10 +140,5 @@ static int matrox_w1_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent
-   if (!dev) {
-   dev_err(&pdev->dev,
-   "%s: Failed to create new matrox_device object.\n",
-   __func__);
+   if (!dev)
return -ENOMEM;
-   }
-
 
dev->bus_master = (struct w1_bus_master *)(dev + 1);
 
diff --git a/drivers/w1/masters/omap_hdq.c b/drivers/w1/masters/omap_hdq.c
index 83fc9aab34e8..6349fcd650dc 100644
--- a/drivers/w1/masters/omap_hdq.c
+++ b/drivers/w1/masters/omap_hdq.c
@@ -669,7 +669,5 @@ static int omap_hdq_probe(struct platform_device *pdev)
-   if (!hdq_data) {
-   dev_dbg(&pdev->dev, "unable to allocate memory\n");
+   if (!hdq_data)
return -ENOMEM;
-   }
 
hdq_data->dev = dev;
platform_set_drvdata(pdev, hdq_data);
diff --git a/drivers/w1/masters/w1-gpio.c b/drivers/w1/masters/w1-gpio.c
index a90728ceec5a..6e8b18bf9fb1 100644
--- a/drivers/w1/masters/w1-gpio.c
+++ b/drivers/w1/masters/w1-gpio.c
@@ -133,7 +133,5 @@ static int w1_gpio_probe(struct platform_device *pdev)
-   if (!master) {
-   dev_err(&pdev->dev, "Out of memory\n");
+   if (!master)
return -ENOMEM;
-   }
 
err = devm_gpio_request(>dev, pdata->pin, "w1");
if (err) {
-- 
2.14.0



Re: [PATCH v3 1/5] ACPI / blacklist: add acpi_match_platform_list()

2017-08-21 Thread Kani, Toshimitsu
On Mon, 2017-08-21 at 22:31 +0200, Rafael J. Wysocki wrote:
> On Mon, Aug 21, 2017 at 7:36 PM, Borislav Petkov 
> wrote:
> > On Mon, Aug 21, 2017 at 05:23:37PM +, Kani, Toshimitsu wrote:
> > > > > 'data' here is private to the caller.  So, I do not think we
> > > > > need to define the bits.  Shall I change the name to
> > > > > 'driver_data' to make it more explicit?
> > > > 
> > > > You changed it to 'data'. It was a u32-used-as-boolean
> > > > is_critical_error before.
> > > > 
> > > > So you can just as well make it into flags and people can
> > > > extend those flags if needed. A flag bit should be enough in
> > > > most cases anyway. If they really need driver_data, then they
> > > > can add a void *member.
> > > 
> > > Hmm.. In patch 2, intel_pstate_platform_pwr_mgmt_exists() uses
> > > this field for PSS and PCC, which are enum values.  I think we
> > > should allow drivers to set any values here.  I agree that it may
> > > need to be void * if we also allow drivers to set a pointer here.
> > 
> > Let's see what Rafael prefers.
> 
> I would retain the is_critical_error field and use that for printing
> the recoverable / non-recoverable message.  This is kind of
> orthogonal to whether or not any extra data is needed and that can be
> an additional field.  In that case unsigned long should be sufficient
> to accommodate a pointer if need be.

Yes, we will retain the field.  The question is whether this field
should be retained as a driver's private data or ACPI-managed flags.  

My patch implements the former, which lets the callers define the
data values.  For instance, acpi_blacklisted() uses this field as
is_critical_error value, and intel_pstate_platform_pwr_mgmt_exists()
uses it as oem_pwr_table value.

Boris suggested the latter, which lets ACPI define the flags, which
are then used by the callers.  For instance, he suggested ACPI to
define bit0 as is_critical_error.

#define ACPI_PLAT_IS_CRITICAL_ERROR BIT(0)

Thanks,
-Toshi



Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group

2017-08-21 Thread Peter Zijlstra
On Mon, Aug 21, 2017 at 04:21:28PM +0100, Brendan Jackman wrote:
> The current use of returning NULL from find_idlest_group is broken in
> two cases:
> 
> a) The local group is not allowed.
> 
>In this case, we currently do not change this_runnable_load or
>this_avg_load from its initial value of 0, which means we return
>NULL regardless of the load of the other, allowed groups. This
>results in pointlessly continuing the find_idlest_group search
>within the local group and then returning prev_cpu from
>select_task_rq_fair.

> b) smp_processor_id() is the "idlest" and != prev_cpu.
> 
>find_idlest_group also returns NULL when the local group is
>allowed and is the idlest. The caller then continues the
>find_idlest_group search at a lower level of the current CPU's
>sched_domain hierarchy. However new_cpu is not updated. This means
>the search is pointless and we return prev_cpu from
>select_task_rq_fair.
> 

I think it's much simpler than that... but it's late, so who knows ;-)

Both cases seem predicated on the assumption that we'll return @cpu when
we don't find any idler CPU. Consider, if the local group is the idlest,
we should stick with @cpu and simply proceed with the child domain.

The confusion, and the bugs, seem to have snuck in when we started
considering @prev_cpu, whenever that was. The below is mostly code
movement to put that whole while(sd) loop into its own function.

The effective change is setting @new_cpu = @cpu when we start that loop:

@@ -6023,6 +6023,8 @@ static int wake_cap(struct task_struct *p, int cpu, int 
prev_cpu)
struct sched_group *group;
int weight;
 
+   new_cpu = cpu;
+
if (!(sd->flags & sd_flag)) {
sd = sd->child;
continue;




---
 kernel/sched/fair.c | 83 +++--
 1 file changed, 48 insertions(+), 35 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c77e4b1d51c0..3e77265c480a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5588,10 +5588,10 @@ static unsigned long capacity_spare_wake(int cpu, 
struct task_struct *p)
 }
 
 /*
- * find_idlest_cpu - find the idlest cpu among the cpus in group.
+ * find_idlest_group_cpu - find the idlest cpu among the cpus in group.
  */
 static int
-find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int 
this_cpu)
 {
unsigned long load, min_load = ULONG_MAX;
unsigned int min_exit_latency = UINT_MAX;
@@ -5640,6 +5640,50 @@ static unsigned long capacity_spare_wake(int cpu, struct 
task_struct *p)
return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : 
least_loaded_cpu;
 }
 
+static int
+find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int 
sd_flag)
+{
+   struct sched_domain *tmp;
+   int new_cpu = cpu;
+
+   while (sd) {
+   struct sched_group *group;
+   int weight;
+
+   if (!(sd->flags & sd_flag)) {
+   sd = sd->child;
+   continue;
+   }
+
+   group = find_idlest_group(sd, p, cpu, sd_flag);
+   if (!group) {
+   sd = sd->child;
+   continue;
+   }
+
+   new_cpu = find_idlest_group_cpu(group, p, cpu);
+   if (new_cpu == -1 || new_cpu == cpu) {
+   /* Now try balancing at a lower domain level of cpu */
+   sd = sd->child;
+   continue;
+   }
+
+   /* Now try balancing at a lower domain level of new_cpu */
+   cpu = new_cpu;
+   weight = sd->span_weight;
+   sd = NULL;
+   for_each_domain(cpu, tmp) {
+   if (weight <= tmp->span_weight)
+   break;
+   if (tmp->flags & sd_flag)
+   sd = tmp;
+   }
+   /* while loop will break here if sd == NULL */
+   }
+
+   return new_cpu;
+}
+
 /*
  * Implement a for_each_cpu() variant that starts the scan at a given cpu
  * (@start), and wraps around.
@@ -6019,39 +6063,8 @@ static int wake_cap(struct task_struct *p, int cpu, int 
prev_cpu)
if (sd_flag & SD_BALANCE_WAKE) /* XXX always ? */
new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
 
-   } else while (sd) {
-   struct sched_group *group;
-   int weight;
-
-   if (!(sd->flags & sd_flag)) {
-   sd = sd->child;
-   continue;
-   }
-
-   group = find_idlest_group(sd, p, cpu, sd_flag);
-   if (!group) {
-   sd = sd->child;
- 

Re: [PATCH V3 4/4] ARM64: dts: rockchip: rk3399 add iommu nodes

2017-08-21 Thread Heiko Stuebner
Am Montag, 24. Juli 2017, 10:32:10 CEST schrieb Simon Xue:
> Add VPU/VDEC/IEP/VOPL/VOPB/ISP0/ISP1 iommu nodes
> 
> Signed-off-by: Simon Xue 

applied for 4.14 (after adapting the subject a bit, dropping the
vop-mmus added via another patch)


Thanks
Heiko


Re: [PATCH] mt7601u: check memory allocation failure

2017-08-21 Thread Jakub Kicinski
On Mon, 21 Aug 2017 22:59:56 +0200, Christophe JAILLET wrote:
> Check memory allocation failure and return -ENOMEM in such a case, as
> already done a few lines below.
> 
> Signed-off-by: Christophe JAILLET 

Acked-by: Jakub Kicinski 

Thanks!


Re: [PATCH] mt7601u: check memory allocation failure

2017-08-21 Thread Christophe JAILLET

Le 21/08/2017 à 23:41, Jakub Kicinski a écrit :

On Mon, 21 Aug 2017 14:34:30 -0700, Jakub Kicinski wrote:

On Mon, 21 Aug 2017 22:59:56 +0200, Christophe JAILLET wrote:

Check memory allocation failure and return -ENOMEM in such a case, as
already done a few lines below.

Signed-off-by: Christophe JAILLET 

Acked-by: Jakub Kicinski 

Wait, I take that back.  This code is a bit weird.  We would return an
error, then mt7601u_dma_init() will call mt7601u_free_tx_queue() which
doesn't check for tx_q == NULL condition.

Looks like mt7601u_free_tx() has to check for dev->tx_q == NULL and
return early if that's the case.  Or mt7601u_alloc_tx() should really
clean things up on its own on failure.  Ugh.


You are right. Thanks for the review.

I've sent a v2 which updates 'mt7601u_free_tx()'.
Doing so sounds more in line with the spirit of this code.

CJ



[PATCH 2/3] signal: simplify compat_sigpending()

2017-08-21 Thread Dmitry V. Levin
Remove "if it's big-endian..." ifdef in compat_sigpending(),
use the endian-agnostic variant.

Suggested-by: Al Viro 
Signed-off-by: Dmitry V. Levin 
---
 kernel/signal.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index a1d0426..7d9d82b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3292,15 +3292,11 @@ SYSCALL_DEFINE1(sigpending, old_sigset_t __user *, set)
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE1(sigpending, compat_old_sigset_t __user *, set32)
 {
-#ifdef __BIG_ENDIAN
sigset_t set;
int err = do_sigpending(&set, sizeof(set.sig[0]));
if (!err)
err = put_user(set.sig[0], set32);
return err;
-#else
-   return sys_rt_sigpending((sigset_t __user *)set32, sizeof(*set32));
-#endif
 }
 #endif
 
-- 
ldv


[PATCH 1/3] signal: replace sigset_to_compat() with put_compat_sigset()

2017-08-21 Thread Dmitry V. Levin
There are 4 callers of sigset_to_compat() in the entire kernel.  One is
in sparc compat rt_sigaction(2), the rest are in kernel/signal.c itself.
All are followed by copy_to_user(), and all but the sparc one are under
"if it's big-endian..." ifdefs.

Let's transform sigset_to_compat() into put_compat_sigset() that also
calls copy_to_user().

Suggested-by: Al Viro 
Signed-off-by: Dmitry V. Levin 
---
 arch/sparc/kernel/sys_sparc32.c |  6 +++---
 include/linux/compat.h  |  3 ++-
 kernel/compat.c | 20 ++--
 kernel/signal.c | 27 ++-
 4 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/arch/sparc/kernel/sys_sparc32.c b/arch/sparc/kernel/sys_sparc32.c
index bca44f3..5e2bec9 100644
--- a/arch/sparc/kernel/sys_sparc32.c
+++ b/arch/sparc/kernel/sys_sparc32.c
@@ -159,7 +159,6 @@ COMPAT_SYSCALL_DEFINE5(rt_sigaction, int, sig,
 {
 struct k_sigaction new_ka, old_ka;
 int ret;
-   compat_sigset_t set32;
 
 /* XXX: Don't preclude handling different sized sigset_t's.  */
 if (sigsetsize != sizeof(compat_sigset_t))
@@ -167,6 +166,7 @@ COMPAT_SYSCALL_DEFINE5(rt_sigaction, int, sig,
 
 if (act) {
u32 u_handler, u_restorer;
+   compat_sigset_t set32;
 
new_ka.ka_restorer = restorer;
ret = get_user(u_handler, &act->sa_handler);
@@ -183,9 +183,9 @@ COMPAT_SYSCALL_DEFINE5(rt_sigaction, int, sig,
ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? &old_ka : NULL);
 
if (!ret && oact) {
-   sigset_to_compat(&set32, &old_ka.sa.sa_mask);
ret = put_user(ptr_to_compat(old_ka.sa.sa_handler), 
&oact->sa_handler);
-   ret |= copy_to_user(&oact->sa_mask, &set32, 
sizeof(compat_sigset_t));
+   ret |= put_compat_sigset(&oact->sa_mask, &old_ka.sa.sa_mask,
+sizeof(oact->sa_mask));
ret |= put_user(old_ka.sa.sa_flags, &oact->sa_flags);
ret |= put_user(ptr_to_compat(old_ka.sa.sa_restorer), 
&oact->sa_restorer);
if (ret)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 5a6a109..17017bb 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -453,7 +453,8 @@ asmlinkage long compat_sys_settimeofday(struct 
compat_timeval __user *tv,
 asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp);
 
 extern void sigset_from_compat(sigset_t *set, const compat_sigset_t *compat);
-extern void sigset_to_compat(compat_sigset_t *compat, const sigset_t *set);
+extern int put_compat_sigset(compat_sigset_t __user *compat,
+const sigset_t *set, unsigned int size);
 
 asmlinkage long compat_sys_migrate_pages(compat_pid_t pid,
compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes,
diff --git a/kernel/compat.c b/kernel/compat.c
index 6f0a0e7..d2bd03b 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -520,15 +520,23 @@ sigset_from_compat(sigset_t *set, const compat_sigset_t 
*compat)
 }
 EXPORT_SYMBOL_GPL(sigset_from_compat);
 
-void
-sigset_to_compat(compat_sigset_t *compat, const sigset_t *set)
+int
+put_compat_sigset(compat_sigset_t __user *compat, const sigset_t *set,
+ unsigned int size)
 {
+   /* size <= sizeof(compat_sigset_t) <= sizeof(sigset_t) */
+#ifdef __BIG_ENDIAN
+   compat_sigset_t v;
switch (_NSIG_WORDS) {
-   case 4: compat->sig[7] = (set->sig[3] >> 32); compat->sig[6] = set->sig[3];
-   case 3: compat->sig[5] = (set->sig[2] >> 32); compat->sig[4] = set->sig[2];
-   case 2: compat->sig[3] = (set->sig[1] >> 32); compat->sig[2] = set->sig[1];
-   case 1: compat->sig[1] = (set->sig[0] >> 32); compat->sig[0] = set->sig[0];
+   case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
+   case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
+   case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
+   case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = set->sig[0];
}
+   return copy_to_user(compat, &v, size) ? -EFAULT : 0;
+#else
+   return copy_to_user(compat, set, size) ? -EFAULT : 0;
+#endif
 }
 
 #ifdef CONFIG_NUMA
diff --git a/kernel/signal.c b/kernel/signal.c
index ed804a4..a1d0426 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2621,13 +2621,7 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, 
compat_sigset_t __user *, nset,
if (error)
return error;
}
-   if (oset) {
-   compat_sigset_t old32;
-   sigset_to_compat(&old32, &old_set);
-   if (copy_to_user(oset, &old32, sizeof(compat_sigset_t)))
-   return -EFAULT;
-   }
-   return 0;
+   return oset ? put_compat_sigset(oset, &old_set, sizeof(*oset)) : 0;
 #else
return sys_rt_sigprocmask(how, (sigset_t __user *)nset,
  

[PATCH 3/3] signal: lift sigset size check out of do_sigpending()

2017-08-21 Thread Dmitry V. Levin
As sigsetsize argument of do_sigpending() is not used anywhere else in
that function after the check, remove this argument and move the check
out of do_sigpending() into rt_sigpending() and its compat analog.

Suggested-by: Al Viro 
Signed-off-by: Dmitry V. Levin 
---
 kernel/signal.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 7d9d82b..894418b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2629,11 +2629,8 @@ COMPAT_SYSCALL_DEFINE4(rt_sigprocmask, int, how, 
compat_sigset_t __user *, nset,
 }
 #endif
 
-static int do_sigpending(void *set, unsigned long sigsetsize)
+static int do_sigpending(sigset_t *set)
 {
-   if (sigsetsize > sizeof(sigset_t))
-   return -EINVAL;
-
spin_lock_irq(&current->sighand->siglock);
sigorsets(set, &current->pending.signal,
  &current->signal->shared_pending.signal);
@@ -2653,7 +2650,12 @@ static int do_sigpending(void *set, unsigned long 
sigsetsize)
 SYSCALL_DEFINE2(rt_sigpending, sigset_t __user *, uset, size_t, sigsetsize)
 {
sigset_t set;
-   int err = do_sigpending(&set, sigsetsize);
+   int err;
+
+   if (sigsetsize > sizeof(*uset))
+   return -EINVAL;
+
+   err = do_sigpending(&set);
if (!err && copy_to_user(uset, &set, sigsetsize))
err = -EFAULT;
return err;
@@ -2664,7 +2666,12 @@ COMPAT_SYSCALL_DEFINE2(rt_sigpending, compat_sigset_t 
__user *, uset,
compat_size_t, sigsetsize)
 {
sigset_t set;
-   int err = do_sigpending(&set, sigsetsize);
+   int err;
+
+   if (sigsetsize > sizeof(*uset))
+   return -EINVAL;
+
+   err = do_sigpending(&set);
if (!err)
err = put_compat_sigset(uset, &set, sigsetsize);
return err;
@@ -3293,7 +3300,7 @@ SYSCALL_DEFINE1(sigpending, old_sigset_t __user *, set)
 COMPAT_SYSCALL_DEFINE1(sigpending, compat_old_sigset_t __user *, set32)
 {
sigset_t set;
-   int err = do_sigpending(&set, sizeof(set.sig[0]));
+   int err = do_sigpending(&set);
if (!err)
err = put_user(set.sig[0], set32);
return err;
-- 
ldv


linux-next: manual merge of the btrfs-kdave tree with the btrfs tree

2017-08-21 Thread Stephen Rothwell
Hi All,

(As expected) Today's linux-next merge of the btrfs-kdave tree got a
conflict in:

  fs/btrfs/compression.h

between commit:

  5c1aab1dd544 ("btrfs: Add zstd support")

from the btrfs tree and commit:

  dc2f29212a26 ("btrfs: remove unused BTRFS_COMPRESS_LAST")

from the btrfs-kdave tree.

I fixed it up (as suggested by Chris - see below) and can carry the fix
as necessary. This is now fixed as far as linux-next is concerned, but
any non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc fs/btrfs/compression.h
index 2269e00854d8,3b1b0ac15fdc..
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@@ -99,9 -99,7 +99,8 @@@ enum btrfs_compression_type
BTRFS_COMPRESS_NONE  = 0,
BTRFS_COMPRESS_ZLIB  = 1,
BTRFS_COMPRESS_LZO   = 2,
 -  BTRFS_COMPRESS_TYPES = 2,
 +  BTRFS_COMPRESS_ZSTD  = 3,
 +  BTRFS_COMPRESS_TYPES = 3,
-   BTRFS_COMPRESS_LAST  = 4,
  };

  struct btrfs_compress_op {
@@@ -129,6 -127,7 +128,8 @@@

  extern const struct btrfs_compress_op btrfs_zlib_compress;
  extern const struct btrfs_compress_op btrfs_lzo_compress;
 +extern const struct btrfs_compress_op btrfs_zstd_compress;

+ int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
+
  #endif


[PATCH] once: switch to new jump label API

2017-08-21 Thread Eric Biggers
From: Eric Biggers 

Switch the DO_ONCE() macro from the deprecated jump label API to the new
one.  The new one is more readable, and for DO_ONCE() it also makes the
generated code more icache-friendly: now the one-time initialization
code is placed out-of-line at the jump target, rather than at the inline
fallthrough case.

Signed-off-by: Eric Biggers 
---
 include/linux/once.h | 6 +++---
 lib/once.c   | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/once.h b/include/linux/once.h
index 9c98aaa87cbc..724724918e8b 100644
--- a/include/linux/once.h
+++ b/include/linux/once.h
@@ -5,7 +5,7 @@
 #include 
 
 bool __do_once_start(bool *done, unsigned long *flags);
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
unsigned long *flags);
 
 /* Call a function exactly once. The idea of DO_ONCE() is to perform
@@ -38,8 +38,8 @@ void __do_once_done(bool *done, struct static_key *once_key,
({   \
bool ___ret = false; \
static bool ___done = false; \
-   static struct static_key ___once_key = STATIC_KEY_INIT_TRUE; \
-   if (static_key_true(&___once_key)) { \
+   static DEFINE_STATIC_KEY_TRUE(___once_key);  \
+   if (static_branch_unlikely(&___once_key)) {  \
unsigned long ___flags;  \
___ret = __do_once_start(&___done, &___flags);   \
if (unlikely(___ret)) {  \
diff --git a/lib/once.c b/lib/once.c
index 05c8604627eb..831c5a6b0bb2 100644
--- a/lib/once.c
+++ b/lib/once.c
@@ -5,7 +5,7 @@
 
 struct once_work {
struct work_struct work;
-   struct static_key *key;
+   struct static_key_true *key;
 };
 
 static void once_deferred(struct work_struct *w)
@@ -14,11 +14,11 @@ static void once_deferred(struct work_struct *w)
 
work = container_of(w, struct once_work, work);
BUG_ON(!static_key_enabled(work->key));
-   static_key_slow_dec(work->key);
+   static_branch_disable(work->key);
kfree(work);
 }
 
-static void once_disable_jump(struct static_key *key)
+static void once_disable_jump(struct static_key_true *key)
 {
struct once_work *w;
 
@@ -51,7 +51,7 @@ bool __do_once_start(bool *done, unsigned long *flags)
 }
 EXPORT_SYMBOL(__do_once_start);
 
-void __do_once_done(bool *done, struct static_key *once_key,
+void __do_once_done(bool *done, struct static_key_true *once_key,
unsigned long *flags)
__releases(once_lock)
 {
-- 
2.14.1.480.gb18f417b89-goog



Re: [PATCH v5 0/3] TPS68470 PMIC drivers

2017-08-21 Thread Rafael J. Wysocki
On Tue, Aug 22, 2017 at 12:58 AM, Mani, Rajmohan
 wrote:
> Hi Andy,
>
>> > >> > This is the patch series for TPS68470 PMIC that works as a camera 
>> > >> > PMIC.
>> > >> >
>> > >> > The patch series provide the following 3 drivers, to help configure 
>> > >> > the
>> voltage regulators, clocks and GPIOs provided by the TPS68470 PMIC, to be
>> able to use the camera sensors connected to this PMIC.
>> > >> >
>> > >> > TPS68470 MFD driver:
>> > >> > This is the multi function driver that initializes the TPS68470 PMIC 
>> > >> > and
>> supports the GPIO and Op Region functions.
>> > >> >
>> > >> > TPS68470 GPIO driver:
>> > >> > This is the PMIC GPIO driver that will be used by the OS GPIO layer,
>> when the BIOS / firmware triggered GPIO access is done.
>> > >> >
>> > >> > TPS68470 Op Region driver:
>> > >> > This is the driver that will be invoked, when the BIOS / firmware
>> configures the voltage / clock for the sensors / vcm devices connected to the
>> PMIC.
>> > >> >
>> > >>
>> > >> All three patches are good to me (we did few rounds of internal
>> > >> review before posting v4)
>> > >>
>> > >> Reviewed-by: Andy Shevchenko 
>> > >
>> > > OK, so how should they be routed?
>> >
>> > Good question. I don't know how last time PMIC drivers were merged,
>> > here I think is just sane to route vi MFD with immutable branch
>> > created.
>>
>> OK
>>
>> I will assume that the series will go in through MFD then.
>>
>
> Now that the MFD and GPIO patches of v6 of this series have been applied on 
> respective trees, can you advise the next steps for the ACPI / PMIC Opregion 
> driver?

Well, it would have been better to route the whole series through one
tree.  Now it's better to wait until the two other trees get merged
and then apply the opregion patch.

Thanks,
Rafael


Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

2017-08-21 Thread Bjorn Helgaas
On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> ...
> if (pci_bus_crs_pending(id))
>   return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 6);
> 
> > I think that makes sense.  We'd want to check for CRS SV being
> > enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> > pci_enable_crs() and cache it somewhere.  Maybe a crs_sv_enabled bit
> > in the root port's pci_dev, and check it with something like what
> > pcie_root_rcb_set() does?
> > 
> 
> You can observe CRS under the following conditions
> 
> 1. root port <-> endpoint 
> 2. bridge <-> endpoint 
> 3. root port<->bridge
> 
> I was relying on the fact that we are reading 0x0001 as an indication that
> this device detected CRS. Maybe, this is too indirect.
> 
> If we also want to capture the capability, I think the right thing is to
> check the parent capability.
> 
> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> {
>   if (device type(bridge) == root port)
>   return read(root_crs_register_reg);
> 
>   if (device type(bridge) == switch)
>   return read(switch_crs_register);

I don't understand this part.  AFAIK, CRS SV is only a feature of root
ports.  The capability and enable bits are in the Root Capabilities
and Root Control registers.

It's certainly true that a device below a switch can respond with a
CRS completion, but the switch is not the requester, and my
understanding is that it would not take any action on the completion
other than passing it upstream.


Re: [PATCH] PCI: Fix and amend express capability sizes

2017-08-21 Thread Auger Eric
Hi Alex,

On 10/08/2017 18:54, Alex Williamson wrote:
> PCI_CAP_EXP_ENDPOINT_SIZEOF_V1 defines the size of the PCIe express
> capability structure for v1 devices with link, but we also have a need
> in the vfio code for sizing the capability for devices without link,
> such as root complex endpoints.  Create a separate define for this
> ending the structure before the link fields.
> 
> Additionally, this reveals that PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 is
> currently incorrect, ending the capability length before the v2 link
> fields.  Rename this to specify an RC endpoint (no link) capability
> length and move PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 to include the link
> fields as we have for the v1 version.
> 
> Signed-off-by: Alex Williamson 
> ---
>  include/uapi/linux/pci_regs.h |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index c22d3ebaca20..7439821214d1 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -513,6 +513,7 @@
>  #define  PCI_EXP_DEVSTA_URD  0x0008  /* Unsupported Request Detected */
>  #define  PCI_EXP_DEVSTA_AUXPD0x0010  /* AUX Power Detected */
>  #define  PCI_EXP_DEVSTA_TRPND0x0020  /* Transactions Pending */
> +#define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1   12  /* v1 endpoints without 
> link end here */
nit: this should have been PCI_EXP_CAP_* from the very beginning, but I
guess you don't want new defines to be named differently from the other
*SIZEOF* ones?
>  #define PCI_EXP_LNKCAP   12  /* Link Capabilities */
>  #define  PCI_EXP_LNKCAP_SLS  0x000f /* Supported Link Speeds */
>  #define  PCI_EXP_LNKCAP_SLS_2_5GB 0x0001 /* LNKCAP2 SLS Vector bit 0 */
> @@ -556,7 +557,7 @@
>  #define  PCI_EXP_LNKSTA_DLLLA0x2000  /* Data Link Layer Link Active 
> */
>  #define  PCI_EXP_LNKSTA_LBMS 0x4000  /* Link Bandwidth Management Status */
>  #define  PCI_EXP_LNKSTA_LABS 0x8000  /* Link Autonomous Bandwidth Status */
> -#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V1   20  /* v1 endpoints end 
> here */
> +#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V1   20  /* v1 endpoints with 
> link end here */
>  #define PCI_EXP_SLTCAP   20  /* Slot Capabilities */
>  #define  PCI_EXP_SLTCAP_ABP  0x0001 /* Attention Button Present */
>  #define  PCI_EXP_SLTCAP_PCP  0x0002 /* Power Controller Present */
> @@ -639,7 +640,7 @@
>  #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN0x4000  /* Enable OBFF Message 
> type B */
>  #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN0x6000  /* OBFF using WAKE# 
> signaling */
>  #define PCI_EXP_DEVSTA2  42  /* Device Status 2 */
> -#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2   44  /* v2 endpoints end 
> here */
> +#define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V2   44  /* v2 endpoints without 
> link end here */
>  #define PCI_EXP_LNKCAP2  44  /* Link Capabilities 2 */
>  #define  PCI_EXP_LNKCAP2_SLS_2_5GB   0x0002 /* Supported Speed 2.5GT/s */
>  #define  PCI_EXP_LNKCAP2_SLS_5_0GB   0x0004 /* Supported Speed 5.0GT/s */
> @@ -647,6 +648,7 @@
>  #define  PCI_EXP_LNKCAP2_CROSSLINK   0x0100 /* Crosslink supported */
>  #define PCI_EXP_LNKCTL2  48  /* Link Control 2 */
>  #define PCI_EXP_LNKSTA2  50  /* Link Status 2 */
> +#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2   52  /* v2 endpoints with 
> link end here */
Looks good to me.

Reviewed-by: Eric Auger 

Eric
>  #define PCI_EXP_SLTCAP2  52  /* Slot Capabilities 2 */
>  #define PCI_EXP_SLTCTL2  56  /* Slot Control 2 */
>  #define PCI_EXP_SLTSTA2  58  /* Slot Status 2 */
> 


Re: [PATCH] ipr: Set no_report_opcodes for RAID arrays

2017-08-21 Thread Martin K. Petersen

Brian,

> Since ipr RAID arrays do not support the MAINTENANCE_IN /
> MI_REPORT_SUPPORTED_OPERATION_CODES, set no_report_opcodes to prevent
> it from being sent.

Applied to 4.13/scsi-fixes. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


RE: [Patch v2 00/19] CIFS: Implement SMBDirect

2017-08-21 Thread Long Li
> > > Hey Long,
> > >
> > > What testing have you done with this on the various rdma transports?
> > > Does it work over IB, RoCE, and iWARP providers?
> >
> > Hi Steve,
> >
> > Currently all the tests have been done over Infiniband. We haven't
> > tested on
> RoCE
> > or iWARP, but planned to do it in the following weeks.
> >
> > Long
> 
> Ok, good.
> 
> Is this series available on github or somewhere so we can clone it and review
> it as it is applied to the kernel src?

Unfortunately they are not on github. I will look into putting them there for 
review. Will update soon.

Thanks for helping out!

> 
> Thanks,
> 
> Steve.



Re: [PATCH v3 1/5] ACPI / blacklist: add acpi_match_platform_list()

2017-08-21 Thread Rafael J. Wysocki
On Mon, Aug 21, 2017 at 7:36 PM, Borislav Petkov  wrote:
> On Mon, Aug 21, 2017 at 05:23:37PM +, Kani, Toshimitsu wrote:
>> > > 'data' here is private to the caller.  So, I do not think we need
>> > > to define the bits.  Shall I change the name to 'driver_data' to
>> > > make it more explicit?
>> >
>> > You changed it to 'data'. It was a u32-used-as-boolean
>> > is_critical_error before.
>> >
>> > So you can just as well make it into flags and people can extend
>> > those flags if needed. A flag bit should be enough in most cases
>> > anyway. If they really need driver_data, then they can add a void *
>> > member.
>>
>> Hmm.. In patch 2, intel_pstate_platform_pwr_mgmt_exists() uses this
>> field for PSS and PCC, which are enum values.  I think we should allow
>> drivers to set any values here.  I agree that it may need to be void *
>> if we also allow drivers to set a pointer here.
>
> Let's see what Rafael prefers.

I would retain the is_critical_error field and use that for printing
the recoverable / non-recoverable message.  This is kind of orthogonal
to whether or not any extra data is needed and that can be an
additional field.  In that case unsigned long should be sufficient to
accommodate a pointer if need be.

Thanks,
Rafael


Re: [PATCH v4] f2fs: introduce discard_granularity sysfs entry

2017-08-21 Thread Jaegeuk Kim
On 08/18, Chao Yu wrote:
> Hi Jaegeuk,
> 
> Sorry for the delay, the modification looks good to me. ;)

We must avoid waking up the discard thread because of pending commands
that are never issued.

From a73f8807248c2f42328a2204eab16a3b8d32c83e Mon Sep 17 00:00:00 2001
From: Chao Yu 
Date: Mon, 7 Aug 2017 23:09:56 +0800
Subject: [PATCH] f2fs: introduce discard_granularity sysfs entry

Commit d618ebaf0aa8 ("f2fs: enable small discard by default") enables
f2fs to issue 4K-sized discards in real-time discard mode. However,
issuing smaller discards costs more device lifetime while releasing less
free space in the flash device. Since f2fs can separate hot/cold data
and perform garbage collection, we can expect that small invalid regions
will soon grow through OPU, deletion, or garbage collection of valid
data, so it is better to delay or skip issuing smaller discards; this
helps reduce excessive consumption of IO bandwidth and flash storage
lifetime.

This patch makes f2fs select 64K as its default minimal granularity and
issue only discards whose size is not smaller than that granularity. It
also exposes the discard granularity as a sysfs entry for configuration
in different scenarios.

Jaegeuk Kim:
 We must issue all the accumulated discard commands when fstrim is called.
 So, I've added pend_list_tag[] to indicate whether we should issue the
 commands or not. If tag sets P_ACTIVE or P_TRIM, we have to issue them.
 P_TRIM is set once at a time, given fstrim trigger.
 In addition, issue_discard_thread is calling too much due to the number of
 discard commands remaining in the pending list. I added a timer to control
 it likewise gc_thread.

Signed-off-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
---
 Documentation/ABI/testing/sysfs-fs-f2fs |  9 
 fs/f2fs/f2fs.h  | 12 +
 fs/f2fs/segment.c   | 91 -
 fs/f2fs/sysfs.c | 23 +
 4 files changed, 121 insertions(+), 14 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
b/Documentation/ABI/testing/sysfs-fs-f2fs
index 621da3fc56c5..11b7f4ebea7c 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -57,6 +57,15 @@ Contact: "Jaegeuk Kim" 
 Description:
 Controls the issue rate of small discard commands.
 
+What:  /sys/fs/f2fs/<disk>/discard_granularity
+Date:  July 2017
+Contact:   "Chao Yu" 
+Description:
+   Controls discard granularity of inner discard thread, inner thread
+   will not issue discards with size that is smaller than granularity.
+   The unit size is one block, now only support configuring in range
+   of [1, 512].
+
What:  /sys/fs/f2fs/<disk>/max_victim_search
 Date:  January 2014
 Contact:   "Jaegeuk Kim" 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e252e5bf9791..4b993961d81d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -148,6 +148,8 @@ enum {
(BATCHED_TRIM_SEGMENTS(sbi) << (sbi)->log_blocks_per_seg)
#define MAX_DISCARD_BLOCKS(sbi)    BLKS_PER_SEC(sbi)
 #define DISCARD_ISSUE_RATE 8
+#define DEF_MIN_DISCARD_ISSUE_TIME 50  /* 50 ms, if exists */
#define DEF_MAX_DISCARD_ISSUE_TIME 60000   /* 60 s, if no candidates */
#define DEF_CP_INTERVAL    60  /* 60 secs */
 #define DEF_IDLE_INTERVAL  5   /* 5 secs */
 
@@ -196,11 +198,18 @@ struct discard_entry {
unsigned char discard_map[SIT_VBLOCK_MAP_SIZE]; /* segment discard 
bitmap */
 };
 
+/* default discard granularity of inner discard thread, unit: block count */
+#define DEFAULT_DISCARD_GRANULARITY 16
+
 /* max discard pend list number */
 #define MAX_PLIST_NUM  512
 #define plist_idx(blk_num) ((blk_num) >= MAX_PLIST_NUM ?   \
(MAX_PLIST_NUM - 1) : (blk_num - 1))
 
+#define P_ACTIVE   0x01
+#define P_TRIM 0x02
+#define plist_issue(tag)   (((tag) & P_ACTIVE) || ((tag) & P_TRIM))
+
 enum {
D_PREP,
D_SUBMIT,
@@ -236,11 +245,14 @@ struct discard_cmd_control {
struct task_struct *f2fs_issue_discard; /* discard thread */
struct list_head entry_list;/* 4KB discard entry list */
struct list_head pend_list[MAX_PLIST_NUM];/* store pending entries */
+   unsigned char pend_list_tag[MAX_PLIST_NUM];/* tag for pending entries */
struct list_head wait_list; /* store on-flushing entries */
wait_queue_head_t discard_wait_queue;   /* waiting queue for wake-up */
+   unsigned int discard_wake;  /* to wake up discard thread */
struct mutex cmd_lock;
unsigned int nr_discards;   /* # of 

Re: [PATCH 0/8] constify parisc parisc_device_id

2017-08-21 Thread Helge Deller
Hi Arvind,

On 19.08.2017 19:42, Arvind Yadav wrote:
> parisc_device_id are not supposed to change at runtime. All functions
> working with parisc_device_id provided by  work with
> const parisc_device_id. So mark the non-const structs as const.

Basically your patches are correct, but those structs aren't used
at all after bootup.
So, they are much better placed in the __initconst or __initdata
sections so that they get dropped before the kernel enters userspace.
Changing them to __initconst requires more changes to those files
than just changing one line.

So, I won't apply your patches.
Instead I've hacked up new versions in my tree which move those to 
__init* sections.

Anyway, thanks for your patches!

Helge 
 
> Arvind Yadav (8):
>   [PATCH 1/8] parisc: asp: constify parisc_device_id
>   [PATCH 2/8] parisc: ccio: constify parisc_device_id
>   [PATCH 3/8] parisc: dino: constify parisc_device_id
>   [PATCH 4/8] parisc: hppb: constify parisc_device_id
>   [PATCH 5/8] parisc: lasi: constify parisc_device_id
>   [PATCH 6/8] parisc: lba_pci: constify parisc_device_id
>   [PATCH 7/8] parisc: sba_iommu: constify parisc_device_id
>   [PATCH 8/8] parisc: wax: constify parisc_device_id
> 
>  drivers/parisc/asp.c | 2 +-
>  drivers/parisc/ccio-rm-dma.c | 2 +-
>  drivers/parisc/dino.c| 2 +-
>  drivers/parisc/hppb.c| 2 +-
>  drivers/parisc/lasi.c| 2 +-
>  drivers/parisc/lba_pci.c | 2 +-
>  drivers/parisc/sba_iommu.c   | 2 +-
>  drivers/parisc/wax.c | 2 +-
>  8 files changed, 8 insertions(+), 8 deletions(-)
> 



[PATCH V9 0/2] powerpc/dlpar: Correct display of hot-add/hot-remove CPUs and memory

2017-08-21 Thread Michael Bringmann

On Power systems with shared configurations of CPUs and memory, there
are some issues with association of additional CPUs and memory to nodes
when hot-adding resources.  These patches address some of those problems.

powerpc/numa: Correct the currently broken capability to set the
topology for shared CPUs in LPARs.  At boot time for shared CPU
lpars, the topology for each shared CPU is set to node zero, however,
this is now updated correctly using the Virtual Processor Home Node
(VPHN) capabilities information provided by the pHyp. The VPHN handling
in Linux is disabled, if PRRN handling is present.

powerpc/nodes: On systems like PowerPC which allow 'hot-add' of CPU
or memory resources, it may occur that the new resources are to be
inserted into nodes that were not used for these resources at bootup.
In the kernel, any node that is used must be defined and initialized
at boot.

This patch extracts the value of the lowest domain level (number of
allocable resources) from the "rtas" device tree property
"ibm,max-associativity-domains" to use as the maximum number of nodes
to setup as possibly available in the system.  This new setting will
override the instruction,

nodes_and(node_possible_map, node_possible_map, node_online_map);

presently seen in the function arch/powerpc/mm/numa.c:initmem_init().

If the property is not present at boot, no operation will be performed
to define or enable additional nodes.

Signed-off-by: Michael Bringmann 

Michael Bringmann (2):
  powerpc/numa: Update CPU topology when VPHN enabled
  powerpc/nodes: Ensure enough nodes avail for operations
---
Changes in V9:
  -- Calculate number of nodes via property "ibm,max-associativity-domains"



linux-next: Signed-off-by missing for commit in the arc tree

2017-08-21 Thread Stephen Rothwell
Hi Vineet,

Commit

  62611ac87d44 ("ARC: [plat-eznps] handle extra aux regs #2: kernel/entry exit")

is missing a Signed-off-by from its author.

-- 
Cheers,
Stephen Rothwell


[PATCH 2/2] scsi: Preserve retry counter through scsi_prep_fn

2017-08-21 Thread Brian King
Save / restore the retry counter in scsi_cmd in scsi_init_command.
This allows us to go back through scsi_init_command for retries
and not forget we are doing a retry.

Signed-off-by: Brian King 
---

Index: linux-2.6.git/drivers/scsi/scsi_lib.c
===
--- linux-2.6.git.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.git/drivers/scsi/scsi_lib.c
@@ -1155,6 +1155,7 @@ void scsi_init_command(struct scsi_devic
void *prot = cmd->prot_sdb;
unsigned int unchecked_isa_dma = cmd->flags & SCMD_UNCHECKED_ISA_DMA;
unsigned long jiffies_at_alloc = cmd->jiffies_at_alloc;
+   int retries = cmd->retries;
 
/* zero out the cmd, except for the embedded scsi_request */
memset((char *)cmd + sizeof(cmd->req), 0,
@@ -1166,6 +1167,7 @@ void scsi_init_command(struct scsi_devic
cmd->flags = unchecked_isa_dma;
INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
cmd->jiffies_at_alloc = jiffies_at_alloc;
+   cmd->retries = retries;
 
scsi_add_cmd_to_list(cmd);
 }



[PATCH 1/2] scsi: Move scsi_cmd->jiffies_at_alloc initialization to allocation time

2017-08-21 Thread Brian King
Move the initialization of scsi_cmd->jiffies_at_alloc to allocation
time rather than prep time. Also ensure that jiffies_at_alloc
is preserved when we go through prep. This lets us send retries
through prep again and not break the overall retry timer logic
in scsi_softirq_done.

Suggested-by: Bart Van Assche 
Signed-off-by: Brian King 
---

Index: linux-2.6.git/drivers/scsi/scsi_lib.c
===
--- linux-2.6.git.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.git/drivers/scsi/scsi_lib.c
@@ -1154,6 +1154,7 @@ void scsi_init_command(struct scsi_devic
void *buf = cmd->sense_buffer;
void *prot = cmd->prot_sdb;
unsigned int unchecked_isa_dma = cmd->flags & SCMD_UNCHECKED_ISA_DMA;
+   unsigned long jiffies_at_alloc = cmd->jiffies_at_alloc;
 
/* zero out the cmd, except for the embedded scsi_request */
memset((char *)cmd + sizeof(cmd->req), 0,
@@ -1164,7 +1165,7 @@ void scsi_init_command(struct scsi_devic
cmd->prot_sdb = prot;
cmd->flags = unchecked_isa_dma;
INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
-   cmd->jiffies_at_alloc = jiffies;
+   cmd->jiffies_at_alloc = jiffies_at_alloc;
 
scsi_add_cmd_to_list(cmd);
 }
@@ -2119,6 +2120,7 @@ static int scsi_init_rq(struct request_q
if (!cmd->sense_buffer)
goto fail;
cmd->req.sense = cmd->sense_buffer;
+   cmd->jiffies_at_alloc = jiffies;
 
if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) {
cmd->prot_sdb = kmem_cache_zalloc(scsi_sdb_cache, gfp);



Re: [PATCH v2 1/4] clk: rockchip: add rv1108 ACLK_GAMC and PCLK_GMAC ID

2017-08-21 Thread Heiko Stuebner
On Monday, 21 August 2017, 16:16:04 CEST, Elaine Zhang wrote:
> This patch exports gmac aclk and pclk for dts reference.
> 
> Signed-off-by: Elaine Zhang 

applied for 4.14


Thanks
Heiko


[PATCHv2 1/2] scsi: Move scsi_cmd->jiffies_at_alloc initialization to allocation time

2017-08-21 Thread Brian King
This second version also sets up jiffies_at_alloc in scsi_init_request.
This has been tested without the second patch in the series and
I've confirmed I now see the following in the logs after booting:

[  121.718088] sd 1:2:0:0: timing out command, waited 120s
[  121.798081] sd 1:2:1:0: timing out command, waited 120s

Without this patch I was never seeing these messages, indicating
the retry timer code wasn't working. Also, after seeing these
messages, I've confirmed there are no longer any hung tasks
in the kernel with sysrq-w, while before, without this patch,
I would see hung tasks for the scsi_report_opcodes calls which
were getting retried forever.

8<

Move the initialization of scsi_cmd->jiffies_at_alloc to allocation
time rather than prep time. Also ensure that jiffies_at_alloc
is preserved when we go through prep. This lets us send retries
through prep again and not break the overall retry timer logic
in scsi_softirq_done.

Suggested-by: Bart Van Assche 
Signed-off-by: Brian King 
---

Index: linux-2.6.git/drivers/scsi/scsi_lib.c
===
--- linux-2.6.git.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6.git/drivers/scsi/scsi_lib.c
@@ -1154,6 +1154,7 @@ void scsi_init_command(struct scsi_devic
void *buf = cmd->sense_buffer;
void *prot = cmd->prot_sdb;
unsigned int unchecked_isa_dma = cmd->flags & SCMD_UNCHECKED_ISA_DMA;
+   unsigned long jiffies_at_alloc = cmd->jiffies_at_alloc;
 
/* zero out the cmd, except for the embedded scsi_request */
memset((char *)cmd + sizeof(cmd->req), 0,
@@ -1164,7 +1165,7 @@ void scsi_init_command(struct scsi_devic
cmd->prot_sdb = prot;
cmd->flags = unchecked_isa_dma;
INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
-   cmd->jiffies_at_alloc = jiffies;
+   cmd->jiffies_at_alloc = jiffies_at_alloc;
 
scsi_add_cmd_to_list(cmd);
 }
@@ -2016,6 +2017,7 @@ static int scsi_init_request(struct blk_
if (!cmd->sense_buffer)
return -ENOMEM;
cmd->req.sense = cmd->sense_buffer;
+   cmd->jiffies_at_alloc = jiffies;
 
if (scsi_host_get_prot(shost)) {
sg = (void *)cmd + sizeof(struct scsi_cmnd) +
@@ -2119,6 +2121,7 @@ static int scsi_init_rq(struct request_q
if (!cmd->sense_buffer)
goto fail;
cmd->req.sense = cmd->sense_buffer;
+   cmd->jiffies_at_alloc = jiffies;
 
if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) {
cmd->prot_sdb = kmem_cache_zalloc(scsi_sdb_cache, gfp);



Re: [PATCH v2] KVM: nVMX: Fix trying to cancel vmlauch/vmresume

2017-08-21 Thread Wanpeng Li
2017-08-22 0:20 GMT+08:00 Radim Krčmář :
> 2017-08-18 07:11-0700, Wanpeng Li:
>> From: Wanpeng Li 
>>
>> [ cut here ]
>> WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 
>> nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
>> CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW  OE   4.13.0-rc4+ 
>> #11
>> RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
>> Call Trace:
>>  ? kvm_multiple_exception+0x149/0x170 [kvm]
>>  ? handle_emulation_failure+0x79/0x230 [kvm]
>>  ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
>>  ? check_chain_key+0x137/0x1e0
>>  ? reexecute_instruction.part.168+0x130/0x130 [kvm]
>>  nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
>>  ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
>>  vmx_queue_exception+0x197/0x300 [kvm_intel]
>>  kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
>>  ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
>>  ? preempt_count_sub+0x18/0xc0
>>  ? restart_apic_timer+0x17d/0x300 [kvm]
>>  ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
>>  ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
>>  kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
>>  ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
>>  ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]
>>
>> The flag "nested_run_pending", which can override the decision of which 
>> should run
>> next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not 
>> L1. This
>> is necessary in particular when L1 did a VMLAUNCH of L2 and therefore 
>> expects L2 to
>> be run (and perhaps be injected with an event it specified, etc.). 
>> Nested_run_pending
>> is especially intended to avoid switching  to L1 in the injection 
>> decision-point.
>>
>> I catch this in the queue exception path, this patch fixes it by requesting
>> an immediate VM exit from L2 and keeping the exception for L1 pending for a
>> subsequent nested VM exit.
>>
>> Cc: Paolo Bonzini 
>> Cc: Radim Krčmář 
>> Signed-off-by: Wanpeng Li 
>> ---
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> @@ -6356,8 +6356,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, 
>> bool req_int_win)
>>   kvm_update_dr7(vcpu);
>>   }
>
> Hm, we shouldn't execute the code above if exception won't be injected.
>
>>
>> - kvm_x86_ops->queue_exception(vcpu);
>> - return 0;
>
> vmx_complete_interrupts() assumes that the exception is always injected,
> so it would be dropped by kvm_clear_exception_queue().
>
> I'm starting to wonder whether getting rid of nested_run_pending
> wouldn't be nicer.

Yeah, I rethought your concern about nested_run_pending with a return
value of 0. The path in the calltrace is actually the else branch in
nested_vmx_check_exception(): an exception will be injected into L2 by
L1 if L1 owns the exception, otherwise it is injected by L0 directly.
For the nested_run_pending case with return value 0, we can treat it as
L0 injecting the exception into L2 directly, so no exception is
injected into the wrong guest.

Regards,
Wanpeng Li


Re: [PATCH v2] KVM: nVMX: Fix trying to cancel vmlauch/vmresume

2017-08-21 Thread Wanpeng Li
2017-08-22 6:55 GMT+08:00 Wanpeng Li :
> 2017-08-22 0:20 GMT+08:00 Radim Krčmář :
>> 2017-08-18 07:11-0700, Wanpeng Li:
>>> From: Wanpeng Li 
>>>
>>> [ cut here ]
>>> WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 
>>> nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
>>> CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW  OE   
>>> 4.13.0-rc4+ #11
>>> RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
>>> Call Trace:
>>>  ? kvm_multiple_exception+0x149/0x170 [kvm]
>>>  ? handle_emulation_failure+0x79/0x230 [kvm]
>>>  ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
>>>  ? check_chain_key+0x137/0x1e0
>>>  ? reexecute_instruction.part.168+0x130/0x130 [kvm]
>>>  nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
>>>  ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
>>>  vmx_queue_exception+0x197/0x300 [kvm_intel]
>>>  kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
>>>  ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
>>>  ? preempt_count_sub+0x18/0xc0
>>>  ? restart_apic_timer+0x17d/0x300 [kvm]
>>>  ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
>>>  ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
>>>  kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
>>>  ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
>>>  ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]
>>>
>>> The flag "nested_run_pending", which can override the decision of which 
>>> should run
>>> next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not 
>>> L1. This
>>> is necessary in particular when L1 did a VMLAUNCH of L2 and therefore 
>>> expects L2 to
>>> be run (and perhaps be injected with an event it specified, etc.). 
>>> Nested_run_pending
>>> is especially intended to avoid switching  to L1 in the injection 
>>> decision-point.
>>>
>>> I catch this in the queue exception path, this patch fixes it by requesting
>>> an immediate VM exit from L2 and keeping the exception for L1 pending for a
>>> subsequent nested VM exit.
>>>
>>> Cc: Paolo Bonzini 
>>> Cc: Radim Krčmář 
>>> Signed-off-by: Wanpeng Li 
>>> ---
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> @@ -6356,8 +6356,8 @@ static int inject_pending_event(struct kvm_vcpu 
>>> *vcpu, bool req_int_win)
>>>   kvm_update_dr7(vcpu);
>>>   }
>>
>> Hm, we shouldn't execute the code above if exception won't be injected.
>>
>>>
>>> - kvm_x86_ops->queue_exception(vcpu);
>>> - return 0;
>>
>> vmx_complete_interrupts() assumes that the exception is always injected,
>> so it would be dropped by kvm_clear_exception_queue().
>>
>> I'm starting to wonder whether getting rid of nested_run_pending
>> wouldn't be nicer.
>
> Yeah, I rethink of your concern for nested_run_pending w/ return value
> is 0, actually the path in the calltrace is the else branch in
> nested_vmx_check_exception(), an exception will be injected to L2 by
> L1 if L1 owns this exception, otherwise injected by L0 directly. For
> the nested_run_pending w/ return value is 0 stuff, we can treat it as
> L0 injects the exception to L2 directly. So there is no exception is
> injected to wrong guest.

I just sent out v3 to move the nested_run_pending stuff to the else branch.

Regards,
Wanpeng Li


[PATCH v3] KVM: nVMX: Fix trying to cancel vmlauch/vmresume

2017-08-21 Thread Wanpeng Li
From: Wanpeng Li 

[ cut here ]
WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 
nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW  OE   4.13.0-rc4+ #11
RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
Call Trace:
 ? kvm_multiple_exception+0x149/0x170 [kvm]
 ? handle_emulation_failure+0x79/0x230 [kvm]
 ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
 ? check_chain_key+0x137/0x1e0
 ? reexecute_instruction.part.168+0x130/0x130 [kvm]
 nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 vmx_queue_exception+0x197/0x300 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
 ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
 ? preempt_count_sub+0x18/0xc0
 ? restart_apic_timer+0x17d/0x300 [kvm]
 ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
 ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
 kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]

The flag "nested_run_pending" can override the decision of which should run
next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1.
This is necessary in particular when L1 did a VMLAUNCH of L2 and therefore
expects L2 to be run (and perhaps be injected with an event it specified,
etc.). nested_run_pending is especially intended to avoid switching to L1 at
the injection decision-point.

I caught this in the queue exception path; this patch fixes it by running L2
next instead of L1 and injecting the pending exception into L2 directly.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * move the nested_run_pending to the else branch
v1 -> v2:
 * request an immediate VM exit from L2 and keep the exception for 
   L1 pending for a subsequent nested VM exit

 arch/x86/kvm/vmx.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e398946..685f51e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2488,6 +2488,10 @@ static int nested_vmx_check_exception(struct kvm_vcpu 
*vcpu)
}
} else {
unsigned long exit_qual = 0;
+
+   if (to_vmx(vcpu)->nested.nested_run_pending)
+   return 0;
+
if (nr == DB_VECTOR)
exit_qual = vcpu->arch.dr6;
 
-- 
2.7.4



Re: [PATCH] Fix compat_sys_sigpending breakage introduced by v4.13-rc1~6^2~12

2017-08-21 Thread Dmitry V. Levin
On Sun, Aug 06, 2017 at 07:22:03PM +0100, Al Viro wrote:
> On Sat, Aug 05, 2017 at 11:00:50PM +0300, Dmitry V. Levin wrote:
> > The latest change of compat_sys_sigpending has broken it in two ways.
> > 
> > First, it tries to write 4 bytes more than userspace expects:
> > sizeof(old_sigset_t) == sizeof(long) == 8 instead of
> > sizeof(compat_old_sigset_t) == sizeof(u32) == 4.
> > 
> > Second, on big endian architectures these bytes are being written
> > in the wrong order.
> 
> > @@ -3303,12 +3303,15 @@ SYSCALL_DEFINE1(sigpending, old_sigset_t __user *, 
> > set)
> >  #ifdef CONFIG_COMPAT
> >  COMPAT_SYSCALL_DEFINE1(sigpending, compat_old_sigset_t __user *, set32)
> >  {
> > +#ifdef __BIG_ENDIAN
> > sigset_t set;
> > -   int err = do_sigpending(&set, sizeof(old_sigset_t)); 
> > -   if (err == 0)
> > -   if (copy_to_user(set32, &set, sizeof(old_sigset_t)))
> > -   err = -EFAULT;
> > +   int err = do_sigpending(&set, sizeof(set.sig[0]));
> > +   if (!err)
> > +   err = put_user(set.sig[0], set32);
> > return err;
> > +#else
> > +   return sys_rt_sigpending((sigset_t __user *)set32, sizeof(*set32));
> > +#endif
> 
> Interesting...  Basically, your fix makes it parallel to compat 
> rt_sigpending(2);
> I agree that the bug is real and gets fixed by that, but...  rt_sigpending()
> itself looks a bit fishy.  There we have
> compat_sigset_t set32;
> sigset_to_compat(&set32, &set);
> /* we can get here only if sigsetsize <= sizeof(set) */
> if (copy_to_user(uset, &set32, sigsetsize))
> err = -EFAULT;
> in big-endian case; now, there are 4 callers of sigset_to_compat() in the
> entire kernel.  One in sparc compat rt_sigaction(2), the rest in 
> kernel/signal.c
> itself.  All are followed by copy_to_user(), and all but the sparc one are
> under that kind of "if it's big-endian..." ifdefs.
> 
> Looks like it might make sense to do this:
> put_compat_sigset(compat_sigset_t __user *compat, const sigset_t *set, int 
> size)
> {
> #ifdef 
>   compat_sigset_t v;
> switch (_NSIG_WORDS) {
> case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
> case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
> case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
> case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = set->sig[0];
> }
>   return copy_to_user(compat, &v, size) ? -EFAULT : 0;
> #else
>   return copy_to_user(compat, set, size) ? -EFAULT : 0;
> #endif
> }
> 
> int put_compat_old_sigset(compat_old_sigset_t __user *compat, const sigset_t 
> *set)
> {
>   /* we want bits 0--31 of the bitmap */
>   return put_user(compat, set->sig[0]);
> }
[...]
> COMPAT_SYSCALL_DEFINE2(sigpending, compat_old_sigset_t __user *, uset)
> {
> sigset_t set;
> int err = do_sigpending(, sizeof(set));
> if (!err)
>   err = put_compat_old_sigset(uset, &set);
> return err;
> }

I don't think a separate function for put_user(compat, set->sig[0])
is needed given that its only user is going to be compat_sigpending().

Introducing put_compat_sigset() and moving sigset size check out
of do_sigpending() definitely makes sense, patches will follow shortly.


-- 
ldv


Re: [PATCH v2 1/8] dt-bindings: mediatek: Add binding for mt2712 IOMMU and SMI

2017-08-21 Thread Rob Herring
On Mon, Aug 21, 2017 at 07:00:14PM +0800, Yong Wu wrote:
> This patch adds descriptions for mt2712 IOMMU and SMI.
> 
> In order to balance the bandwidth, mt2712 has two M4Us, two
> smi-commons, 10 smi-larbs. and mt2712 is also MTK IOMMU gen2 which
> uses ARM Short-Descriptor translation table format.
> 
> The mt2712 M4U-SMI HW diagram is as below:
> 
> EMI
>  |
>   
>   |  |
>  M4U0  M4U1
>   |  |
>  smi-common0smi-common1
>   |  |
>   -   
>   | | | | |   | || | |
>   | | | | |   | || | |
> larb0 larb1 larb2 larb3 larb6larb4larb5larb7 larb8 larb9
> disp0 vdec  cam   venc   jpg  mdp1/disp1 mdp2/disp2 mdp3 vdo/nr tvd
> 
> All the connections are HW fixed, SW can NOT adjust it.
> 
> Signed-off-by: Yong Wu 
> ---
> Hi Rob,
> Comparing with the v1, I add larb8 and larb9 in this version.
> So I don't add your ACK here.

Thanks for the explanation. That's minor enough you could have kept it.

Acked-by: Rob Herring 

> ---
>  .../devicetree/bindings/iommu/mediatek,iommu.txt   |   6 +-
>  .../memory-controllers/mediatek,smi-common.txt |   6 +-
>  .../memory-controllers/mediatek,smi-larb.txt   |   5 +-
>  include/dt-bindings/memory/mt2712-larb-port.h  | 102 
> +
>  4 files changed, 113 insertions(+), 6 deletions(-)
>  create mode 100644 include/dt-bindings/memory/mt2712-larb-port.h


Re: [PATCH net-next] net: dsa: User per-cpu 64-bit statistics

2017-08-21 Thread Florian Fainelli
On 08/21/2017 04:23 PM, Florian Fainelli wrote:
> On 08/04/2017 10:11 AM, Eric Dumazet wrote:
>> On Fri, 2017-08-04 at 08:51 -0700, Florian Fainelli wrote:
>>> On 08/03/2017 10:36 PM, Eric Dumazet wrote:
 On Thu, 2017-08-03 at 21:33 -0700, Florian Fainelli wrote:
> During testing with a background iperf pushing 1Gbit/sec worth of
> traffic and having both ifconfig and ethtool collect statistics, we
> could see quite frequent deadlocks. Convert the often accessed DSA slave
> network devices statistics to per-cpu 64-bit statistics to remove these
> deadlocks and provide fast efficient statistics updates.
>

 This seems to be a bug fix, it would be nice to get a proper tag like :

 Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
>>>
>>> Right, should have been added, thanks!
>>>

 Problem here is that if multiple cpus can call dsa_switch_rcv() at the
 same time, then u64_stats_update_begin() contract is not respected.
>>>
>>> This is really where I struggled understanding what is wrong in the
>>> non-per CPU version, my understanding is that we have:
>>>
>>> - writers for xmit executes in process context
>>> - writers for receive executes from NAPI (from the DSA's master network
>>> device through it's own NAPI doing netif_receive_skb -> netdev_uses_dsa
>>> -> netif_receive_skb)
>>>
>>> readers should all execute in process context. The test scenario that
>>> led to a deadlock involved running iperf in the background, having a
>>> while loop with both ifconfig and ethtool reading stats, and somehow
>>> when iperf exited, either reader would just be locked. So I guess this
>>> leaves us with the two writers not being mutually excluded then, right?
>>
>> You could add a debug version of u64_stats_update_begin()
>>
>> doing 
>>
>> int ret = atomic_inc((atomic_t *)syncp);
>>
>> BUG_ON(ret & 1);
>>
>> And u64_stats_update_end()
>>
>> int ret = atomic_inc((atomic_t *)syncp);
> 
> so with your revised suggested patch:
> 
> static inline void u64_stats_update_begin(struct u64_stats_sync *syncp)
> {
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> int ret = atomic_inc_return((atomic_t *)syncp);
> BUG_ON(ret & 1);
> #endif
> #if 0
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> write_seqcount_begin(&syncp->seq);
> #endif
> #endif
> }
> 
> static inline void u64_stats_update_end(struct u64_stats_sync *syncp)
> {
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> int ret = atomic_inc_return((atomic_t *)syncp);
> BUG_ON(!(ret & 1));
> #endif
> #if 0
> #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> write_seqcount_end(&syncp->seq);
> #endif
> #endif
> }
> 
> and this makes us choke pretty early in IRQ accounting, did I get your
> suggestion right?

Well if we return 1 from atomic_inc_return() and the previous value was
zero, of course we are going to be bugging here. The idea behind the
patch I suppose is to make sure that we always get an odd number upon
u64_stats_update_begin()/entry, and an even number upon
u64_stats_update_end()/exit, right?

> 
> [0.015149] [ cut here ]
> [0.020051] kernel BUG at ./include/linux/u64_stats_sync.h:82!
> [0.026221] Internal error: Oops - BUG: 0 [#1] SMP ARM
> [0.031661] Modules linked in:
> [0.034970] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 4.13.0-rc5-01297-g7d3f0cd43fee-dirty #33
> [0.043990] Hardware name: Broadcom STB (Flattened Device Tree)
> [0.050237] task: c180a500 task.stack: c180
> [0.055065] PC is at irqtime_account_delta+0xa4/0xa8
> [0.060322] LR is at 0x1
> [0.063057] pc : []lr : [<0001>]psr: 01d3
> [0.069652] sp : c1801eec  ip : ee78b458  fp : c0e5ea48
> [0.075212] r10: c18b4b40  r9 : f0803000  r8 : ee00a800
> [0.080781] r7 : 0001  r6 : c180a500  r5 : c180  r4 : 
> [0.087680] r3 :   r2 : ec8c  r1 : ee78b3c0  r0 : ee78b440
> [0.094546] Flags: nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM
> Segment user
> [0.102314] Control: 30c5387d  Table: 3000  DAC: fffd
> [0.108414] Process swapper/0 (pid: 0, stack limit = 0xc1800210)
> [0.114791] Stack: (0xc1801eec to 0xc1802000)
> [0.119431] 1ee0:ee78b440 c180
> c180a500 0001 c02505c8
> [0.128079] 1f00: 0004 ee00a800 e000  
> c0227890 c17e6f20 c0278910
> [0.136665] 1f20: c185724c c18079a0 f080200c c1801f58 f0802000
> c0201494 c0e00c18 2053
> [0.145303] 1f40:  c1801f8c  c180 c18b4b40
> c020d238  001f
> [0.153915] 1f60: 00040d00  efffc940  c18b4b40
> c1807440  
> [0.162571] 1f80: c18b4b40 c0e5ea48 0004 c1801fa8 c0322fb0
> c0e00c18 2053 
> [0.171226] 1fa0: c18b4b40    
> c0e006c0  
> [0.179890] 1fc0:  c1807448 c0e5ea48  
> c18b4dd4 

Re: [PATCH v5 0/9] mtd: sharpslpart partition parser

2017-08-21 Thread Boris Brezillon
On Mon, 14 Aug 2017 22:48:31 +0200,
Andrea Adami wrote:

> This patchset introduces a simple partition parser for the Sharp SL
> Series PXA handhelds. More details in the commit text.
> 
> I have set in cc the ARM PXA maintainers because this is the MTD part of
> a planned wider patchset cleaning the Zaurus board files. The MFD maintainers
> are also in cc (tmio.h change).
> 
> Changelog:
> v1 first version, initial import of 2.4 sources
> v2 refactor applying many suggested fixes
> v3 put the partition parser types in the platform data
> v4 refactor after ML review
> v5 fix commit messages and subject texts, remove global, fixes after v4 review
> 
> GPL sources: http://support.ezaurus.com/developer/source/source_dl.asp
> 
> Andrea Adami (9):
>   mtd: sharpslpart: Add sharpslpart partition parser
>   mtd: nand: sharpsl: Add partition parsers platform data
>   mfd: tmio: Add partition parsers platform data
>   mtd: nand: sharpsl: Register partitions using the parsers
>   mtd: nand: tmio: Register partitions using the parsers

Applied patches 2, to 5 to nand/next.

Thanks,

Boris

>   ARM: pxa/corgi: Remove hardcoded partitioning, use sharpslpart parser
>   ARM: pxa/tosa: Remove hardcoded partitioning, use sharpslpart parser
>   ARM: pxa/spitz: Remove hardcoded partitioning, use sharpslpart parser
>   ARM: pxa/poodle: Remove hardcoded partitioning, use sharpslpart parser
> 
>  arch/arm/mach-pxa/corgi.c |  31 +---
>  arch/arm/mach-pxa/poodle.c|  28 +--
>  arch/arm/mach-pxa/spitz.c |  34 +---
>  arch/arm/mach-pxa/tosa.c  |  28 +--
>  drivers/mtd/nand/sharpsl.c|   2 +-
>  drivers/mtd/nand/tmio_nand.c  |   4 +-
>  drivers/mtd/parsers/Kconfig   |   8 +
>  drivers/mtd/parsers/Makefile  |   1 +
>  drivers/mtd/parsers/sharpslpart.c | 376 
> ++
>  include/linux/mfd/tmio.h  |   1 +
>  include/linux/mtd/sharpsl.h   |   1 +
>  11 files changed, 424 insertions(+), 90 deletions(-)
>  create mode 100644 drivers/mtd/parsers/sharpslpart.c
> 



[PATCH 0/4] w1: Adjustments for some function implementations

2017-08-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 21 Aug 2017 22:04:56 +0200

A few update suggestions were taken into account
from static source code analysis.

Markus Elfring (4):
  Delete an error message for a failed memory allocation in two functions
  Improve a size determination in two functions
  masters: Delete an error message for a failed memory allocation in four 
functions
  masters: Improve a size determination in four functions

 drivers/w1/masters/ds2482.c| 3 ++-
 drivers/w1/masters/ds2490.c| 7 +++
 drivers/w1/masters/matrox_w1.c | 7 +--
 drivers/w1/masters/mxc_w1.c| 3 +--
 drivers/w1/masters/omap_hdq.c  | 4 +---
 drivers/w1/masters/w1-gpio.c   | 7 ++-
 drivers/w1/slaves/w1_ds28e04.c | 2 +-
 drivers/w1/w1.c| 9 ++---
 drivers/w1/w1_int.c| 6 +-
 9 files changed, 14 insertions(+), 34 deletions(-)

-- 
2.14.0



[PATCH 2/4] w1: Improve a size determination in two functions

2017-08-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 21 Aug 2017 21:17:01 +0200

Replace the specification of data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/w1/slaves/w1_ds28e04.c | 2 +-
 drivers/w1/w1.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds28e04.c b/drivers/w1/slaves/w1_ds28e04.c
index ec234b846eb3..794db5e8f46f 100644
--- a/drivers/w1/slaves/w1_ds28e04.c
+++ b/drivers/w1/slaves/w1_ds28e04.c
@@ -397,5 +397,5 @@ static int w1_f1C_add_slave(struct w1_slave *sl)
struct w1_f1C_data *data = NULL;
 
if (w1_enable_crccheck) {
-   data = kzalloc(sizeof(struct w1_f1C_data), GFP_KERNEL);
+   data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
index f26c1ea280dd..9f71dc7aca3a 100644
--- a/drivers/w1/w1.c
+++ b/drivers/w1/w1.c
@@ -711,5 +711,5 @@ int w1_attach_slave_device(struct w1_master *dev, struct 
w1_reg_num *rn)
int err;
struct w1_netlink_msg msg;
 
-   sl = kzalloc(sizeof(struct w1_slave), GFP_KERNEL);
+   sl = kzalloc(sizeof(*sl), GFP_KERNEL);
if (!sl)
-- 
2.14.0



[PATCH 1/4] w1: Delete an error message for a failed memory allocation in two functions

2017-08-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 21 Aug 2017 21:05:42 +0200

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/w1/w1.c | 7 +--
 drivers/w1/w1_int.c | 6 +-
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
index 74471e7aa5cc..f26c1ea280dd 100644
--- a/drivers/w1/w1.c
+++ b/drivers/w1/w1.c
@@ -715,10 +715,5 @@ int w1_attach_slave_device(struct w1_master *dev, struct 
w1_reg_num *rn)
-   if (!sl) {
-   dev_err(&dev->dev,
-"%s: failed to allocate new slave device.\n",
-__func__);
+   if (!sl)
return -ENOMEM;
-   }
-
 
sl->owner = THIS_MODULE;
sl->master = dev;
diff --git a/drivers/w1/w1_int.c b/drivers/w1/w1_int.c
index 1c776178f598..9e37463960ed 100644
--- a/drivers/w1/w1_int.c
+++ b/drivers/w1/w1_int.c
@@ -44,9 +44,5 @@ static struct w1_master *w1_alloc_dev(u32 id, int 
slave_count, int slave_ttl,
-   if (!dev) {
-   pr_err("Failed to allocate %zd bytes for new w1 device.\n",
-   sizeof(struct w1_master));
+   if (!dev)
return NULL;
-   }
-
 
dev->bus_master = (struct w1_bus_master *)(dev + 1);
 
-- 
2.14.0



[patch] fs, proc: unconditional cond_resched when reading smaps

2017-08-21 Thread David Rientjes
If there are large numbers of hugepages to iterate while reading
/proc/pid/smaps, the page walk never does cond_resched().  On archs
without split pmd locks, there can be significant and observable
contention on mm->page_table_lock which cause lengthy delays without
rescheduling.

Always reschedule in smaps_pte_range() if necessary since the pagewalk
iteration can be expensive.

Signed-off-by: David Rientjes 
---
 fs/proc/task_mmu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -599,11 +599,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long 
addr, unsigned long end,
if (ptl) {
smaps_pmd_entry(pmd, addr, walk);
spin_unlock(ptl);
-   return 0;
+   goto out;
}
 
if (pmd_trans_unstable(pmd))
-   return 0;
+   goto out;
/*
 * The mmap_sem held all the way back in m_start() is what
 * keeps khugepaged out of here and from collapsing things
@@ -613,6 +613,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, 
unsigned long end,
for (; addr != end; pte++, addr += PAGE_SIZE)
smaps_pte_entry(pte, addr, walk);
pte_unmap_unlock(pte - 1, ptl);
+out:
cond_resched();
return 0;
 }


Re: [PATCH v3 net-next] bpf/verifier: track liveness for pruning

2017-08-21 Thread Alexei Starovoitov

On 8/21/17 1:24 PM, Edward Cree wrote:

On 18/08/17 15:16, Edward Cree wrote:

On 18/08/17 04:21, Alexei Starovoitov wrote:

It seems you're trying to sort-of do per-fake-basic block liveness
analysis, but our state_list_marks are not correct if we go with
canonical basic block definition, since we mark the jump insn and
not insn after the branch and not every basic block boundary is
properly detected.

I think the reason this works is that jump insns can't do writes.
[snip]
the sl->state will never have any write marks and it'll all just work.
But I should really test that!

I tested this, and found that, no, sl->state can have write marks, and the
 algorithm will get the wrong answer in that case.  So I've got a patch to
 make the first iteration ignore write marks, as part of a series which I
 will post shortly.  When I do so, please re-do your tests with adding
 state_list_marks in strange and exciting places; it should work wherever
 you put them.  Like you say, it "magically doesn't depend on proper basic
 block boundaries", and that's because really pruning is just a kind of
 checkpointing that just happens to be most effective when done just after
 a jump (pop_stack).

Can I have a SOB for your "grr" test program, so I can include it in the
 series?


yes. of course. just give the test some reasonable name :)



Re: [PATCH v3] livepatch: add (un)patch callbacks

2017-08-21 Thread Joe Lawrence
On Fri, Aug 18, 2017 at 03:58:16PM +0200, Petr Mladek wrote:
> On Wed 2017-08-16 15:17:04, Joe Lawrence wrote:
> > Provide livepatch modules a klp_object (un)patching notification
> > mechanism.  Pre and post-(un)patch callbacks allow livepatch modules to
> > setup or synchronize changes that would be difficult to support in only
> > patched-or-unpatched code contexts.
> > 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 194991ef9347..500dc9b2b361 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -138,6 +154,71 @@ struct klp_patch {
> >  func->old_name || func->new_func || func->old_sympos; \
> >  func++)
> >  
> > +/**
> > + * klp_is_object_loaded() - is klp_object currently loaded?
> > + * @obj:   klp_object pointer
> > + *
> > + * Return: true if klp_object is loaded (always true for vmlinux)
> > + */
> > +static inline bool klp_is_object_loaded(struct klp_object *obj)
> > +{
> > +   return !obj->name || obj->mod;
> > +}
> > +
> > +/**
> > + * klp_pre_patch_callback - execute before klp_object is patched
> > + * @obj:   invoke callback for this klp_object
> > + *
> > + * Return: status from callback
> > + *
> > + * Callers should ensure obj->patched is *not* set.
> > + */
> > +static inline int klp_pre_patch_callback(struct klp_object *obj)
> > +{
> > +   if (obj->callbacks.pre_patch)
> > +   return (*obj->callbacks.pre_patch)(obj);
> > +   return 0;
> > +}
> > +
> > +/**
> > + * klp_post_patch_callback() - execute after klp_object is patched
> > + * @obj:   invoke callback for this klp_object
> > + *
> > + * Callers should ensure obj->patched is set.
> > + */
> > +static inline void klp_post_patch_callback(struct klp_object *obj)
> > +{
> > +   if (obj->callbacks.post_patch)
> > +   (*obj->callbacks.post_patch)(obj);
> > +}
> > +
> > +/**
> > + * klp_pre_unpatch_callback() - execute before klp_object is unpatched
> > + *  and is active across all tasks
> > + * @obj:   invoke callback for this klp_object
> > + *
> > + * Callers should ensure obj->patched is set.
> > + */
> > +static inline void klp_pre_unpatch_callback(struct klp_object *obj)
> > +{
> > +   if (obj->callbacks.pre_unpatch)
> > +   (*obj->callbacks.pre_unpatch)(obj);
> > +}
> > +
> > +/**
> > + * klp_post_unpatch_callback() - execute after klp_object is unpatched,
> > + *   all code has been restored and no tasks
> > + *   are running patched code
> > + * @obj:   invoke callback for this klp_object
> > + *
> > + * Callers should ensure obj->patched is *not* set.
> > + */
> > +static inline void klp_post_unpatch_callback(struct klp_object *obj)
> > +{
> > +   if (obj->callbacks.post_unpatch)
> > +   (*obj->callbacks.post_unpatch)(obj);
> > +}
> 
> I guess that we do not want to make these function usable
> outside livepatch code. Thefore these inliners should go
> to kernel/livepatch/core.h or so.

Okay, I can stash them away in an internal header file like core.h.

> > +
> >  int klp_register_patch(struct klp_patch *);
> >  int klp_unregister_patch(struct klp_patch *);
> >  int klp_enable_patch(struct klp_patch *);
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index b9628e43c78f..ddb23e18a357 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -878,6 +890,8 @@ int klp_module_coming(struct module *mod)
> > goto err;
> > }
> >  
> > +   klp_post_patch_callback(obj);
> 
> This should be called only if (patch != klp_transition_patch).
> Otherwise, it would be called too early.

Can you elaborate a bit on this scenario?  When would the transition
patch (as I understand it, a livepatch not quite fully (un)patched) hit
the module coming/going notifier?  Is it possible to load or unload a
module like this?  I'd like to add this scenario to my test script if
possible.
 
> > +
> > break;
> > }
> > }
> > @@ -929,7 +943,10 @@ void klp_module_going(struct module *mod)
> > if (patch->enabled || patch == klp_transition_patch) {
> > pr_notice("reverting patch '%s' on unloading module '%s'\n",
> >   patch->mod->name, obj->mod->name);
> > +
> > +   klp_pre_unpatch_callback(obj);
> 
> Also the pre_unpatch() callback should be called only
> if (patch != klp_transition_patch). Otherwise, it should have
> already been called. It is not the current case but see below.

Ditto.

> > klp_unpatch_object(obj);
> > +   klp_post_unpatch_callback(obj);
> > }
> >  
> > klp_free_object_loaded(obj);
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 52c4e907c14b..0eed0df6e6d9 100644
> > --- 

Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group

2017-08-21 Thread Peter Zijlstra
On Mon, Aug 21, 2017 at 11:14:00PM +0200, Peter Zijlstra wrote:
> +static int
> +find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int sd_flag)
> +{
> + struct sched_domain *tmp;
> + int new_cpu = cpu;
> +
> + while (sd) {
> + struct sched_group *group;
> + int weight;
> +
> + if (!(sd->flags & sd_flag)) {
> + sd = sd->child;
> + continue;
> + }
> +
> + group = find_idlest_group(sd, p, cpu, sd_flag);
> + if (!group) {
> + sd = sd->child;
> + continue;
> + }
> +
> + new_cpu = find_idlest_group_cpu(group, p, cpu);
> + if (new_cpu == -1 || new_cpu == cpu) {
> + /* Now try balancing at a lower domain level of cpu */
> + sd = sd->child;
> + continue;
> + }
> +
> + /* Now try balancing at a lower domain level of new_cpu */
> + cpu = new_cpu;
> + weight = sd->span_weight;
> + sd = NULL;
> + for_each_domain(cpu, tmp) {
> + if (weight <= tmp->span_weight)
> + break;
> + if (tmp->flags & sd_flag)
> + sd = tmp;
> + }

This find-the-sd-for-another-cpu thing is horrific. And it has always
bugged me that the whole thing is O(n^2) to find a CPU.

I understand why it has this form, but scanning each CPU more than once
is just offensive.

> + /* while loop will break here if sd == NULL */
> + }
> +
> + return new_cpu;
> +}


[PATCH V9 1/2] powerpc/numa: Update CPU topology when VPHN enabled

2017-08-21 Thread Michael Bringmann

powerpc/numa: Correct the currently broken capability to set the
topology for shared CPUs in LPARs.  At boot time for shared CPU
lpars, the topology for each shared CPU is set to node zero, however,
this is now updated correctly using the Virtual Processor Home Node
(VPHN) capabilities information provided by the pHyp.

Also, update initialization checks for device-tree attributes to
independently recognize PRRN or VPHN usage.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/topology.h  |   14 ++
 arch/powerpc/mm/numa.c   |   64 +++---
 arch/powerpc/platforms/pseries/dlpar.c   |2 +
 arch/powerpc/platforms/pseries/hotplug-cpu.c |2 +
 4 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index dc4e159..85d6428 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -98,6 +98,20 @@ static inline int prrn_is_enabled(void)
 }
 #endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
 
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_NEED_MULTIPLE_NODES)
+#if defined(CONFIG_PPC_SPLPAR)
+extern int timed_topology_update(int nsecs);
+#else
+#define timed_topology_update(nsecs)	0
+#endif /* CONFIG_PPC_SPLPAR */
+#endif /* CONFIG_HOTPLUG_CPU || CONFIG_NEED_MULTIPLE_NODES */
+
+#if defined(CONFIG_PPC_SPLPAR)
+extern void shared_topology_update(void);
+#else
+#define shared_topology_update()	0
+#endif /* CONFIG_PPC_SPLPAR */
+
 #include 
 
 #ifdef CONFIG_SMP
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b95c584..3fd4536 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -906,7 +907,7 @@ void __init initmem_init(void)
 
/*
 * Reduce the possible NUMA nodes to the online NUMA nodes,
-* since we do not support node hotplug. This ensures that  we
+* since we do not support node hotplug. This ensures that we
 * lower the maximum NUMA node ID to what is actually present.
 */
nodes_and(node_possible_map, node_possible_map, node_online_map);
@@ -1148,11 +1149,32 @@ struct topology_update_data {
int new_nid;
 };
 
+#define TOPOLOGY_DEF_TIMER_SECS	60
+
 static u8 vphn_cpu_change_counts[NR_CPUS][MAX_DISTANCE_REF_POINTS];
 static cpumask_t cpu_associativity_changes_mask;
 static int vphn_enabled;
 static int prrn_enabled;
 static void reset_topology_timer(void);
+static int topology_timer_secs = TOPOLOGY_DEF_TIMER_SECS;
+static int topology_inited;
+static int topology_update_needed;
+
+/*
+ * Change polling interval for associativity changes.
+ */
+int timed_topology_update(int nsecs)
+{
+   if (nsecs > 0)
+   topology_timer_secs = nsecs;
+   else
+   topology_timer_secs = TOPOLOGY_DEF_TIMER_SECS;
+
+   if (vphn_enabled)
+   reset_topology_timer();
+
+   return 0;
+}
 
 /*
  * Store the current values of the associativity change counters in the
@@ -1246,6 +1268,12 @@ static long vphn_get_associativity(unsigned long cpu,
"hcall_vphn() experienced a hardware fault "
"preventing VPHN. Disabling polling...\n");
stop_topology_update();
+   break;
+   case H_SUCCESS:
+   printk(KERN_INFO
+   "VPHN hcall succeeded. Reset polling...\n");
+   timed_topology_update(0);
+   break;
}
 
return rc;
@@ -1323,8 +1351,11 @@ int numa_update_cpu_topology(bool cpus_locked)
struct device *dev;
int weight, new_nid, i = 0;
 
-   if (!prrn_enabled && !vphn_enabled)
+   if (!prrn_enabled && !vphn_enabled) {
+   if (!topology_inited)
+   topology_update_needed = 1;
return 0;
+   }
 
	weight = cpumask_weight(&cpu_associativity_changes_mask);
if (!weight)
@@ -1363,6 +1394,8 @@ int numa_update_cpu_topology(bool cpus_locked)
	cpumask_andnot(&cpu_associativity_changes_mask,
			&cpu_associativity_changes_mask,
			cpu_sibling_mask(cpu));
+   pr_info("Assoc chg gives same node %d for cpu%d\n",
+   new_nid, cpu);
cpu = cpu_last_thread_sibling(cpu);
continue;
}
@@ -1379,6 +1412,9 @@ int numa_update_cpu_topology(bool cpus_locked)
cpu = cpu_last_thread_sibling(cpu);
}
 
+   if (i)
+   updates[i-1].next = NULL;
+
pr_debug("Topology update for the following CPUs:\n");
	if (cpumask_weight(&updated_cpus)) {
		for (ud = &updates[0]; ud; ud = ud->next) {
@@ -1433,6 +1469,7 @@ int 

[PATCH V9 2/2] powerpc/nodes: Ensure enough nodes avail for operations

2017-08-21 Thread Michael Bringmann
To: linuxppc-...@lists.ozlabs.org

From: Michael Bringmann 

To: linux-kernel@vger.kernel.org
Cc: Michael Ellerman 
Cc: Michael Bringmann 
Cc: John Allen 
Cc: Nathan Fontenot 
Subject: [PATCH V9 2/2] powerpc/nodes: Ensure enough nodes avail for operations

powerpc/nodes: On systems like PowerPC which allow 'hot-add' of CPU
or memory resources, it may occur that the new resources are to be
inserted into nodes that were not used for these resources at bootup.
In the kernel, any node that is used must be defined and initialized
at boot.

This patch extracts the value of the lowest domain level (number of
allocable resources) from the "rtas" device tree property
"ibm,max-associativity-domains" to use as the maximum number of nodes
to setup as possibly available in the system.  This new setting will
override the instruction,

nodes_and(node_possible_map, node_possible_map, node_online_map);

presently seen in the function arch/powerpc/mm/numa.c:initmem_init().

If the property is not present at boot, no operation will be performed
to define or enable additional nodes.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/mm/numa.c |   44 
 1 file changed, 44 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 3fd4536..3ae6510 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -893,6 +893,48 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
NODE_DATA(nid)->node_spanned_pages = spanned_pages;
 }
 
+static void __init node_associativity_setup(void)
+{
+   struct device_node *rtas;
+   printk(KERN_INFO "%s:%d\n", __FUNCTION__, __LINE__);
+
+   rtas = of_find_node_by_path("/rtas");
+   if (rtas) {
+   const __be32 *prop;
+   u32 len, entries, levelval, i;
+   printk(KERN_INFO "%s:%d\n", __FUNCTION__, __LINE__);
+
+   prop = of_get_property(rtas, "ibm,max-associativity-domains", &len);
+   if (!prop || len < sizeof(unsigned int)) {
+   printk(KERN_INFO "%s:%d\n", __FUNCTION__, __LINE__);
+   goto endit;
+   }
+
+   entries = of_read_number(prop++, 1);
+
+   if (len < (entries * sizeof(unsigned int))) {
+   printk(KERN_INFO "%s:%d\n", __FUNCTION__, __LINE__);
+   goto endit;
+   }
+
+   for (i = 0; i < entries; i++)
+   levelval = of_read_number(prop++, 1);
+
+   printk(KERN_INFO "Numa nodes avail: %d (%d) \n", (int) 
levelval, (int) entries);
+
+   for (i = 0; i < levelval; i++) {
+   if (!node_possible(i)) {
+   setup_node_data(i, 0, 0);
+   node_set(i, node_possible_map);
+   }
+   }
+   }
+
+endit:
+   if (rtas)
+   of_node_put(rtas);
+}
+
 void __init initmem_init(void)
 {
int nid, cpu;
@@ -912,6 +954,8 @@ void __init initmem_init(void)
 */
nodes_and(node_possible_map, node_possible_map, node_online_map);
 
+   node_associativity_setup();
+
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn;
 



[PATCH] Staging: greybus: Fix spelling error in comment

2017-08-21 Thread Eames Trinh
Fixed a spelling error.

Signed-off-by: Eames Trinh 
---
 drivers/staging/greybus/arche-platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/greybus/arche-platform.c b/drivers/staging/greybus/arche-platform.c
index 4837aca41389..21ac92d0f533 100644
--- a/drivers/staging/greybus/arche-platform.c
+++ b/drivers/staging/greybus/arche-platform.c
@@ -196,7 +196,7 @@ static irqreturn_t arche_platform_wd_irq(int irq, void *devid)
if (arche_pdata->wake_detect_state == WD_STATE_IDLE) {
arche_pdata->wake_detect_start = jiffies;
/*
-* In the begining, when wake/detect goes low
+* In the beginning, when wake/detect goes low
 * (first time), we assume it is meant for coldboot
 * and set the flag. If wake/detect line stays low
 * beyond 30msec, then it is coldboot else fallback
-- 
2.11.0




[PATCH 0/2] Allow scsi_prep_fn to occur for retried commands

2017-08-21 Thread Brian King
The following two patches address the hang issue being observed
with Bart's patch on powerpc. The first patch moves the initialization
of jiffies_at_alloc from scsi_init_command to scsi_init_rq, and ensures
we don't zero jiffies_at_alloc in scsi_init_command. The second patch
saves / restores the retry counter in scsi_init_command which lets us
go through scsi_init_command for retries and not forget why we were
there. 

These patches have only been boot tested on my Power machine with ipr
to ensure they fix the issue I was seeing.

-Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: events: possible deadlock in __perf_event_task_sched_out

2017-08-21 Thread Peter Zijlstra
On Mon, Aug 21, 2017 at 01:58:13PM +0530, Shubham Bansal wrote:
> > This is a WARN, printk is a pig.
> 
> So, its not a bug?

No, triggering the WARN is the problem, this is just fallout after that.


RE: [PATCH v5 0/3] TPS68470 PMIC drivers

2017-08-21 Thread Mani, Rajmohan
Hi Andy,

> > >> > This is the patch series for TPS68470 PMIC that works as a camera PMIC.
> > >> >
> > >> > The patch series provide the following 3 drivers, to help configure the
> voltage regulators, clocks and GPIOs provided by the TPS68470 PMIC, to be
> able to use the camera sensors connected to this PMIC.
> > >> >
> > >> > TPS68470 MFD driver:
> > >> > This is the multi function driver that initializes the TPS68470 PMIC 
> > >> > and
> supports the GPIO and Op Region functions.
> > >> >
> > >> > TPS68470 GPIO driver:
> > >> > This is the PMIC GPIO driver that will be used by the OS GPIO layer,
> when the BIOS / firmware triggered GPIO access is done.
> > >> >
> > >> > TPS68470 Op Region driver:
> > >> > This is the driver that will be invoked, when the BIOS / firmware
> configures the voltage / clock for the sensors / vcm devices connected to the
> PMIC.
> > >> >
> > >>
> > >> All three patches are good to me (we did few rounds of internal
> > >> review before posting v4)
> > >>
> > >> Reviewed-by: Andy Shevchenko 
> > >
> > > OK, so how should they be routed?
> >
> > Good question. I don't know how last time PMIC drivers were merged,
> > here I think is just sane to route vi MFD with immutable branch
> > created.
> 
> OK
> 
> I will assume that the series will go in through MFD then.
> 

Now that the MFD and GPIO patches of v6 of this series have been applied on 
respective trees, can you advise the next steps for the ACPI / PMIC Opregion 
driver?

Thanks
Raj


Re: [PATCH net-next,1/4] hv_netvsc: Clean up unused parameter from netvsc_get_hash()

2017-08-21 Thread David Miller

All proper patch series must have a header "[PATCH xxx 0/N]" posting
which explains at a high level what the patch series does, how it does
it, and why it is doing it that way.

Therefore, please resubmit this patch series with a proper header
posting.

Thank you.


Re: [PATCH] perf record: enable multiplexing scaling via -R

2017-08-21 Thread Stephane Eranian
On Mon, Aug 21, 2017 at 4:02 PM, Andi Kleen  wrote:
>
> Stephane Eranian  writes:
> >
> > To activate, the user must use:
> > $ perf record -a -R 
>
> I don't know why you're overloading the existing raw mode?
>
> It has nothing to do with that.
>
I explained this in the changelog: it is so that it does not change any
of the processing in perf report, i.e., perf report is not faced with
data it does not know how to handle.
I am also trying to avoid adding yet another option.

>
> -Andi


RE: [PATCH v5 0/3] TPS68470 PMIC drivers

2017-08-21 Thread Mani, Rajmohan
Hi Rafael,

> >> > >> > This is the patch series for TPS68470 PMIC that works as a camera
> PMIC.
> >> > >> >
> >> > >> > The patch series provide the following 3 drivers, to help
> >> > >> > configure the
> >> voltage regulators, clocks and GPIOs provided by the TPS68470 PMIC,
> >> to be able to use the camera sensors connected to this PMIC.
> >> > >> >
> >> > >> > TPS68470 MFD driver:
> >> > >> > This is the multi function driver that initializes the
> >> > >> > TPS68470 PMIC and
> >> supports the GPIO and Op Region functions.
> >> > >> >
> >> > >> > TPS68470 GPIO driver:
> >> > >> > This is the PMIC GPIO driver that will be used by the OS GPIO
> >> > >> > layer,
> >> when the BIOS / firmware triggered GPIO access is done.
> >> > >> >
> >> > >> > TPS68470 Op Region driver:
> >> > >> > This is the driver that will be invoked, when the BIOS /
> >> > >> > firmware
> >> configures the voltage / clock for the sensors / vcm devices
> >> connected to the PMIC.
> >> > >> >
> >> > >>
> >> > >> All three patches are good to me (we did few rounds of internal
> >> > >> review before posting v4)
> >> > >>
> >> > >> Reviewed-by: Andy Shevchenko 
> >> > >
> >> > > OK, so how should they be routed?
> >> >
> >> > Good question. I don't know how last time PMIC drivers were merged,
> >> > here I think is just sane to route vi MFD with immutable branch
> >> > created.
> >>
> >> OK
> >>
> >> I will assume that the series will go in through MFD then.
> >>
> >
> > Now that the MFD and GPIO patches of v6 of this series have been applied
> on respective trees, can you advise the next steps for the ACPI / PMIC 
> Opregion
> driver?
> 
> Well, it would have been better to route the whole series through one tree.
> Now it's better to wait until the two other trees get merged and then apply 
> the
> opregion patch.
> 

Ack.
Let me get back once the other 2 trees are merged.

Thanks
Raj


Re: [PATCH RESEND 1/2] net: enable high resolution timer mode to timeout datagram sockets

2017-08-21 Thread Cong Wang
On Fri, Aug 18, 2017 at 11:44 AM, Vallish Vaidyeshwara wrote:
> -   *timeo_p = schedule_timeout(*timeo_p);
> +   /* Wait using highres timer */
> +   expires = ktime_add_ns(ktime_get(), jiffies_to_nsecs(*timeo_p));
> +   pre_sched_time = jiffies;
> +   if (schedule_hrtimeout(&expires, HRTIMER_MODE_ABS))

Does this work with MAX_SCHEDULE_TIMEOUT too??


Re: [PATCH v3 net-next] bpf/verifier: track liveness for pruning

2017-08-21 Thread Edward Cree
On 18/08/17 15:16, Edward Cree wrote:
> On 18/08/17 04:21, Alexei Starovoitov wrote:
>> It seems you're trying to sort-of do per-fake-basic block liveness
>> analysis, but our state_list_marks are not correct if we go with
>> canonical basic block definition, since we mark the jump insn and
>> not insn after the branch and not every basic block boundary is
>> properly detected.
> I think the reason this works is that jump insns can't do writes.
> [snip]
> the sl->state will never have any write marks and it'll all just work.
> But I should really test that!
I tested this, and found that, no, sl->state can have write marks, and the
 algorithm will get the wrong answer in that case.  So I've got a patch to
 make the first iteration ignore write marks, as part of a series which I
 will post shortly.  When I do so, please re-do your tests with adding
 state_list_marks in strange and exciting places; it should work wherever
 you put them.  Like you say, it "magically doesn't depend on proper basic
 block boundaries", and that's because really pruning is just a kind of
 checkpointing that just happens to be most effective when done just after
 a jump (pop_stack).

Can I have a SOB for your "grr" test program, so I can include it in the
 series?

-Ed

