date:20181107

Re: [PATCH v4 1/6] tpm: dynamically allocate active_banks array

2018-11-07 Thread Mimi Zohar

On Wed, 2018-11-07 at 11:44 +0530, Nayna Jain wrote:
> On 11/06/2018 08:31 PM, Roberto Sassu wrote:


> > @@ -878,11 +877,14 @@ static ssize_t tpm2_get_pcr_allocation(struct 
> > tpm_chip *chip)
> > if (rc)
> > goto out;
> >
> > -   count = be32_to_cpup(
> > +   chip->nr_active_banks = be32_to_cpup(
> > (__be32 *)[TPM_HEADER_SIZE + 5]);
> 
> 
> As per my understanding, the count in the TPML_PCR_SELECTION represent 
> the number of possible banks and not the number of active banks.
> TCG Structures Spec for TPM 2.0 - Table 102 mentions this as explanation 
> of #TPM_RC_SIZE.

Instead of storing the result in a local variable, the only change
here is saving the result in the chip info (nr_active_banks).
 Everything else remains the same.


> >
> > -   if (count > ARRAY_SIZE(chip->active_banks)) {
> > -   rc = -ENODEV;
> > +   chip->active_banks = kmalloc_array(chip->nr_active_banks,
> > +  sizeof(*chip->active_banks),
> > +  GFP_KERNEL);

With this change, the exact number of banks can be allocated, as done
here.  Nice! 

Mimi

> > +   if (!chip->active_banks) {
> > +   rc = -ENOMEM;
> > goto out;
> > }
> >
>

Re: [RFC 0/2] Add RISC-V cpu topology

2018-11-07 Thread Mark Rutland

On Wed, Nov 07, 2018 at 04:31:34AM +0200, Nick Kossifidis wrote:
> Mark and Sundeep thanks a lot for your feedback, I guess you convinced
> me that having a device tree binding for the scheduler is not a
> correct approach. It's not a device after all and I agree that the
> device tree shouldn't become an OS configuration file.

Good to hear.

> Regarding multiple levels of shared resources my point is that since
> cpu-map doesn't contain any information of what is shared among the
> cluster/core members it's not easy to do any further translation. Last
> time I checked the arm code that uses cpu-map, it only defines one
> domain for SMT, one for MC and then everything else is ignored. No
> matter how many clusters have been defined, anything above the core
> level is the same (and then I guess you started talking about adding
> "packages" on the representation side).

While cpu-map doesn't contain that information today, we can *add* that
information to the cpu-map binding if necessary.

> The reason I proposed to have a binding for the scheduler directly is
> not only because it's simpler and closer to what really happens in the
> code, it also makes more sense to me than the combination of cpu-map
> with all the related mappings e.g.  for numa or caches or power
> domains etc.
> 
> However you are right we could definitely augment cpu-map to include
> support for what I'm saying and clean things up, and since you are
> open about improving it here is a proposal that I hope you find
> interesting:
> 
> At first let's get rid of the  nodes, they don't make sense:
> 
> thread0 {
>  cpu = <>;
> };
> 
> A thread node can't have more than one cpu entry and any properties
> should be on the cpu node itself, so it doesn't / can't add any
> more information. We could just have an array of cpu nodes on the
>  node, it's much cleaner this way.
> 
> core0 {
>  members = <>, <>;
> };

Hold on. Rather than reinventing things from first principles, can we
please discuss what you want to *achieve*, i.e. what information you
need?

Having a node is not a significant cost, and there are reasons we may
want thread nodes. For example, it means that we can always refer to any
level of topology with a phandle, and we might want to describe
thread-affine devices in future.

There are a tonne of existing bindings that are ugly, but re-inventing
them for taste reasons alone is more costly to the ecosystem than simply
using the existing bindings. We avoid re-inventing bindings unless there
is a functional problem e.g. cases which they cannot possibly describe.

> Then let's allow the cluster and core nodes to accept attributes that are
> common for the cpus they contain. Right now this is considered invalid.
> 
> For power domains we have a generic binding described on
> Documentation/devicetree/bindings/power/power_domain.txt
> which basically says that we need to put power-domains =  specifiers>
> attribute on each of the cpu nodes.

FWIW, given this is arguably topological, I'm not personally averse to
describing this in the cpu-map, if that actually gains us more than the
complexity require to support it.

Given we don't do this for device power domains, I suspect that it's
simpler to stick with the existing binding.

> The same happens with the capacity binding specified for arm on
> Documentation/devicetree/bindings/arm/cpu-capacity.txt
> which says we should add the capacity-dmips-mhz on each of the cpu nodes.

The cpu-map was intended to expose topological dtails, and this isn't
really a topological property. For example, Arm DynamIQ systems can have
heterogeneous CPUs within clusters.

I do not think it's worth moving this, tbh.

> The same also happens with the generic numa binding on
> Documentation/devicetree/bindings/numa.txt
> which says we should add the nuna-node-id on each of the cpu nodes.

Is there a strong gain from moving this?

[...]

> Finally from the examples above I'd like to stress out that the distinction
> between a cluster and a core doesn't make much sense and it also makes the
> representation more complicated. To begin with, how would you call the setup
> on HiFive Unleashed ? A cluster of 4 cores that share the same L3 cache ?

Not knowing much about the hardware, I can't really say.

I'm not sure I follow why the distinction between a cluster and a core
is non-sensical. A cluster is always a collection of cores.

A hart could be a core in its own right, or it could be a thread under a
core, which shares functional units with other harts within that core.

Arguably, we could have mandated that the topology always needed to
describe down to a thread, even if a core only had a single thread. That
ship has sailed, however.

Thanks,
Mark.

Re: [PATCH v3] staging: olpc_dcon: olpc_dcon_xo_1.c: Switch to the gpio descriptor interface

2018-11-07 Thread Greg Kroah-Hartman

On Tue, Nov 06, 2018 at 01:13:19PM +0530, Nishad Kamdar wrote:
> Use the gpiod interface instead of the deprecated old non-descriptor
> interface in olpc_dcon_xo_1.c.
> ---
> Changes in v3:
>  - Resolve a few compilation errors.
> Changes in v2:
>  - Resolve a few compilation errors.
>  - Add a level of indirection to read and write gpios.
> Signed-off-by: Nishad Kamdar 

The signed-off-by has to be above the --- line otherwise it will get
stripped off by git when the patch is applied :(

Re: fs/ubifs/auth.c:249:2: error: implicit declaration of function 'request_key'

2018-11-07 Thread Sascha Hauer

On Wed, Nov 07, 2018 at 02:25:10AM +0800, kbuild test robot wrote:
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   8053e5b93eca9b011f7b79bb019bf1eeaaf96c4b
> commit: d8a22773a12c6d78ee758c9e530f3a488bb7cb29 ubifs: Enable authentication 
> support
> date:   2 weeks ago
> config: i386-randconfig-h1-11070135 (attached as .config)
> compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
> reproduce:
> git checkout d8a22773a12c6d78ee758c9e530f3a488bb7cb29
> # save the attached .config to linux build tree
> make ARCH=i386

Should be fixed with "[PATCH] ubifs: auth: add CONFIG_KEYS dependency"
(https://lkml.org/lkml/2018/11/2/355) already.

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

2018-11-07 Thread Michal Hocko

On Tue 06-11-18 21:51:47, Arun KS wrote:
> This patch is in preparation to a later patch which converts totalram_pages
> and zone->managed_pages to atomic variables. This patch does not introduce
> any functional changes.

I forgot to comment on this one. The patch makes a lot of sense. But I
would be little bit more conservative and won't claim "no functional
changes". As things stand now multiple reads in the same function are
racy (without holding the lock). I do not see any example of an
obviously harmful case but claiming the above is too strong of a
statement. I would simply go with something like "Please note that
re-reading the value might lead to a different value and as such it
could lead to unexpected behavior. There are no known bugs as a result
of the current code but it is better to prevent from them in principle."

> Signed-off-by: Arun KS 
> Reviewed-by: Konstantin Khlebnikov 

Other than that
Acked-by: Michal Hocko 

> ---
>  arch/um/kernel/mem.c |  3 +--
>  arch/x86/kernel/cpu/microcode/core.c |  5 +++--
>  drivers/hv/hv_balloon.c  | 19 ++-
>  fs/file_table.c  |  7 ---
>  kernel/fork.c|  5 +++--
>  kernel/kexec_core.c  |  5 +++--
>  mm/page_alloc.c  |  5 +++--
>  mm/shmem.c   |  3 ++-
>  net/dccp/proto.c |  7 ---
>  net/netfilter/nf_conntrack_core.c|  7 ---
>  net/netfilter/xt_hashlimit.c |  5 +++--
>  net/sctp/protocol.c  |  7 ---
>  12 files changed, 44 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
> index 1067469..134d3fd 100644
> --- a/arch/um/kernel/mem.c
> +++ b/arch/um/kernel/mem.c
> @@ -51,8 +51,7 @@ void __init mem_init(void)
>  
>   /* this will put all low memory onto the freelists */
>   memblock_free_all();
> - max_low_pfn = totalram_pages;
> - max_pfn = totalram_pages;
> + max_pfn = max_low_pfn = totalram_pages;
>   mem_init_print_info(NULL);
>   kmalloc_ok = 1;
>  }
> diff --git a/arch/x86/kernel/cpu/microcode/core.c 
> b/arch/x86/kernel/cpu/microcode/core.c
> index 2637ff0..99c67ca 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -434,9 +434,10 @@ static ssize_t microcode_write(struct file *file, const 
> char __user *buf,
>  size_t len, loff_t *ppos)
>  {
>   ssize_t ret = -EINVAL;
> + unsigned long totalram_pgs = totalram_pages;
>  
> - if ((len >> PAGE_SHIFT) > totalram_pages) {
> - pr_err("too much data (max %ld pages)\n", totalram_pages);
> + if ((len >> PAGE_SHIFT) > totalram_pgs) {
> + pr_err("too much data (max %ld pages)\n", totalram_pgs);
>   return ret;
>   }
>  
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 4163151..cac4945 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1090,6 +1090,7 @@ static void process_info(struct hv_dynmem_device *dm, 
> struct dm_info_msg *msg)
>  static unsigned long compute_balloon_floor(void)
>  {
>   unsigned long min_pages;
> + unsigned long totalram_pgs = totalram_pages;
>  #define MB2PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
>   /* Simple continuous piecewiese linear function:
>*  max MiB -> min MiB  gradient
> @@ -1102,16 +1103,16 @@ static unsigned long compute_balloon_floor(void)
>*8192   744(1/16)
>*   32768  1512(1/32)
>*/
> - if (totalram_pages < MB2PAGES(128))
> - min_pages = MB2PAGES(8) + (totalram_pages >> 1);
> - else if (totalram_pages < MB2PAGES(512))
> - min_pages = MB2PAGES(40) + (totalram_pages >> 2);
> - else if (totalram_pages < MB2PAGES(2048))
> - min_pages = MB2PAGES(104) + (totalram_pages >> 3);
> - else if (totalram_pages < MB2PAGES(8192))
> - min_pages = MB2PAGES(232) + (totalram_pages >> 4);
> + if (totalram_pgs < MB2PAGES(128))
> + min_pages = MB2PAGES(8) + (totalram_pgs >> 1);
> + else if (totalram_pgs < MB2PAGES(512))
> + min_pages = MB2PAGES(40) + (totalram_pgs >> 2);
> + else if (totalram_pgs < MB2PAGES(2048))
> + min_pages = MB2PAGES(104) + (totalram_pgs >> 3);
> + else if (totalram_pgs < MB2PAGES(8192))
> + min_pages = MB2PAGES(232) + (totalram_pgs >> 4);
>   else
> - min_pages = MB2PAGES(488) + (totalram_pages >> 5);
> + min_pages = MB2PAGES(488) + (totalram_pgs >> 5);
>  #undef MB2PAGES
>   return min_pages;
>  }
> diff --git a/fs/file_table.c b/fs/file_table.c
> index e49af4c..6e3c088 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -380,10 +380,11 @@ void __init files_init(void)
>  void __init files_maxfiles_init(void)
>  {
>   unsigned long n;
> - unsigned long memreserve = (totalram_pages

Re: [PATCH RFC] hist lookups

2018-11-07 Thread Jiri Olsa

On Tue, Nov 06, 2018 at 10:13:49PM -0800, David Miller wrote:
> From: Jiri Olsa 
> Date: Tue, 6 Nov 2018 21:42:55 +0100
> 
> > I pushed that fix in perf/fixes branch, but I'm still occasionaly
> > hitting the namespace crash.. working on it ;-)
> 
> Jiri, how can this new scheme work without setting copy_on_queue
> for the queued_events we use here?

aahh.. it won't, setting it up ;-)

> 
> I don't see copy_on_queue being set and that means the queued event
> structures reference the event memory directly in the mmaps, after the
> mmap thread has released them back to the queue.
> 
> That means new events can come in to the mmap ring and overwrite what
> was there previously, maybe even while deliver_event() is in the
> middle of parsing the event.
> 
> Setting copy_on_queue for data[0] and data[1] makes all of the crashes
> go away for me.
> 
> I get a lot of "[unknown]" shared objects shortly after perf top
> starts up during a full workload.  I've been wondering about one
> side effect of how the mmap queues are processed, consider the
> following:
> 
>   cpu 0   cpu 1
> 
>   exec
>   create new mmap2 events
>   scheduled to cpu 0 for whatever reason
>   sample 1
>   sample 2
> 
> And let's say that perf top is backlogged processing the mmap ring of
> events generated for cpu 0, and sees sample 1 and sample 2 before
> getting to any of cpu 1's events.
> 
> This means the thread and map and symbol objects won't exist and
> we'll get those '[Unknown]' histogram entries, and they won't go
> away.
> 
> When it finally stops looping over the mmap ring for cpu 0's events
> it gets to cpu 1's mmap ring and sees the exec and mmap2 events
> but at that point it's far too late.
> 
> I surmise from what I see with perf top right now that this happens
> a lot.

right, there's no reason why top should have different standards than
record/report.. above can definitely happen, I'll enable time sample
type and use ordered events for the queue

jirka

Re: [PATCH] mmc: core: Remove timeout when enabling cache

2018-11-07 Thread Wolfram Sang


> That also happens to be one of the cards we deploy; However i did
> wonder about adding a quirk but decided against it as it was not clear
> to me from the specification that CACHE ON really is meant to complete
> within GENERIC_CMD6_TIMEOUT. That and i fret about ending up in hit-a-
> mole games as the failure is really quite tedious (boot failure). 

I agree that we should use the more defensive variant as a default. I
mean there should be no performance regression since most cards will
respond just faster, or? The only downside I could see is that we might
miss a real timeout with no bounds set and might get stuck? Maybe it is
worth contacting eMMC spec people to at least know what is the expected
behaviour?



signature.asc
Description: PGP signature

Re: [PATCH v3 0/7] PBLK Bugfixes and cleanups

2018-11-07 Thread Matias Bjørling


On 11/06/2018 02:33 PM, Hans Holmberg wrote:

From: Hans Holmberg 

This series is a slew of bugfixes and cleanups for PBLK, mostly
fixing issues found during corner-case testing in QEMU.

Changes since v1:
Messed up from:, now the patches apply with the correct author
Pardon the mess.

Changes since v2:
Fixed kbuild reported issue and potential divide by zero in:
("lightnvm: pblk: set conservative threshold for user writes")
Fixed commit message nitpicks reported by Sebastien

The patch-set applies on top of:
remote https://github.com/OpenChannelSSD/linux.git
branch for-4.21/core

Hans Holmberg (7):
   lightnvm: pblk: fix resubmission of overwritten write err lbas
   lightnvm: pblk: account for write error sectors in emeta
   lightnvm: pblk: stop writes gracefully when running out of lines
   lightnvm: pblk: set conservative threshold for user writes
   lightnvm: pblk: remove unused macro
   lightnvm: pblk: fix pblk_lines_init error handling path
   lightnvm: pblk: remove dead code in pblk_recov_l2p

  drivers/lightnvm/pblk-init.c | 45 ++
  drivers/lightnvm/pblk-map.c  | 47 ---
  drivers/lightnvm/pblk-recovery.c |  1 -
  drivers/lightnvm/pblk-rl.c   |  5 ++-
  drivers/lightnvm/pblk-write.c| 55 +++-
  drivers/lightnvm/pblk.h  | 16 --
  6 files changed, 116 insertions(+), 53 deletions(-)



Sebastien, would you like me add your Reviewed-by?

Re: [PATCH] pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD

2018-11-07 Thread Michal Simek

On 01. 11. 18 1:57, Nathan Chancellor wrote:
> Clang warns when one enumerated type is implicitly converted to another:
> 
> drivers/pinctrl/pinctrl-zynq.c:985:18: warning: implicit conversion from
> enumeration type 'enum zynq_pin_config_param' to different enumeration
> type 'enum pin_config_param' [-Wenum-conversion]
> {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> ~   ^
> drivers/pinctrl/pinctrl-zynq.c:990:16: warning: implicit conversion from
> enumeration type 'enum zynq_pin_config_param' to different enumeration
> type 'enum pin_config_param' [-Wenum-conversion]
> = { PCONFDUMP(PIN_CONFIG_IOSTANDARD, "IO-standard", NULL, true),
> ~~^
> ./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
> macro 'PCONFDUMP'
> .param = a, .display = b, .format = c, .has_arg = d \
>  ^
> 2 warnings generated.

This is interesting. I have never tried to use llvm for building the
kernel. Do you have any description how this can be done?


> 
> It is expected that pinctrl drivers can extend pin_config_param because
> of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
> isn't an issue. Most drivers that take advantage of this define the
> PIN_CONFIG variables as constants, rather than enumerated values. Do the
> same thing here so that Clang no longer warns.
> 
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/pinctrl/pinctrl-zynq.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/pinctrl/pinctrl-zynq.c b/drivers/pinctrl/pinctrl-zynq.c
> index a0daf27042bd..57046c221756 100644
> --- a/drivers/pinctrl/pinctrl-zynq.c
> +++ b/drivers/pinctrl/pinctrl-zynq.c
> @@ -972,14 +972,11 @@ enum zynq_io_standards {
>  };
>  
>  /**
> - * enum zynq_pin_config_param - possible pin configuration parameters

This is wrong. kernel-doc is reporting issue with it.

drivers/pinctrl/pinctrl-zynq.c:975: warning: Cannot understand  *
@PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the
argument to
 on line 975 - I thought it was a doc line
1 warnings


>   * @PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the 
> argument to
>   *   this parameter (on a custom format) tells the driver which alternative
>   *   IO standard to use.
>   */
> -enum zynq_pin_config_param {
> - PIN_CONFIG_IOSTANDARD = PIN_CONFIG_END + 1,
> -};
> +#define PIN_CONFIG_IOSTANDARD(PIN_CONFIG_END + 1)
>  
>  static const struct pinconf_generic_params zynq_dt_params[] = {
>   {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> 

This change is fine.

Thanks,
Michal

Re: [PATCH] pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD

2018-11-07 Thread Michal Simek

On 07. 11. 18 9:55, Nathan Chancellor wrote:
> On Wed, Nov 07, 2018 at 09:46:12AM +0100, Michal Simek wrote:
>> On 01. 11. 18 1:57, Nathan Chancellor wrote:
>>> Clang warns when one enumerated type is implicitly converted to another:
>>>
>>> drivers/pinctrl/pinctrl-zynq.c:985:18: warning: implicit conversion from
>>> enumeration type 'enum zynq_pin_config_param' to different enumeration
>>> type 'enum pin_config_param' [-Wenum-conversion]
>>> {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
>>> ~   ^
>>> drivers/pinctrl/pinctrl-zynq.c:990:16: warning: implicit conversion from
>>> enumeration type 'enum zynq_pin_config_param' to different enumeration
>>> type 'enum pin_config_param' [-Wenum-conversion]
>>> = { PCONFDUMP(PIN_CONFIG_IOSTANDARD, "IO-standard", NULL, true),
>>> ~~^
>>> ./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
>>> macro 'PCONFDUMP'
>>> .param = a, .display = b, .format = c, .has_arg = d \
>>>  ^
>>> 2 warnings generated.
>>
>> This is interesting. I have never tried to use llvm for building the
>> kernel. Do you have any description how this can be done?
>>
> 
> Depending on what version of Clang you have access to, it is usually just as
> simple as running 'make ARCH=arm CC=clang CROSS_COMPILE=arm-linux-gnueabi-'.
> 
> Clang 7.0+ is recommended but 6.0 might work too.

TBH I would expect to download container and run this there to make sure
that I don't break anything else.

Thanks,
Michal

Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages

2018-11-07 Thread Michal Hocko

On Wed 07-11-18 08:55:26, osalvador wrote:
> On Wed, 2018-11-07 at 08:35 +0100, Michal Hocko wrote:
> > On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > > The check seems to be quite aggressive and in a loop that iterates
> > > pages, but has nothing to do with the page, did you mean to make
> > > the check
> > > 
> > > zone_idx(page_zone(page)) == ZONE_MOVABLE
> > 
> > Does it make any difference? Can we actually encounter a page from a
> > different zone here?
> 
> AFAIK, test_pages_in_a_zone() called from offline_pages() should ensure
> that the range belongs to a unique zone, so we should not encounter
> pages from other zones there, right?

Yes that is the case for memory hotplug. We do assume a single zone at
set_migratetype_isolate where we take the zone->lock. If the
contig_alloc can span multiple zones then it should check for similar.
-- 
Michal Hocko
SUSE Labs

Re: [PATCH 03/24] leds: dt-bindings: Add LED_FUNCTION definitions

2018-11-07 Thread Vokáč Michal

On 6.11.2018 23:07, Jacek Anaszewski wrote:
> Add common LED function definitions for use in Device Tree.
> The function names were extracted from existing dts files
> after eliminating oddities.
> 
> Signed-off-by: Jacek Anaszewski 
> Cc: Baolin Wang 
> Cc: Daniel Mack 
> Cc: Dan Murphy 
> Cc: Linus Walleij 
> Cc: Oleh Kravchenko 
> Cc: Sakari Ailus 
> Cc: Simon Shields 
> Cc: Xiaotong Lu 
> ---
>   include/dt-bindings/leds/functions.h | 101 
> +++
>   1 file changed, 101 insertions(+)
>   create mode 100644 include/dt-bindings/leds/functions.h

Hi Jacek,
Just nit picking. I think the subject should be:

"dt-bindings: leds: ..." to be consistent.

Best regards,
Michal

[PATCH 0/3] x86/cpu: fix some prototype warning

2018-11-07 Thread Yi Wang

This series of patch fix some prototype warning because
of missing include file.

Yi Wang (3):
  x86/cpu: fix prototype warning in cacheinfo.c
  x86/cpu: fix prototype warning in scattered.c
  x86/cpu: fix prototype warning in topology.c

 arch/x86/kernel/cpu/cacheinfo.c | 1 +
 arch/x86/kernel/cpu/scattered.c | 2 ++
 arch/x86/kernel/cpu/topology.c  | 2 ++
 3 files changed, 5 insertions(+)

-- 
1.8.3.1

Re: [PATCH] pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD

2018-11-07 Thread Nathan Chancellor

On Wed, Nov 07, 2018 at 09:46:12AM +0100, Michal Simek wrote:
> On 01. 11. 18 1:57, Nathan Chancellor wrote:
> > Clang warns when one enumerated type is implicitly converted to another:
> > 
> > drivers/pinctrl/pinctrl-zynq.c:985:18: warning: implicit conversion from
> > enumeration type 'enum zynq_pin_config_param' to different enumeration
> > type 'enum pin_config_param' [-Wenum-conversion]
> > {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> > ~   ^
> > drivers/pinctrl/pinctrl-zynq.c:990:16: warning: implicit conversion from
> > enumeration type 'enum zynq_pin_config_param' to different enumeration
> > type 'enum pin_config_param' [-Wenum-conversion]
> > = { PCONFDUMP(PIN_CONFIG_IOSTANDARD, "IO-standard", NULL, true),
> > ~~^
> > ./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
> > macro 'PCONFDUMP'
> > .param = a, .display = b, .format = c, .has_arg = d \
> >  ^
> > 2 warnings generated.
> 
> This is interesting. I have never tried to use llvm for building the
> kernel. Do you have any description how this can be done?
> 

Depending on what version of Clang you have access to, it is usually just as
simple as running 'make ARCH=arm CC=clang CROSS_COMPILE=arm-linux-gnueabi-'.

Clang 7.0+ is recommended but 6.0 might work too.

> 
> > 
> > It is expected that pinctrl drivers can extend pin_config_param because
> > of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
> > isn't an issue. Most drivers that take advantage of this define the
> > PIN_CONFIG variables as constants, rather than enumerated values. Do the
> > same thing here so that Clang no longer warns.
> > 
> > Signed-off-by: Nathan Chancellor 
> > ---
> >  drivers/pinctrl/pinctrl-zynq.c | 5 +
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/drivers/pinctrl/pinctrl-zynq.c b/drivers/pinctrl/pinctrl-zynq.c
> > index a0daf27042bd..57046c221756 100644
> > --- a/drivers/pinctrl/pinctrl-zynq.c
> > +++ b/drivers/pinctrl/pinctrl-zynq.c
> > @@ -972,14 +972,11 @@ enum zynq_io_standards {
> >  };
> >  
> >  /**
> > - * enum zynq_pin_config_param - possible pin configuration parameters
> 
> This is wrong. kernel-doc is reporting issue with it.
> 
> drivers/pinctrl/pinctrl-zynq.c:975: warning: Cannot understand  *
> @PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the
> argument to
>  on line 975 - I thought it was a doc line
> 1 warnings
> 

Ah yes, I forgot to send a v2 of this patch when someone pointed out
this problem in a different patch. I'll send that now, thanks for the
review!

> 
> >   * @PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the 
> > argument to
> >   * this parameter (on a custom format) tells the driver which alternative
> >   * IO standard to use.
> >   */
> > -enum zynq_pin_config_param {
> > -   PIN_CONFIG_IOSTANDARD = PIN_CONFIG_END + 1,
> > -};
> > +#define PIN_CONFIG_IOSTANDARD  (PIN_CONFIG_END + 1)
> >  
> >  static const struct pinconf_generic_params zynq_dt_params[] = {
> > {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> > 
> 
> This change is fine.
> 
> Thanks,
> Michal

Re: [PATCH v2] pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD

2018-11-07 Thread Michal Simek

On 07. 11. 18 9:56, Nathan Chancellor wrote:
> Clang warns when one enumerated type is implicitly converted to another:
> 
> drivers/pinctrl/pinctrl-zynq.c:985:18: warning: implicit conversion from
> enumeration type 'enum zynq_pin_config_param' to different enumeration
> type 'enum pin_config_param' [-Wenum-conversion]
> {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> ~   ^
> drivers/pinctrl/pinctrl-zynq.c:990:16: warning: implicit conversion from
> enumeration type 'enum zynq_pin_config_param' to different enumeration
> type 'enum pin_config_param' [-Wenum-conversion]
> = { PCONFDUMP(PIN_CONFIG_IOSTANDARD, "IO-standard", NULL, true),
> ~~^
> ./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
> macro 'PCONFDUMP'
> .param = a, .display = b, .format = c, .has_arg = d \
>  ^
> 2 warnings generated.
> 
> It is expected that pinctrl drivers can extend pin_config_param because
> of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
> isn't an issue. Most drivers that take advantage of this define the
> PIN_CONFIG variables as constants, rather than enumerated values. Do the
> same thing here so that Clang no longer warns.
> 
> Signed-off-by: Nathan Chancellor 
> ---
> 
> v1 -> v2:
> 
> * Avoid kernel-doc warning
> 
>  drivers/pinctrl/pinctrl-zynq.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pinctrl/pinctrl-zynq.c b/drivers/pinctrl/pinctrl-zynq.c
> index a0daf27042bd..90fd37e8207b 100644
> --- a/drivers/pinctrl/pinctrl-zynq.c
> +++ b/drivers/pinctrl/pinctrl-zynq.c
> @@ -971,15 +971,12 @@ enum zynq_io_standards {
>   zynq_iostd_max
>  };
>  
> -/**
> - * enum zynq_pin_config_param - possible pin configuration parameters
> - * @PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the 
> argument to
> +/*
> + * PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the argument 
> to
>   *   this parameter (on a custom format) tells the driver which alternative
>   *   IO standard to use.
>   */
> -enum zynq_pin_config_param {
> - PIN_CONFIG_IOSTANDARD = PIN_CONFIG_END + 1,
> -};
> +#define PIN_CONFIG_IOSTANDARD(PIN_CONFIG_END + 1)
>  
>  static const struct pinconf_generic_params zynq_dt_params[] = {
>   {"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
> 

Acked-by: Michal Simek 

Thanks,
Michal

Re: [PATCH] arm64: fix commit style in the comments

2018-11-07 Thread Catalin Marinas

On Wed, Nov 07, 2018 at 11:39:11PM +0800, Peng Hao wrote:
> Use git commit description style 'commit <12+ chars of sha1>
> ("")' in the comments.
> 
> Signed-off-by: Peng Hao 
> ---
>  arch/arm64/kernel/sys32.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/sys32.c b/arch/arm64/kernel/sys32.c
> index 0f8bcb7..8317a5a 100644
> --- a/arch/arm64/kernel/sys32.c
> +++ b/arch/arm64/kernel/sys32.c
> @@ -40,8 +40,8 @@
>* arbitrary binaries may rely upon it, so we must do the same.
>* For more details, see commit:
>*
> -  * 713c481519f19df9 ("[ARM] 3108/2: old ABI compat: statfs64 and
> -  * fstatfs64")
> +  * commit 713c481519f19df9 ("[ARM] 3108/2: old ABI compat: statfs64
> +  * and fstatfs64")

Seriously, how does this help? It already states "see commit:" on the
previous line.

-- 
Catalin

Re: [PATCH 2/4] x86/amd_nb: add support for newer PCI topologies

2018-11-07 Thread Borislav Petkov

On Tue, Nov 06, 2018 at 05:20:41PM -0600, Bjorn Helgaas wrote:
> Or maybe even drivers/acpi/thermal.c, which claims every Thermal Zone
> (ACPI 6.2, sec 11), would be sufficient.  I don't know what the
> relationship between hwmon and other thermal stuff, e.g.,
> Documentation/thermal/sysfs-api.txt is.  acpi/thermal.c looks tied
> into the drivers/thermal stuff (it registers "thermal_zone" devices),
> but not to hwmon.

Err, I still don't think I'm catching your drift but let me stop you
right there: amd_nb is not there only for hwmon/k10temp. It is a small
interface glue if you will, which exports the CPU functionality in PCI
config space to other consumers.

So it is not really a driver - it is used by drivers to talk/query CPU
settings through it.

With that said, I don't think I understand all that talk about PNP IDs
and ACPI methods. But maybe I'm missing something...

So what's up?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Re: [PATCH v2 1/4] mm: Fix multiple evaluvations of totalram_pages and managed_pages

2018-11-07 Thread Vlastimil Babka

On 11/7/18 9:20 AM, Michal Hocko wrote:
> On Tue 06-11-18 21:51:47, Arun KS wrote:

Hi,

there's typo in subject: evaluvations -> evaluations.

However, "fix" is also misleading (more below), so I'd suggest something
like:

mm: reference totalram_pages and managed_pages once per function

>> This patch is in preparation to a later patch which converts totalram_pages
>> and zone->managed_pages to atomic variables. This patch does not introduce
>> any functional changes.
> 
> I forgot to comment on this one. The patch makes a lot of sense. But I
> would be little bit more conservative and won't claim "no functional
> changes". As things stand now multiple reads in the same function are
> racy (without holding the lock). I do not see any example of an
> obviously harmful case but claiming the above is too strong of a
> statement. I would simply go with something like "Please note that
> re-reading the value might lead to a different value and as such it
> could lead to unexpected behavior. There are no known bugs as a result
> of the current code but it is better to prevent from them in principle."

However, the new code doesn't use READ_ONCE(), so the compiler is free
to read the value multiple times, and before the patch it was free to
read it just once, as the variables are not volatile. So strictly
speaking this is indeed not a functional change (if compiler decides
differently based on the patch, it's an implementation detail).

So even in my suggested subject above, 'reference' is meant as a source
code reference, not really a memory read reference. Couldn't think of a
better word though.

Re: [PATCH v3] x86/vsmp_64.c: Remove dependency on pv_irq_ops

2018-11-07 Thread Eial Czerwacki

Greetings Ingo,

On 11/05/2018 07:59 PM, Ingo Molnar wrote:
> 
> * Eial Czerwacki  wrote:
> 
>> +#if defined(CONFIG_PCI)
> 
> This is shorter:
> 
>#ifdef CONFIG_PCI
> 
> Thanks,
> 
>   Ingo
> 

you are absolutely right, looks like Thomas have handled it already.

Eial.

Greetings From Mrs.Elodie Antoine,

2018-11-07 Thread Mrs Elodie Antoine

Greetings From Mrs.Elodie Antoine,

May be this letter will definitely come to you as a huge surprise, but 
I implore you to take the time to go through it carefully as the 
decision you make will go off a long way to determine my future and 
continued existence. I am Mrs.Elodie Antoine aging widow of 59 years 
old suffering from long time illness. 

I have some funds I inherited from my late husband Dr. jean baptiste 
antoine,The sum of (US$4.5 Million Dollars) please i want you to 
withdraw this fund and use it for Charity works. I found your email 
address from the internet after honest prayers  to the LORD to bring me 
a good person that can handle this project and i decided to contact 
you, if you may be willing and interested to handle these trust funds 
in good faith before anything happens to me,
Please kindly respond for further details.

Thanks and God bless you,
Mrs.Elodie Antoine

[tip:x86/boot] x86/boot: Simplify the detect_memory*() control flow

2018-11-07 Thread tip-bot for Jordan Borgner

Commit-ID:  e8eeb3c8aab044ee8faf5e0389db8518629a9324
Gitweb: https://git.kernel.org/tip/e8eeb3c8aab044ee8faf5e0389db8518629a9324
Author: Jordan Borgner 
AuthorDate: Fri, 2 Nov 2018 14:56:22 +
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Nov 2018 21:05:01 +0100

x86/boot: Simplify the detect_memory*() control flow

The return values of these functions are not used - so simplify the functions.

No change in functionality.

[ mingo: Simplified the changelog. ]

Suggested: Ingo Molnar 
Signed-off-by: Jordan Borgner 
Cc: Borislav Petkov 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20181102145622.zjx2t3mdu3rv6sgy@JordanDesktop
Signed-off-by: Ingo Molnar 
---
 arch/x86/boot/boot.h   |  2 +-
 arch/x86/boot/memory.c | 31 ++-
 2 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index ef5a9cc66fb8..32a09eb5c101 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -309,7 +309,7 @@ void query_edd(void);
 void __attribute__((noreturn)) die(void);
 
 /* memory.c */
-int detect_memory(void);
+void detect_memory(void);
 
 /* pm.c */
 void __attribute__((noreturn)) go_to_protected_mode(void);
diff --git a/arch/x86/boot/memory.c b/arch/x86/boot/memory.c
index 7df2b28207be..f06c147b5140 100644
--- a/arch/x86/boot/memory.c
+++ b/arch/x86/boot/memory.c
@@ -17,7 +17,7 @@
 
 #define SMAP   0x534d4150  /* ASCII "SMAP" */
 
-static int detect_memory_e820(void)
+static void detect_memory_e820(void)
 {
int count = 0;
struct biosregs ireg, oreg;
@@ -68,10 +68,10 @@ static int detect_memory_e820(void)
count++;
} while (ireg.ebx && count < ARRAY_SIZE(boot_params.e820_table));
 
-   return boot_params.e820_entries = count;
+   boot_params.e820_entries = count;
 }
 
-static int detect_memory_e801(void)
+static void detect_memory_e801(void)
 {
struct biosregs ireg, oreg;
 
@@ -80,7 +80,7 @@ static int detect_memory_e801(void)
intcall(0x15, , );
 
if (oreg.eflags & X86_EFLAGS_CF)
-   return -1;
+   return;
 
/* Do we really need to do this? */
if (oreg.cx || oreg.dx) {
@@ -89,7 +89,7 @@ static int detect_memory_e801(void)
}
 
if (oreg.ax > 15*1024) {
-   return -1;  /* Bogus! */
+   return; /* Bogus! */
} else if (oreg.ax == 15*1024) {
boot_params.alt_mem_k = (oreg.bx << 6) + oreg.ax;
} else {
@@ -102,11 +102,9 @@ static int detect_memory_e801(void)
 */
boot_params.alt_mem_k = oreg.ax;
}
-
-   return 0;
 }
 
-static int detect_memory_88(void)
+static void detect_memory_88(void)
 {
struct biosregs ireg, oreg;
 
@@ -115,22 +113,13 @@ static int detect_memory_88(void)
intcall(0x15, , );
 
boot_params.screen_info.ext_mem_k = oreg.ax;
-
-   return -(oreg.eflags & X86_EFLAGS_CF); /* 0 or -1 */
 }
 
-int detect_memory(void)
+void detect_memory(void)
 {
-   int err = -1;
-
-   if (detect_memory_e820() > 0)
-   err = 0;
-
-   if (!detect_memory_e801())
-   err = 0;
+   detect_memory_e820();
 
-   if (!detect_memory_88())
-   err = 0;
+   detect_memory_e801();
 
-   return err;
+   detect_memory_88();
 }

Re: [PATCH] ubifs: CONFIG_UBIFS_FS_AUTHENTICATION should depend on UBIFS_FS

2018-11-07 Thread Sascha Hauer

On Mon, Nov 05, 2018 at 09:25:40AM +0100, Geert Uytterhoeven wrote:
> Instead of adding yet another dependency on UBIFS_FS, wrap the whole
> block of ubifs config options in a single "if UBIFS_FS".
> 
> Fixes: d8a22773a12c6d78 ("ubifs: Enable authentication support")
> Signed-off-by: Geert Uytterhoeven 
> ---
>  fs/ubifs/Kconfig | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)

I would have sent a similar patch with taking the easier way out of just
adding another "depends on UBIFS_FS". This one is nicer though.

Acked-by: Sascha Hauer 

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

Re: [PATCH] i2c: at91: switched to resume/suspend callbacks.

2018-11-07 Thread Ludovic Desroches

On Mon, Oct 22, 2018 at 12:17:47PM +0200, Andrei Stefanescu - M50506 wrote:
> In the previous version of the driver resume/suspend_noirq callbacks
> were used. Because of this, when resuming from suspend-to-ram,
> an I2C (belonging to a FLEXCOM) would resume before FLEXCOM.
> The first read on the I2C bus would then result in a timeout.
> 
> This patch switches to resume/suspend callbacks which are 
> called after FLEXCOM resumes. FLEXCOM, SPI and USART drivers use
> resume/suspend callbacks.
> 
> Signed-off-by: Andrei Stefanescu 
I can't figure out why we use the _noirq variant. When patches for PM
stuff were sent, suspend/resume callbacks were used but in the latest
version it moved to the _noirq variant without explanation.

Excepting if someone has an argument to keep the _noirq variant,
Acked-by: Ludovic Desroches  

Thanks

Regards

Ludovic

> ---
>  drivers/i2c/busses/i2c-at91.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-at91.c b/drivers/i2c/busses/i2c-at91.c
> index bfd1fdf..81f7b94 100644
> --- a/drivers/i2c/busses/i2c-at91.c
> +++ b/drivers/i2c/busses/i2c-at91.c
> @@ -1174,7 +1174,7 @@ static int at91_twi_runtime_resume(struct device *dev)
>   return clk_prepare_enable(twi_dev->clk);
>  }
>  
> -static int at91_twi_suspend_noirq(struct device *dev)
> +static int at91_twi_suspend(struct device *dev)
>  {
>   if (!pm_runtime_status_suspended(dev))
>   at91_twi_runtime_suspend(dev);
> @@ -1182,7 +1182,7 @@ static int at91_twi_suspend_noirq(struct device *dev)
>   return 0;
>  }
>  
> -static int at91_twi_resume_noirq(struct device *dev)
> +static int at91_twi_resume(struct device *dev)
>  {
>   struct at91_twi_dev *twi_dev = dev_get_drvdata(dev);
>   int ret;
> @@ -1202,8 +1202,8 @@ static int at91_twi_resume_noirq(struct device *dev)
>  }
>  
>  static const struct dev_pm_ops at91_twi_pm = {
> - .suspend_noirq  = at91_twi_suspend_noirq,
> - .resume_noirq   = at91_twi_resume_noirq,
> + .suspend= at91_twi_suspend,
> + .resume = at91_twi_resume,
>   .runtime_suspend= at91_twi_runtime_suspend,
>   .runtime_resume = at91_twi_runtime_resume,
>  };
> -- 
> 2.7.4
>

Re: [PATCH 0/2] pinctrl: sh-pfc: r8a77965: Add VIN4 and VIN5

2018-11-07 Thread Geert Uytterhoeven

Hi Jacopo,

(sorry, seems I prepared a reply, but forgot to press "Send")

On Tue, Nov 6, 2018 at 10:31 AM jacopo mondi  wrote:
> On Tue, Nov 06, 2018 at 10:24:30AM +0100, Geert Uytterhoeven wrote:
> > On Tue, Nov 6, 2018 at 10:08 AM jacopo mondi  wrote:
> > > On Mon, Nov 05, 2018 at 06:19:22PM +0100, Geert Uytterhoeven wrote:
> > > > On Mon, Oct 29, 2018 at 7:14 PM Jacopo Mondi 
> > > >  wrote:
> > > > >this two patches add supports for VIN4 and VIN5 interfaces to 
> > > > > R-Car M3-N.
> > > > >
> > > > > On this SoC (and in the forthcoming support for E3 R8A77990) the VIN 
> > > > > groups
> > > > > could appear on different sets of pins, usually the 'a' and 'b' one.
> > > > >
> > > > > With the existing VIN_DATA_PIN_GROUP macro we have to specify group 
> > > > > names as:
> > > > >
> > > > > VIN_DATA_PIN_GROUP(vin4_data_a, 8)
> > > > >
> > > > > which results in the group being named as "vin4_data_a_8" which is
> > > > > un-consistent with the canonical group names (eg. "vin4_data8_a").
> > > > >
> > > > > This series adds a macro that allows to specify the group 'version' 
> > > > > along with
> > > > > the pin and mux numbers in patch [1/1]. I haven't been able to find a 
> > > > > better
> > > > > term than 'version' as 'group' was already taken. Suggestions welcome.
> > > >
> > > > Yeah, the datasheet also calls these groups :-(
> > > > A possible alternative is to use "variant"?
> > > >
> > > > Or, what about avoiding the name issue by making the 
> > > > VIN_DATA_PIN_GROUP()
> > > > macro varargs, and passing the "variant" as the (optional) third 
> > > > parameter?
> > > > That way existing users work as a before, while you can also write e.g.
> > > >
> > > > VIN_DATA_PIN_GROUP_VER(vin4_data, 8, _a),
> > >
> > > Indeed.
> > >
> > > Would something along the following lines fly for you?
> > >
> > > #define VIN_DATA_PIN_GROUP(n, s, ...)   \
> > > {   \
> > > .name = #n#s#__VA_ARGS__,   \
> > > .pins = n##__VA_ARGS__##_pins.data##s,  \
> > > .mux = n##__VA_ARGS__##_mux.data##s,\
> > > .nr_pins = ARRAY_SIZE(n##__VA_ARGS__##_pins.data##s),   \
> > > }
> > >
> > > It can be used as:
> > > VIN_DATA_PIN_GROUP(vin4_data, 8, _a),
> > > VIN_DATA_PIN_GROUP(vin5_data, 8),
> > >
> > > With your ack on this, I'll send v2.
> >
> > Thank you, that is exactly what I had in mind.
> >
> > > > > As I cannot test VIN4 nor VIN5 on Salvator-XS as the parallel pins 
> > > > > are not
> > > > > wired, I made sure the macro creates correct names and fields not 
> > > > > only by
> > > > > compile testing it, but with a small C program [1] that replicates 
> > > > > the VIN data
> > > > > layout defined in the PFC module and access fields (and has helped me 
> > > > > testing
> > > > > more easily the preprocessor stringification/concatenation process).
> > > > >
> > > > > Final note: Simon, you took the E3 patches in your tree, and I expect 
> > > > > them to
> > > > > land on v4.20-rc1. They use the old macros, are follow up patches ok?)
> > > >
> > > > Which patches are using these macro names, and are in v4.20-rc1?
> > > >
> > > > BTW, "grep vin._data_[a-z][0-9] drivers/pinctrl/sh-pfc/*o" tells me we 
> > > > already
> > > > have broken groups names on r8a7792, r8a7795, and r8a7796.
> > > > Fortunately we have no known users of them, so they can be fixed.
> > > >
> > >
> > > On v4.20-rc1 the grep returns none for me :/
> > > git grep v4.20-rc1 "vin._data_[a-z][0-9]" drivers/pinctrl/sh-pfc/
> >
> > I grepped the .o files, to make sure it would see the final strings, which
> > obviously works in the build tree only ;-)
>
> Ah yes, stupid me.
>
> >
> > For the source tree, please try:
> >
> > git grep -w VIN_DATA_PIN_GROUP.*_[a-z] v4.20-rc1
>
> Argh, there are quite a few of them, but fortunately no users so far.
>
> Is it ok fixing them in v2 of this series with follow-up patches, or
> would you like a single patch that introduces the variadic macro and
> replaces all the occurrences in the per-SoC PFC modules in one go?

Given the r8a7795 and r8a7796 issues were introduced in v4.17:

a5c2949ff7bd9e04 ("pinctrl: sh-pfc: r8a7796: Deduplicate VIN4 pin
definitions")
9942a5b52990b8d5 ("pinctrl: sh-pfc: r8a7795: Deduplicate VIN4 pin
definitions")

while the r8a7792 issue date back to v4.9:

7dd74bb1f058786e ("pinctrl: sh-pfc: r8a7792: Add VIN pin groups")

I think separate patches are easier for backporting.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH v2 2/4] mm: Convert zone->managed_pages to atomic variable

2018-11-07 Thread Vlastimil Babka

On 11/6/18 5:21 PM, Arun KS wrote:
> totalram_pages, zone->managed_pages and totalhigh_pages updates
> are protected by managed_page_count_lock, but readers never care
> about it. Convert these variables to atomic to avoid readers
> potentially seeing a store tear.
> 
> This patch converts zone->managed_pages. Subsequent patches will
> convert totalram_panges, totalhigh_pages and eventually
> managed_page_count_lock will be removed.
> 
> Suggested-by: Michal Hocko 
> Suggested-by: Vlastimil Babka 
> Signed-off-by: Arun KS 
> Reviewed-by: Konstantin Khlebnikov 
> Acked-by: Michal Hocko 

Acked-by: Vlastimil Babka

[PATCH v2] pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD

2018-11-07 Thread Nathan Chancellor

Clang warns when one enumerated type is implicitly converted to another:

drivers/pinctrl/pinctrl-zynq.c:985:18: warning: implicit conversion from
enumeration type 'enum zynq_pin_config_param' to different enumeration
type 'enum pin_config_param' [-Wenum-conversion]
{"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
~   ^
drivers/pinctrl/pinctrl-zynq.c:990:16: warning: implicit conversion from
enumeration type 'enum zynq_pin_config_param' to different enumeration
type 'enum pin_config_param' [-Wenum-conversion]
= { PCONFDUMP(PIN_CONFIG_IOSTANDARD, "IO-standard", NULL, true),
~~^
./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
macro 'PCONFDUMP'
.param = a, .display = b, .format = c, .has_arg = d \
 ^
2 warnings generated.

It is expected that pinctrl drivers can extend pin_config_param because
of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
isn't an issue. Most drivers that take advantage of this define the
PIN_CONFIG variables as constants, rather than enumerated values. Do the
same thing here so that Clang no longer warns.

Signed-off-by: Nathan Chancellor 
---

v1 -> v2:

* Avoid kernel-doc warning

 drivers/pinctrl/pinctrl-zynq.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/pinctrl/pinctrl-zynq.c b/drivers/pinctrl/pinctrl-zynq.c
index a0daf27042bd..90fd37e8207b 100644
--- a/drivers/pinctrl/pinctrl-zynq.c
+++ b/drivers/pinctrl/pinctrl-zynq.c
@@ -971,15 +971,12 @@ enum zynq_io_standards {
zynq_iostd_max
 };
 
-/**
- * enum zynq_pin_config_param - possible pin configuration parameters
- * @PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the argument 
to
+/*
+ * PIN_CONFIG_IOSTANDARD: if the pin can select an IO standard, the argument to
  * this parameter (on a custom format) tells the driver which alternative
  * IO standard to use.
  */
-enum zynq_pin_config_param {
-   PIN_CONFIG_IOSTANDARD = PIN_CONFIG_END + 1,
-};
+#define PIN_CONFIG_IOSTANDARD  (PIN_CONFIG_END + 1)
 
 static const struct pinconf_generic_params zynq_dt_params[] = {
{"io-standard", PIN_CONFIG_IOSTANDARD, zynq_iostd_lvcmos18},
-- 
2.19.1

[PATCH 3/3] x86/cpu: fix prototype warning in topology.c

2018-11-07 Thread Yi Wang

Missing include file causes warning:
arch/x86/kernel/cpu/topology.c:25:5: warning: no previous prototype for 
‘detect_extended_topology_early’ [-Wmissing-prototypes]
arch/x86/kernel/cpu/topology.c:57:5: warning: no previous prototype for 
‘detect_extended_topology’ [-Wmissing-prototypes]

Signed-off-by: Yi Wang 
---
 arch/x86/kernel/cpu/topology.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 71ca064..8f6c784 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 
+#include "cpu.h"
+
 /* leaf 0xb SMT level */
 #define SMT_LEVEL  0
 
-- 
1.8.3.1

[PATCH 1/3] x86/cpu: fix prototype warning in cacheinfo.c

2018-11-07 Thread Yi Wang

Missing include file causes warning:
arch/x86/kernel/cpu/cacheinfo.c:647:6: warning: no previous prototype for 
‘cacheinfo_amd_init_llc_id’ [-Wmissing-prototypes]
arch/x86/kernel/cpu/cacheinfo.c:686:6: warning: no previous prototype for 
‘cacheinfo_hygon_init_llc_id’ [-Wmissing-prototypes]

Signed-off-by: Yi Wang 
---
 arch/x86/kernel/cpu/cacheinfo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index dc1b934..5bafd93 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu.h"
 
-- 
1.8.3.1

[PATCH 2/3] x86/cpu: fix prototype warning in scattered.c

2018-11-07 Thread Yi Wang

Missing include file causes warning:
arch/x86/kernel/cpu/scattered.c:37:6: warning: no previous prototype for 
‘init_scattered_cpuid_features’ [-Wmissing-prototypes]
arch/x86/kernel/cpu/scattered.c:60:5: warning: no previous prototype for 
‘get_scattered_cpuid_leaf’ [-Wmissing-prototypes]

Signed-off-by: Yi Wang 
---
 arch/x86/kernel/cpu/scattered.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 772c219..5b6866f 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -9,6 +9,8 @@
 
 #include 
 
+#include "cpu.h"
+
 struct cpuid_bit {
u16 feature;
u8 reg;
-- 
1.8.3.1

Re: [RFC/RFT][PATCH v3] cpuidle: New timer events oriented governor for tickless systems

2018-11-07 Thread Peter Zijlstra

On Wed, Nov 07, 2018 at 12:39:31AM +0100, Rafael J. Wysocki wrote:
> On Tue, Nov 6, 2018 at 8:51 PM Peter Zijlstra  wrote:
> >
> > On Tue, Nov 06, 2018 at 07:19:24PM +0100, Rafael J. Wysocki wrote:
> > > On Tue, Nov 6, 2018 at 6:04 PM Peter Zijlstra  
> > > wrote:
> >
> > > > Instead of this detector; why haven't you used the code from
> > > > kernel/irq/timings.c ?
> > >
> > > Because it doesn't help much AFAICS.
> > >
> > > Wakeups need not be interrupts in particular
> >
> > You're alluding to the MWAIT wakeup through the MONITOR address ?
> 
> Yes.

Right, those will not be accounted for and will need something else.

> > > and interrupt patterns that show up when the CPU is busy may not be
> > > relevant for when it is idle.
> >
> > I think that is not always true; consider things like the periodic
> > interrupt from frame rendering or audio; if there is nothing more going
> > on in the system than say playing your favourite tune, it gets the
> > 'need more data soon' interrupt from the audio card, wakes up, does a little
> > mp3/flac/ogg/whatever decode to fill up the buffer and goes back to
> > sleep. Same for video playback I assume, the vsync interrupt for buffer
> > flips is fairly predictable.
> >
> > The interrupt predictor we have in kernel/irq/timings.c should be very
> > accurate in predicting those interrupts.
> 
> In the above case the interrupts should produce a detectable pattern
> of wakeups anyway.

Ah, not so. Suppose you have both the audio and video interrupt going at
a steady rate but different rate, then the combined pattern isn't
trivial at all.

> In general, however, I need to be convinced that interrupts that
> didn't wake up the CPU from idle are relevant for next wakeup
> prediction.  I see that this may be the case, but to what extent is
> rather unclear to me and it looks like calling
> irq_timings_next_event() would add considerable overhead.

How about we add a (debug) knob so that people can play with it for now?
If it turns out to be useful, we'll learn.

Re: [PATCH v3] driver-staging: vsoc.c: Add sysfs support for examining the permissions of regions.

2018-11-07 Thread Greg Kroah-Hartman

On Wed, Nov 07, 2018 at 10:30:43AM +0800, Jerry Lin wrote:
> Add a attribute called permissions under vsoc device node for examining
> current granted permissions in vsoc_device.
> 
> This file will display permissions in following format:
>   begin_offset  end_offset  owner_offset  owned_value
> %x  %x%x   %x
> 
> Signed-off-by: Jerry Lin 
> ---
>  drivers/staging/android/vsoc.c | 48 
> +++---
>  1 file changed, 45 insertions(+), 3 deletions(-)

What changed from v2?  And where was v2?  What about v1?

You need a change log here of what you did different from the previous
patches.

And why ignore my response saying that this type of sysfs file is not ok
at all?

thanks,

greg k-h

Re: [PATCH 1/2] pci: prevent sk hynix nvme from entering D3

2018-11-07 Thread Christoph Hellwig

On Tue, Nov 06, 2018 at 06:11:31PM +0800, AceLan Kao wrote:
> Agree, this is not a good fix for Sk hynix nvme, so Dell is still pushing
> Sk hynix to fix it from firmware.
> But before the firmware is ready, this is still a issue that need to be fixed 
> in
> kernel side, and the new firmware may not be applied on the old nvme
> modules.
> This list won't keep growing, and we'll keep an eye on the new firmware and
> co-work with engineers from Sky hynix to make sure this issue won't happen
> again.

We generally quirk for grave issues that make devices unusable.

To me working around a D3 mode that consumers more power than D0
does not quite fit that bill.

Re: [PATCH v1 0/4]mm: convert totalram_pages, totalhigh_pages and managed pages to atomic

2018-11-07 Thread Vlastimil Babka

On 11/7/18 8:02 AM, Konstantin Khlebnikov wrote:
> On 06.11.2018 11:43, Arun KS wrote:
>> On 2018-11-06 14:07, Konstantin Khlebnikov wrote:
>>> On 06.11.2018 11:30, Arun KS wrote:
 On 2018-11-06 13:47, Konstantin Khlebnikov wrote:
> On 06.11.2018 8:38, Arun KS wrote:
>> Any comments?
>
> Looks good.
> Except unclear motivation behind this change.
> This should be in comment of one of patch.

 totalram_pages, zone->managed_pages and totalhigh_pages are sometimes 
 modified outside managed_page_count_lock. Hence convert these 
 variable to atomic to avoid readers potentially seeing a store tear.
>>>
>>> So, this is just theoretical issue or splat from sanitizer.
>>> After boot memory online\offline are strictly serialized by rw-semaphore.
>>
>> Few instances which can race with hot add. Please see below,
>> https://patchwork.kernel.org/patch/10627521/
> Could you point what exactly are you fixing with this set?
> 
> from v2:
> 
>  > totalram_pages, zone->managed_pages and totalhigh_pages updates
>  > are protected by managed_page_count_lock, but readers never care
>  > about it. Convert these variables to atomic to avoid readers
>  > potentially seeing a store tear.
> 
> This?
> 
> 
> Aligned unsigned long almost always stored at once.

The point is "almost always", so better not rely on it :) But the main
motivation was that managed_page_count_lock handling was complicating
Arun's "memory_hotplug: Free pages as higher order" patch and it seemed
a better idea to just remove and convert this to atomics, with
preventing potential store-to-read tearing as a bonus.

It would be nice to mention it in the changelogs though.

> To make it completely correct you could replace
> 
> a += b;
> 
> with
> 
> WRITE_ONCE(a, a + b);

Wouldn't be enough to get rid of the locks.

[PATCH] ARM: dts: imx6sx: specify proper clock for nodes with dummy clock

2018-11-07 Thread Anson Huang

>From i.MX6SX reference manual CCM chapter, KPP and
WDOGn use IPG clock as their clock, specify IPG
clock for KPP and WDOGn instead of DUMMY clock.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sx.dtsi | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/imx6sx.dtsi b/arch/arm/boot/dts/imx6sx.dtsi
index 84b7687..07ea7d9 100644
--- a/arch/arm/boot/dts/imx6sx.dtsi
+++ b/arch/arm/boot/dts/imx6sx.dtsi
@@ -558,7 +558,7 @@
compatible = "fsl,imx6sx-kpp", "fsl,imx21-kpp";
reg = <0x020b8000 0x4000>;
interrupts = ;
-   clocks = < IMX6SX_CLK_DUMMY>;
+   clocks = < IMX6SX_CLK_IPG>;
status = "disabled";
};
 
@@ -566,14 +566,14 @@
compatible = "fsl,imx6sx-wdt", "fsl,imx21-wdt";
reg = <0x020bc000 0x4000>;
interrupts = ;
-   clocks = < IMX6SX_CLK_DUMMY>;
+   clocks = < IMX6SX_CLK_IPG>;
};
 
wdog2: wdog@20c {
compatible = "fsl,imx6sx-wdt", "fsl,imx21-wdt";
reg = <0x020c 0x4000>;
interrupts = ;
-   clocks = < IMX6SX_CLK_DUMMY>;
+   clocks = < IMX6SX_CLK_IPG>;
status = "disabled";
};
 
@@ -1270,7 +1270,7 @@
compatible = "fsl,imx6sx-wdt", "fsl,imx21-wdt";
reg = <0x02288000 0x4000>;
interrupts = ;
-   clocks = < IMX6SX_CLK_DUMMY>;
+   clocks = < IMX6SX_CLK_IPG>;
status = "disabled";
};
 
-- 
2.7.4

Re: [PATCH v2 3/4] mm: convert totalram_pages and totalhigh_pages variables to atomic

2018-11-07 Thread Vlastimil Babka

On 11/6/18 5:21 PM, Arun KS wrote:
> totalram_pages and totalhigh_pages are made static inline function.
> 
> Suggested-by: Michal Hocko 
> Suggested-by: Vlastimil Babka 
> Signed-off-by: Arun KS 
> Reviewed-by: Konstantin Khlebnikov 
> Acked-by: Michal Hocko 

Acked-by: Vlastimil Babka 

One bug (probably) below:

> diff --git a/mm/highmem.c b/mm/highmem.c
> index 59db322..02a9a4b 100644
> --- a/mm/highmem.c
> +++ b/mm/highmem.c
> @@ -105,9 +105,7 @@ static inline wait_queue_head_t 
> *get_pkmap_wait_queue_head(unsigned int color)
>  }
>  #endif
>  
> -unsigned long totalhigh_pages __read_mostly;
> -EXPORT_SYMBOL(totalhigh_pages);

I think you still need to export _totalhigh_pages so that modules can
use the inline accessors.

> -
> +atomic_long_t _totalhigh_pages __read_mostly;
>  
>  EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx);
>

Re: [PATCH] jffs2: implement mount option to configure endianness

2018-11-07 Thread David Woodhouse

On Tue, 2018-11-06 at 13:49 -0800, Nikunj Kela wrote:
> This patch allows the endianness of the JFSS2 filesystem to be
> specified by mount option 'force_endian=big|little|native'. If
> endianness is not specified, it defaults to 'native' endianness
> thus retaining the existing behavior.
> 
> Some architectures benefit from having a single known endianness
> of JFFS2 filesystem (for data, not executables) independent of the
> endianness of the processor (ARM processors can be switched to either
> endianness at run-time).
> 
> We have boards that are shipped with BE jffs2 and BE kernel. We are
> now migrating to LE kernel. This mount option helps us in mounting
> BE jffs2 on LE kernel.

Thanks for implementing this. JFFS2 has often been very sensitive to
performance at mount time — I'd love to see how this affects the time
taken to mount a large file system. If it's significant, we may want a
config option to make this conditional?

Also, perhaps we should improve the behaviour when the user
accidentally tries to mount a wrong-endian file system. If we can't
just make it autodetect and use the "correct" endianness, perhaps we
should at least printk a message suggesting that the user try again
with the other endianness?

smime.p7s
Description: S/MIME cryptographic signature

[PATCH] utimensat: AT_EMPTY_PATH support

2018-11-07 Thread Miklos Szeredi

This makes it possible to use utimensat on an O_PATH file (including
symlinks).

It supersedes the nonstandard utimensat(fd, NULL, ...) form.

Signed-off-by: Miklos Szeredi 
---
 fs/utimes.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/utimes.c b/fs/utimes.c
index bdcf2daf39c1..f9c7ebad19d7 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -97,13 +97,13 @@ long do_utimes(int dfd, const char __user *filename, struct 
timespec64 *times,
goto out;
}
 
-   if (flags & ~AT_SYMLINK_NOFOLLOW)
+   if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH))
goto out;
 
if (filename == NULL && dfd != AT_FDCWD) {
struct fd f;
 
-   if (flags & AT_SYMLINK_NOFOLLOW)
+   if (flags)
goto out;
 
f = fdget(dfd);
@@ -119,6 +119,8 @@ long do_utimes(int dfd, const char __user *filename, struct 
timespec64 *times,
 
if (!(flags & AT_SYMLINK_NOFOLLOW))
lookup_flags |= LOOKUP_FOLLOW;
+   if (flags & AT_EMPTY_PATH)
+   lookup_flags |= LOOKUP_EMPTY;
 retry:
error = user_path_at(dfd, filename, lookup_flags, );
if (error)
-- 
2.14.3

Re: [PATCH] jffs2: implement mount option to configure endianness

2018-11-07 Thread Joakim Tjernlund

On Wed, 2018-11-07 at 10:05 +0100, David Woodhouse wrote:
> 
> On Tue, 2018-11-06 at 13:49 -0800, Nikunj Kela wrote:
> > This patch allows the endianness of the JFSS2 filesystem to be
> > specified by mount option 'force_endian=big|little|native'. If
> > endianness is not specified, it defaults to 'native' endianness
> > thus retaining the existing behavior.
> > 
> > Some architectures benefit from having a single known endianness
> > of JFFS2 filesystem (for data, not executables) independent of the
> > endianness of the processor (ARM processors can be switched to either
> > endianness at run-time).
> > 
> > We have boards that are shipped with BE jffs2 and BE kernel. We are
> > now migrating to LE kernel. This mount option helps us in mounting
> > BE jffs2 on LE kernel.
> 
> Thanks for implementing this. JFFS2 has often been very sensitive to
> performance at mount time — I'd love to see how this affects the time
> taken to mount a large file system. If it's significant, we may want a
> config option to make this conditional?

Yes, this may slow things down. I am not sure I agree with the impl. either.
Could one not make cpu_to_je_X/jeX_to_cpu a function ptr which is set to
a func. with the correct endian?

That way it should also be easy to have a config option for endian: 
Native/BE/LE or Dynamic

 Jocke

Re: [PATCH v2 1/4] dt-bindings: pinctrl: Add devicetree bindings for MT6797 SoC Pinctrl

2018-11-07 Thread Sean Wang

On Wed, Oct 31, 2018 at 11:28 PM Manivannan Sadhasivam
 wrote:
>
> Add devicetree bindings for Mediatek MT6797 SoC Pin Controller.
>
> Signed-off-by: Manivannan Sadhasivam 
> ---
>  .../bindings/pinctrl/pinctrl-mt6797.txt   |   74 +
>  include/dt-bindings/pinctrl/mt6797-pinfunc.h  | 1368 +
>  2 files changed, 1442 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/pinctrl/pinctrl-mt6797.txt
>  create mode 100644 include/dt-bindings/pinctrl/mt6797-pinfunc.h
>
> diff --git a/Documentation/devicetree/bindings/pinctrl/pinctrl-mt6797.txt 
> b/Documentation/devicetree/bindings/pinctrl/pinctrl-mt6797.txt
> new file mode 100644
> index ..b9e2e2fe138f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pinctrl/pinctrl-mt6797.txt
> @@ -0,0 +1,74 @@
> +* Mediatek MT6797 Pin Controller

MediaTek

> +
> +The Mediatek's MT6797 Pin controller is used to control SoC pins.

MediaTek's

> +
> +Required properties:
> +- compatible: Value should be one of the following.
> +  "mediatek,mt6797-pinctrl", compatible with mt6797 pinctrl.
> +- reg:Should contain address and size for gpio, iocfgl, iocfgb,
> +  iocfgr and iocfgt register bases.
> +- reg-names:  An array of strings describing the "reg" entries. Must
> +  contain "gpio", "iocfgl", "iocfgb", "iocfgr", "iocfgt".
> +- gpio-controller: Marks the device node as a gpio controller.
> +- #gpio-cells: Should be two. The first cell is the gpio pin number
> +   and the second cell is used for optional parameters.
> +
> +Optional properties:
> +- interrupt-controller: Marks the device node as an interrupt controller.
> +- #interrupt-cells: Should be two.
> +- interrupts : The interrupt outputs from the controller.
> +
> +Please refer to pinctrl-bindings.txt in this directory for details of the
> +common pinctrl bindings used by client devices.
> +
> +Subnode format
> +A pinctrl node should contain at least one subnodes representing the
> +pinctrl groups available on the machine. Each subnode will list the
> +pins it needs, and how they should be configured, with regard to muxer
> +configuration, pullups, drive strength, input enable/disable and input 
> schmitt.
> +
> +node {
> +pinmux = ;
> +GENERIC_PINCONFIG;
> +};
> +
> +Required properties:
> +- pinmux: Integer array, represents gpio pin number and mux setting.
> +  Supported pin number and mux varies for different SoCs, and are
> +  defined as macros in dt-bindings/pinctrl/-pinfunc.h directly.
> +
> +Optional properties:

Could you help to update the section as the below ? it would help
describe the optional properties fully about the device.  Apart from
that, other things look good to me.

Optional properies:
GENERIC_PINCONFIG: is the generic pinconfig options to use,
 bias-disable, bias-pull, bias-pull-down, input-enable,
 input-schmitt-enable, input-schmitt-disable, output-enable
 output-low, output-high, drive-strength, slew-rate are valid.

 Valid arguments for 'slew-rate' are '0' for no slew rate controlled and '1' for
 slower slew rate respectively.
 Valid arguments for 'drive-strength' is limited, such as 4, 8, 12, or
16 in mA for a DRV_GRP0 pin.

Some optional vendor properties as defined are valid to specify in a
pinconf subnode:
 - mediatek,tdsel: An integer describing the steps for output level shifter duty
   cycle when asserted (high pulse width adjustment). Valid arguments are from 0
   to 15.
 - mediatek,rdsel: An integer describing the steps for input level
shifter duty cycle when asserted (high pulse width adjustment). Valid
arguments are from 0
   to 63.
-  mediatek,pull-up-adv: An integer describing the code R1R0 as 0, 1,
2 or 3 for the advanced pull-up resistors.
-  mediatek,pull-down-adv: An integer describing the code R1R0 as 0,
1, 2, or 3 for the advanced pull-up resistors.

> +- GENERIC_PINCONFIG: is the generic pinconfig options to use, bias-disable,
> +bias-pull-down, bias-pull-up, input-enable, input-disable, output-low, 
> output-high,
> +input-schmitt-enable, input-schmitt-disable and drive-strength are valid.
> +
> +Some special pins have extra pull up strength, there are R0 and R1 
> pull-up
> +resistors available, but for user, it's only need to set R1R0 as 00, 01, 
> 10 or 11.
> +So when config bias-pull-up, it support arguments for those special pins.
> +Some macros have been defined for this usage, such as 
> MTK_PUPD_SET_R1R0_00.
> +See dt-bindings/pinctrl/mt65xx.h.
> +
> +When config drive-strength, it can support some arguments, such as
> +MTK_DRIVE_4mA, MTK_DRIVE_6mA, etc. See dt-bindings/pinctrl/mt65xx.h.
> +

for the above stuff, I think we can remove them freely

> +Examples:
> +
> +pio: pinctrl@10005000 {
> +compatible = "mediatek,mt6797-pinctrl";
> +reg = <0 0x10005000 0 0x1000>,
> +  <0 0x10002000 0 0x400>,
> +

Re: RFC: staging: gasket: re-implement using UIO

2018-11-07 Thread Greg KH

On Tue, Nov 06, 2018 at 04:20:49PM -0800, Todd Poynor wrote:
> On Mon, Sep 10, 2018 at 8:28 AM Ahmed S. Darwish  wrote:
> >
> > The gasket in-kernel framework, recently introduced under staging,
> > re-implements what is already long-time provided by the UIO
> > subsystem, with extra PCI BAR remapping and MSI conveniences.
> >
> > Before moving it out of staging, make sure we add the new bits to
> > the UIO framework instead, then transform its signle client, the
> > Apex driver, to a proper UIO driver (uio_driver.h).
> >
> > Link: https://lkml.kernel.org/r/20180828103817.GB1397@do-kernel
> 
> So I'm looking at this for reals now.  The BAR mapping stuff is
> straightforward with the existing framework.  Everything else could be
> done outside of UIO via the existing device interface, but figured I'd
> collect any opinions about adding the new bits to UIO.
> 
> The Apex device has 13 MSIX interrupts.  UIO does one IRQ per device.
> The PRUSS driver registers 8 instances of the UIO device with
> identical memory mappings but individual IRQs for its 8 interrupts.
> Currently gasket has userspace pass down an eventfd (via ioctl) for
> each interrupt it wants to watch.  Is there interest in modifying UIO
> to handle multiple IRQs in some perhaps similar fashion?
> 
> Speaking of ioctls, are those allowed here, or is sysfs or something
> else always required?  The aforementioned multiple IRQ stuff probably
> maps nicely to sysfs (there's a small number of them easily
> represented as attributes), while DMA buffer mappings seem more
> problematic, but maybe somebody's thought of a good way to represent
> these already.

Adding ioctls opens up a huge can of worms where each driver would
probably want to do their own mess and then we have to constantly audit
the mess all of the time.  What do you really need to do in an ioctl?

> And then we need to map buffers to our device.  We could probably
> implement this via an IOMMU driver API for our custom MMU and hook
> that up to generic IOMMU support for UIO, which sounds like something
> a lot of drivers could use.

You need to map buffers from what to what?  UIO is for pass-through
memory that your device exposes to userspace through mmap, what more do
you need that the existing api does not give you?

> There's a few other tidbits the driver does, including allocating
> coherent memory for userspace to share with the device, but that's
> probably enough for now.
> 
> If anybody wants to squash any of the above as a non-starter for UIO
> or point things in a different direction, it's appreciated.

Patches are always the best way to do review, we generally do not have
time or bandwidth to do "what do you think of my design" ideas without
real code behind it, sorry.  That way we know you really did try to do
something and you have something that starts to work for you.

thanks,

greg k-h

Re: [PATCH 4/6] drivers: base: Introducing software nodes to the firmware node framework

2018-11-07 Thread Heikki Krogerus

On Wed, Nov 07, 2018 at 07:39:33AM +0300, Dan Carpenter wrote:
> Hi Heikki,
> 
> url:
> https://github.com/0day-ci/linux/commits/Heikki-Krogerus/device-property-Introducing-software-nodes/20181106-031310
> 
> smatch warnings:
> drivers/base/swnode.c:391 fwnode_create_software_node() error: dereferencing 
> freed memory 'swnode'
> 
> # 
> https://github.com/0day-ci/linux/commit/a8c9678ea46a0171baed68e4ec355a9b3f967458
> git remote add linux-review https://github.com/0day-ci/linux
> git remote update linux-review
> git checkout a8c9678ea46a0171baed68e4ec355a9b3f967458
> vim +/swnode +391 drivers/base/swnode.c
> 
> a8c9678e Heikki Krogerus 2018-11-05  365  
> a8c9678e Heikki Krogerus 2018-11-05  366  struct fwnode_handle *
> a8c9678e Heikki Krogerus 2018-11-05  367  fwnode_create_software_node(const 
> struct property_entry *properties,
> a8c9678e Heikki Krogerus 2018-11-05  368  const 
> struct fwnode_handle *parent)
> a8c9678e Heikki Krogerus 2018-11-05  369  {
> a8c9678e Heikki Krogerus 2018-11-05  370  struct software_node *p = NULL;
> a8c9678e Heikki Krogerus 2018-11-05  371  struct software_node *swnode;
> a8c9678e Heikki Krogerus 2018-11-05  372  char node_name[20];
> a8c9678e Heikki Krogerus 2018-11-05  373  int ret;
> a8c9678e Heikki Krogerus 2018-11-05  374  
> a8c9678e Heikki Krogerus 2018-11-05  375  if (parent) {
> a8c9678e Heikki Krogerus 2018-11-05  376  if (IS_ERR(parent))
> a8c9678e Heikki Krogerus 2018-11-05  377  return 
> ERR_CAST(parent);
> a8c9678e Heikki Krogerus 2018-11-05  378  if 
> (!is_software_node(parent))
> a8c9678e Heikki Krogerus 2018-11-05  379  return 
> ERR_PTR(-EINVAL);
> a8c9678e Heikki Krogerus 2018-11-05  380  p = 
> to_software_node(parent);
> a8c9678e Heikki Krogerus 2018-11-05  381  }
> a8c9678e Heikki Krogerus 2018-11-05  382  
> a8c9678e Heikki Krogerus 2018-11-05  383  swnode = 
> kzalloc(sizeof(*swnode), GFP_KERNEL);
> a8c9678e Heikki Krogerus 2018-11-05  384  if (!swnode)
> a8c9678e Heikki Krogerus 2018-11-05  385  return ERR_PTR(-ENOMEM);
> a8c9678e Heikki Krogerus 2018-11-05  386  
> a8c9678e Heikki Krogerus 2018-11-05  387  swnode->id = ida_simple_get(p ? 
> >child_ids : _root_ids, 0, 0,
> a8c9678e Heikki Krogerus 2018-11-05  388  
> GFP_KERNEL);
> a8c9678e Heikki Krogerus 2018-11-05  389  if (swnode->id < 0) {
> a8c9678e Heikki Krogerus 2018-11-05  390  kfree(swnode);
>   ^^
> a8c9678e Heikki Krogerus 2018-11-05 @391  return 
> ERR_PTR(swnode->id);
>
> ^^
> a8c9678e Heikki Krogerus 2018-11-05  392  }

Thanks!

-- 
heikki

Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page

2018-11-07 Thread Paolo Bonzini

On 07/11/2018 13:10, Alexander Potapenko wrote:
> This appears to be a real bug in KVM.
> Please see a simplified reproducer attached.

Thanks, I agree it's a reael bug.  The basic issue is that the
kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE
ioctl, aka 0x4080aebf.

One way to fix it would be to just change kmalloc to kzalloc when
allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl
is wrong and should be rejected.  And the case where a shadow VMCS has
to be loaded is even more wrong, and we have to fix it anyway, so I
don't really like the idea of papering over the bug in the allocation.

I'll test this patch and submit it formally:

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c645f777b425..c546f0b1f3e0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct
kvm_vcpu *vcpu,
if (ret)
return ret;

-   /* Empty 'VMXON' state is permitted */
-   if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
+   /* Empty 'VMXON' state is permitted.  A partial VMCS12 is not.  */
+   if (kvm_state->size == sizeof(kvm_state))
return 0;

+   if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE)
+   return -EINVAL;
+
if (kvm_state->vmx.vmcs_pa != -1ull) {
if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
!page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
@@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
*vcpu,
}

vmcs12 = get_vmcs12(vcpu);
+   BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE);
if (copy_from_user(vmcs12, user_kvm_nested_state->data, 
sizeof(*vmcs12)))
return -EFAULT;

@@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
*vcpu,
if (nested_cpu_has_shadow_vmcs(vmcs12) &&
vmcs12->vmcs_link_pointer != -1ull) {
struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
-   if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
+   if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE)
return -EINVAL;

if (copy_from_user(shadow_vmcs12,

Paolo

Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages

2018-11-07 Thread Michal Hocko

On Wed 07-11-18 23:53:24, Balbir Singh wrote:
> On Wed, Nov 07, 2018 at 08:35:48AM +0100, Michal Hocko wrote:
> > On Wed 07-11-18 07:35:18, Balbir Singh wrote:
[...]
> > > The check seems to be quite aggressive and in a loop that iterates
> > > pages, but has nothing to do with the page, did you mean to make
> > > the check
> > > 
> > > zone_idx(page_zone(page)) == ZONE_MOVABLE
> > 
> > Does it make any difference? Can we actually encounter a page from a
> > different zone here?
> > 
> 
> Just to avoid page state related issues, do we want to go ahead
> with the migration if zone_idx(page_zone(page)) != ZONE_MOVABLE.

Could you be more specific what kind of state related issues you have in
mind?

> > > it also skips all checks for pinned pages and other checks
> > 
> > Yes, this is intentional and the comment tries to explain why. I wish we
> > could be add a more specific checks for movable pages - e.g. detect long
> > term pins that would prevent migration - but we do not have any facility
> > for that. Please note that the worst case of a false positive is a
> > repeated migration failure and user has a way to break out of migration
> > by a signal.
> >
> 
> Basically isolate_pages() will fail as opposed to hotplug failing upfront.
> The basic assertion this patch makes is that all ZONE_MOVABLE pages that
> are not reserved are hotpluggable.

Yes, that is correct.

-- 
Michal Hocko
SUSE Labs

Re: [RCF PATCH,v2,2/2] pwm: imx: Configure output to GPIO in disabled state

2018-11-07 Thread Vokáč Michal

Hi Uwe,

On 7.11.2018 10:33, Uwe Kleine-König wrote:
> Hello Michal,
> 
> just to state it more explicitly, I think the following patch (not even
> compile tested) is much preferable over your approach:

Interesting idea. I just wonder why nobody else did not come up with such
a simple solution before.

> diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
> index 1d5242c9cde0..af88644b5efb 100644
> --- a/drivers/pwm/pwm-imx.c
> +++ b/drivers/pwm/pwm-imx.c
> @@ -216,7 +216,14 @@ static int imx_pwm_apply_v2(struct pwm_chip *chip, 
> struct pwm_device *pwm,
>   cr |= MX3_PWMCR_POUTC;
>   
>   writel(cr, imx->mmio_base + MX3_PWMCR);
> - } else if (cstate.enabled) {
> + } else if (cstate.enabled && state->polarity == PWM_POLARITY_NORMAL) {
> + /*
> +  * When disabled in hardware the output pin goes to 0
> +  * independant of the polarity setting. The expectation of some
> +  * people however is that after disabling the pin goes to the
> +  * inactive level which isn't given for an inversed pwm, so
> +  * only disable for normal polarity.
> +  */
>   writel(0, imx->mmio_base + MX3_PWMCR);
>   
>   clk_disable_unprepare(imx->clk_per);

I tested your patch. It does not work as you expected.

In v4.20-rc1 the pwm-backlight driver has been converted to atomic API.
So the pwm_apply_v2 function is called only once to set new period/duty
and state. With your patch that means that "echo 0 > brightness" has no
visible effect. It leaves the PWM chip enabled with period/duty set to
however it was. But the core thinks it was reconfigured:

# cat /sys/class/backlight/backight/brightness
0

# cat /sys/kernel/debug/pwm
platform/208.pwm, 1 PWM device
  pwm-0   (backlight   ): requested period: 50 ns duty: 0 ns 
polarity: inverse

> I think it solves most if not all problems you want to address with the
> pinctrl stuff.

Unfortunately not. I also tested your patch on v4.19. It works as you
probably intended - it is possible to disable backlight without the PWM
chip being disabled. But it does not solve the time frame between
imx_pwm_probe() and imx_pwm_apply_v2().

In probe you do not have any users yet. So you do not know the requested
output polarity. With "default" pinctrl the PWM output would be muxed to
the selected pin and since the PWM chip is most probably disabled
(unless you enabled it in bootloader) you would get low level on the pin.
That means your backlight is fully enabled until the first call to
imx_pwm_apply_v2(). On my system this is 2 seconds.

It might not be a big issue for backlight but for motor control it is
not the right thing to do.

The other thing is I would prefer to make the change optional. With your
approach you are changing the behavior for all current users of inverted
PWM. I do not think all imx6 users are aware of the problem so they might
not be OK with the sudden change in the behavior.

I agree it would be nice if such a simple change as you proposed would
solve the problem. Still, I do not see any other solution than pinctrl
to deal with all the problems I would like to address.

Anyway, thank you for the idea!

All the best,
Michal

[PATCH v1 0/1] pinctrl: nuvoton: modify NPCM7xx pin configuration

2018-11-07 Thread Tomer Maimon

This patch Modify GPIO direction setting in pin configuration function.

please refer patch:
Kun Yi https://patchwork.ozlabs.org/patch/985540/

Tomer Maimon (1):
  pinctrl: nuvoton: modify NPCM7xx pin configuration function

 drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

-- 
2.14.1

[PATCH trivial] Documentation: ABI: led-trigger-pattern: Fix typos

2018-11-07 Thread Geert Uytterhoeven

  - Spelling s/brigntess/brightness/,
  - Double "use".

Signed-off-by: Geert Uytterhoeven 
---
 Documentation/ABI/testing/sysfs-class-led-trigger-pattern | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-pattern 
b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
index fb3d1e03b8819bb9..1e5d172e064624d9 100644
--- a/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
+++ b/Documentation/ABI/testing/sysfs-class-led-trigger-pattern
@@ -37,8 +37,8 @@ Description:
  0-|   / \/ \/
+---0123456> time (s)
 
-   2. To make the LED go instantly from one brigntess value to 
another,
-   we should use use zero-time lengths (the brightness must be 
same as
+   2. To make the LED go instantly from one brightness value to 
another,
+   we should use zero-time lengths (the brightness must be same as
the previous tuple's). So the format should be:
"brightness_1 duration_1 brightness_1 0 brightness_2 duration_2
brightness_2 0 ...". For example:
-- 
2.17.1

[PATCH v1 1/1] pinctrl: nuvoton: modify NPCM7xx pin configuration function

2018-11-07 Thread Tomer Maimon

Modify GPIO direction setting in pin configuration function by using
generic GPIO functions to set the GPIO direction instead of direct
access to the GPIO direction register.

Signed-off-by: Tomer Maimon 
---
 drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c 
b/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
index 7ad50d9268aa..b455209382a5 100644
--- a/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
+++ b/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
@@ -1799,19 +1799,12 @@ static int npcm7xx_config_set_one(struct 
npcm7xx_pinctrl *npcm,
npcm_gpio_set(>gc, bank->base + NPCM7XX_GP_N_PU, gpio);
break;
case PIN_CONFIG_INPUT_ENABLE:
-   if (arg) {
-   iowrite32(gpio, bank->base + NPCM7XX_GP_N_OEC);
-   npcm_gpio_set(>gc, bank->base + NPCM7XX_GP_N_IEM,
- gpio);
-   } else
-   npcm_gpio_clr(>gc, bank->base + NPCM7XX_GP_N_IEM,
- gpio);
+   iowrite32(gpio, bank->base + NPCM7XX_GP_N_OEC);
+   bank->direction_input(>gc, pin % bank->gc.ngpio);
break;
case PIN_CONFIG_OUTPUT:
-   npcm_gpio_clr(>gc, bank->base + NPCM7XX_GP_N_IEM, gpio);
-   iowrite32(gpio, arg ? bank->base + NPCM7XX_GP_N_DOS :
- bank->base + NPCM7XX_GP_N_DOC);
iowrite32(gpio, bank->base + NPCM7XX_GP_N_OES);
+   bank->direction_output(>gc, pin % bank->gc.ngpio, arg);
break;
case PIN_CONFIG_DRIVE_PUSH_PULL:
npcm_gpio_clr(>gc, bank->base + NPCM7XX_GP_N_OTYP, gpio);
-- 
2.14.1

Re: [PATCH 6/7] ext4: lost brelse in ext4_xattr_move_to_block()

2018-11-07 Thread Jan Kara

On Wed 31-10-18 22:13:00, Vasily Averin wrote:
> Fixes 3f2571c1f91f ("ext4: factor out xattr moving")
> cc: Jan Kara 
> however issue was present in original ext4_expand_extra_isize_ea()
> Fixes 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...") # 2.6.23
> cc: Kalpak Shah 
> 
> Signed-off-by: Vasily Averin 

Good catch. Thanks for the fix. The patch looks good. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  fs/ext4/xattr.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 07b9a335c8eb..5c9bc0d85cc0 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -2617,6 +2617,8 @@ static int ext4_xattr_move_to_block(handle_t *handle, 
> struct inode *inode,
>   kfree(buffer);
>   if (is)
>   brelse(is->iloc.bh);
> + if (bs)
> + brelse(bs->bh);
>   kfree(is);
>   kfree(bs);
>  
> -- 
> 2.17.1
> 
-- 
Jan Kara 
SUSE Labs, CR

Re: [PATCH v5 02/15] sched/core: make sched_setattr able to tune the current policy

2018-11-07 Thread Patrick Bellasi

On 07-Nov 13:11, Peter Zijlstra wrote:
> On Mon, Oct 29, 2018 at 06:32:56PM +, Patrick Bellasi wrote:
> 
> > @@ -50,11 +52,13 @@
> >  #define SCHED_FLAG_RESET_ON_FORK   0x01
> >  #define SCHED_FLAG_RECLAIM 0x02
> >  #define SCHED_FLAG_DL_OVERRUN  0x04
> > -#define SCHED_FLAG_UTIL_CLAMP  0x08
> > +#define SCHED_FLAG_TUNE_POLICY 0x08
> > +#define SCHED_FLAG_UTIL_CLAMP  0x10
> 
> That seems to suggest you want to do this patch first, but you didn't..

I've kept it here just to better highlight this change, suggested by
Juri, since we was not entirely sure you are fine with it...

If you think it's ok adding a SCHED_FLAG_TUNE_POLICY behavior to the
sched_setattr syscall, I can certainly squash into the previous patch,
which gives more context on why we need it.

Otherwise, if we want to keep these bits better isolated for possible
future bisects, I can also move it at the beginning of the series.

What do you like best ?

Since we are at that, are we supposed to document some{where,how}
these API changes ?

-- 
#include 

Patrick Bellasi

Re: [PATCH v5 02/15] sched/core: make sched_setattr able to tune the current policy

2018-11-07 Thread Peter Zijlstra

On Wed, Nov 07, 2018 at 01:50:39PM +, Patrick Bellasi wrote:
> On 07-Nov 13:11, Peter Zijlstra wrote:
> > On Mon, Oct 29, 2018 at 06:32:56PM +, Patrick Bellasi wrote:
> > 
> > > @@ -50,11 +52,13 @@
> > >  #define SCHED_FLAG_RESET_ON_FORK 0x01
> > >  #define SCHED_FLAG_RECLAIM   0x02
> > >  #define SCHED_FLAG_DL_OVERRUN0x04
> > > -#define SCHED_FLAG_UTIL_CLAMP0x08
> > > +#define SCHED_FLAG_TUNE_POLICY   0x08
> > > +#define SCHED_FLAG_UTIL_CLAMP0x10
> > 
> > That seems to suggest you want to do this patch first, but you didn't..
> 
> I've kept it here just to better highlight this change, suggested by
> Juri, since we was not entirely sure you are fine with it...
> 
> If you think it's ok adding a SCHED_FLAG_TUNE_POLICY behavior to the
> sched_setattr syscall, I can certainly squash into the previous patch,
> which gives more context on why we need it.

I'm fine with the idea I think. It's the details I worry about. Which
fields in particular are not updated with this. Are flags?

Also, I'm not too keen on the name; since it explicitly does not modify
the policy and its related parameters, so TUNE_POLICY is actively wrong.

But the thing that confused me most is how fiddled the numbers to fit
this before UTIL_CLAMP.

> Since we are at that, are we supposed to document some{where,how}
> these API changes ?

I'm pretty sure there's a manpage somewhere... SCHED_SETATTR(2) seems to
exist on my machine. So that wants updates.

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Patrick Bellasi

On 07-Nov 14:16, Peter Zijlstra wrote:
> On Mon, Oct 29, 2018 at 06:32:57PM +, Patrick Bellasi wrote:
> 
> > +static void uclamp_group_put(unsigned int clamp_id, unsigned int group_id)
> >  {
> > +   union uclamp_map *uc_maps = _maps[clamp_id][0];
> > +   union uclamp_map uc_map_old, uc_map_new;
> > +   long res;
> > +
> > +retry:
> > +
> > +   uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> > +   uc_map_new = uc_map_old;
> > +   uc_map_new.se_count -= 1;
> > +   res = atomic_long_cmpxchg(_maps[group_id].adata,
> > + uc_map_old.data, uc_map_new.data);
> > +   if (res != uc_map_old.data)
> > +   goto retry;
> > +}
> 
> Please write cmpxchg loops in the form:
> 
>   atomic_long_t *ptr = _maps[clamp_id][group_id].adata;
>   union uclamp_map old, new;
> 
>   old.data = atomic_long_read(ptr);
>   do {
>   new.data = old.data;
>   new.se_cound--;
>   } while (!atomic_long_try_cmpxchg(ptr, , new.data));
> 
> 
> (same for all the others of course)

Ok, I did that to save some indentation, but actually it's most
commonly used in a while loop... will update in v6.

Out of curiosity, apart from code consistency, is that required also
specifically for any possible compiler related (mis)behavior ?

-- 
#include 

Patrick Bellasi

[PATCH v12 2/5] clk: imx: add fractional PLL output clock

2018-11-07 Thread Abel Vesa

From: Lucas Stach 

This is a new fractional clock type introduced on i.MX8.

The description of this fractional clock can be found here:

https://www.nxp.com/docs/en/reference-manual/IMX8MDQLQRM.pdf#page=834

Signed-off-by: Lucas Stach 
Signed-off-by: Abel Vesa 
Reviewed-by: Sascha Hauer 
---
 drivers/clk/imx/Makefile   |   1 +
 drivers/clk/imx/clk-frac-pll.c | 221 +
 drivers/clk/imx/clk.h  |   3 +
 3 files changed, 225 insertions(+)
 create mode 100644 drivers/clk/imx/clk-frac-pll.c

diff --git a/drivers/clk/imx/Makefile b/drivers/clk/imx/Makefile
index 8c3baa7..4893c1f 100644
--- a/drivers/clk/imx/Makefile
+++ b/drivers/clk/imx/Makefile
@@ -6,6 +6,7 @@ obj-y += \
clk-cpu.o \
clk-fixup-div.o \
clk-fixup-mux.o \
+   clk-frac-pll.o \
clk-gate-exclusive.o \
clk-gate2.o \
clk-pllv1.o \
diff --git a/drivers/clk/imx/clk-frac-pll.c b/drivers/clk/imx/clk-frac-pll.c
new file mode 100644
index 000..a800093
--- /dev/null
+++ b/drivers/clk/imx/clk-frac-pll.c
@@ -0,0 +1,221 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2018 NXP.
+ *
+ * This driver supports the fractional plls found in the imx8m SOCs
+ *
+ * Documentation for this fractional pll can be found at:
+ *   https://www.nxp.com/docs/en/reference-manual/IMX8MDQLQRM.pdf#page=834
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PLL_CFG0   0x0
+#define PLL_CFG1   0x4
+
+#define PLL_LOCK_STATUSBIT(31)
+#define PLL_PD_MASKBIT(19)
+#define PLL_BYPASS_MASKBIT(14)
+#define PLL_NEWDIV_VAL BIT(12)
+#define PLL_NEWDIV_ACK BIT(11)
+#define PLL_FRAC_DIV_MASK  GENMASK(30, 7)
+#define PLL_INT_DIV_MASK   GENMASK(6, 0)
+#define PLL_OUTPUT_DIV_MASKGENMASK(4, 0)
+#define PLL_FRAC_DENOM 0x100
+
+#define PLL_FRAC_LOCK_TIMEOUT  1
+#define PLL_FRAC_ACK_TIMEOUT   50
+
+struct clk_frac_pll {
+   struct clk_hw   hw;
+   void __iomem*base;
+};
+
+#define to_clk_frac_pll(_hw) container_of(_hw, struct clk_frac_pll, hw)
+
+static int clk_wait_lock(struct clk_frac_pll *pll)
+{
+   u32 val;
+
+   return readl_poll_timeout(pll->base, val, val & PLL_LOCK_STATUS, 0,
+   PLL_FRAC_LOCK_TIMEOUT);
+}
+
+static int clk_wait_ack(struct clk_frac_pll *pll)
+{
+   u32 val;
+
+   /* return directly if the pll is in powerdown or in bypass */
+   if (readl_relaxed(pll->base) & (PLL_PD_MASK | PLL_BYPASS_MASK))
+   return 0;
+
+   /* Wait for the pll's divfi and divff to be reloaded */
+   return readl_poll_timeout(pll->base, val, val & PLL_NEWDIV_ACK, 0,
+   PLL_FRAC_ACK_TIMEOUT);
+}
+
+static int clk_pll_prepare(struct clk_hw *hw)
+{
+   struct clk_frac_pll *pll = to_clk_frac_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   val &= ~PLL_PD_MASK;
+   writel_relaxed(val, pll->base + PLL_CFG0);
+
+   return clk_wait_lock(pll);
+}
+
+static void clk_pll_unprepare(struct clk_hw *hw)
+{
+   struct clk_frac_pll *pll = to_clk_frac_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   val |= PLL_PD_MASK;
+   writel_relaxed(val, pll->base + PLL_CFG0);
+}
+
+static int clk_pll_is_prepared(struct clk_hw *hw)
+{
+   struct clk_frac_pll *pll = to_clk_frac_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   return (val & PLL_PD_MASK) ? 0 : 1;
+}
+
+static unsigned long clk_pll_recalc_rate(struct clk_hw *hw,
+unsigned long parent_rate)
+{
+   struct clk_frac_pll *pll = to_clk_frac_pll(hw);
+   u32 val, divff, divfi, divq;
+   u64 temp64 = parent_rate;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   divq = ((val & PLL_OUTPUT_DIV_MASK) + 1) * 2;
+   val = readl_relaxed(pll->base + PLL_CFG1);
+   divff = FIELD_GET(PLL_FRAC_DIV_MASK, val);
+   divfi = val & PLL_INT_DIV_MASK;
+
+   temp64 *= 8;
+   temp64 *= divff;
+   do_div(temp64, PLL_FRAC_DENOM);
+   temp64 /= divq;
+
+   return parent_rate * 8 * (divfi + 1) / divq + (unsigned long)temp64;
+}
+
+static long clk_pll_round_rate(struct clk_hw *hw, unsigned long rate,
+  unsigned long *prate)
+{
+   u64 parent_rate = *prate;
+   u32 divff, divfi;
+   u64 temp64;
+
+   parent_rate *= 8;
+   rate *= 2;
+   divfi = rate / parent_rate;
+   temp64 = rate - divfi * parent_rate;
+   temp64 *= PLL_FRAC_DENOM;
+   do_div(temp64, parent_rate);
+   divff = temp64;
+
+   temp64 = parent_rate;
+   temp64 *= divff;
+   do_div(temp64, PLL_FRAC_DENOM);
+
+   return (parent_rate * divfi + temp64) / 2;
+}
+
+/*
+ * To simplify the clock calculation, we can keep the 'PLL_OUTPUT_VAL' at zero
+ * (means the PLL

[PATCH v12 4/5] clk: imx: Add imx composite clock

2018-11-07 Thread Abel Vesa

Since a lot of clocks on imx8m are formed by a mux, gate, predivider and
divider, the idea here is to combine all of those into one composite clock,
but we need to deal with both predivider and divider at the same time and
therefore we add the imx8m_clk_composite_divider_ops and register
the composite clock with those.

Signed-off-by: Abel Vesa 
Suggested-by: Sascha Hauer 
Reviewed-by: Sascha Hauer 
---
 drivers/clk/imx/Makefile   |   1 +
 drivers/clk/imx/clk-composite-8m.c | 178 +
 drivers/clk/imx/clk.h  |  16 
 3 files changed, 195 insertions(+)
 create mode 100644 drivers/clk/imx/clk-composite-8m.c

diff --git a/drivers/clk/imx/Makefile b/drivers/clk/imx/Makefile
index b87513c..237444b 100644
--- a/drivers/clk/imx/Makefile
+++ b/drivers/clk/imx/Makefile
@@ -3,6 +3,7 @@
 obj-y += \
clk.o \
clk-busy.o \
+   clk-composite-8m.o \
clk-cpu.o \
clk-fixup-div.o \
clk-fixup-mux.o \
diff --git a/drivers/clk/imx/clk-composite-8m.c 
b/drivers/clk/imx/clk-composite-8m.c
new file mode 100644
index 000..bcd31d8
--- /dev/null
+++ b/drivers/clk/imx/clk-composite-8m.c
@@ -0,0 +1,178 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2018 NXP
+ */
+
+#include 
+#include 
+#include 
+
+#include "clk.h"
+
+#define PCG_PREDIV_SHIFT   16
+#define PCG_PREDIV_WIDTH   3
+#define PCG_PREDIV_MAX 8
+
+#define PCG_DIV_SHIFT  0
+#define PCG_DIV_WIDTH  6
+#define PCG_DIV_MAX64
+
+#define PCG_PCS_SHIFT  24
+#define PCG_PCS_MASK   0x7
+
+#define PCG_CGC_SHIFT  28
+
+static unsigned long imx8m_clk_composite_divider_recalc_rate(struct clk_hw *hw,
+   unsigned long parent_rate)
+{
+   struct clk_divider *divider = to_clk_divider(hw);
+   unsigned long prediv_rate;
+   unsigned int prediv_value;
+   unsigned int div_value;
+
+   prediv_value = readl(divider->reg) >> divider->shift;
+   prediv_value &= clk_div_mask(divider->width);
+
+   prediv_rate = divider_recalc_rate(hw, parent_rate, prediv_value,
+   NULL, divider->flags,
+   divider->width);
+
+   div_value = readl(divider->reg) >> PCG_DIV_SHIFT;
+   div_value &= clk_div_mask(PCG_DIV_WIDTH);
+
+   return divider_recalc_rate(hw, prediv_rate, div_value, NULL,
+  divider->flags, PCG_DIV_WIDTH);
+}
+
+static int imx8m_clk_composite_compute_dividers(unsigned long rate,
+   unsigned long parent_rate,
+   int *prediv, int *postdiv)
+{
+   int div1, div2;
+   int error = INT_MAX;
+   int ret = -EINVAL;
+
+   *prediv = 1;
+   *postdiv = 1;
+
+   for (div1 = 1; div1 <= PCG_PREDIV_MAX; div1++) {
+   for (div2 = 1; div2 <= PCG_DIV_MAX; div2++) {
+   int new_error = ((parent_rate / div1) / div2) - rate;
+
+   if (abs(new_error) < abs(error)) {
+   *prediv = div1;
+   *postdiv = div2;
+   error = new_error;
+   ret = 0;
+   }
+   }
+   }
+   return ret;
+}
+
+static long imx8m_clk_composite_divider_round_rate(struct clk_hw *hw,
+   unsigned long rate,
+   unsigned long *prate)
+{
+   int prediv_value;
+   int div_value;
+
+   imx8m_clk_composite_compute_dividers(rate, *prate,
+   _value, _value);
+   rate = DIV_ROUND_UP(*prate, prediv_value);
+
+   return DIV_ROUND_UP(rate, div_value);
+
+}
+
+static int imx8m_clk_composite_divider_set_rate(struct clk_hw *hw,
+   unsigned long rate,
+   unsigned long parent_rate)
+{
+   struct clk_divider *divider = to_clk_divider(hw);
+   unsigned long flags = 0;
+   int prediv_value;
+   int div_value;
+   int ret = 0;
+   u32 val;
+
+   ret = imx8m_clk_composite_compute_dividers(rate, parent_rate,
+   _value, _value);
+   if (ret)
+   return -EINVAL;
+
+   spin_lock_irqsave(divider->lock, flags);
+
+   val = readl(divider->reg);
+   val &= ~((clk_div_mask(divider->width) << divider->shift) |
+   (clk_div_mask(PCG_DIV_WIDTH) << PCG_DIV_SHIFT));
+
+   val |= (u32)(prediv_value  - 1) << divider->shift;
+   val |= (u32)(div_value - 1) << PCG_DIV_SHIFT;
+   writel(val, divider->reg);
+
+   spin_unlock_irqrestore(divider->lock, flags);
+
+   return ret;
+}
+
+static const struct clk_ops imx8m_clk_composite_divider_ops = {

Re: [PATCH v5 2/2] sched/fair: update scale invariance of PELT

2018-11-07 Thread Vincent Guittot

On Wed, 7 Nov 2018 at 11:47, Dietmar Eggemann  wrote:
>
> On 11/5/18 10:10 AM, Vincent Guittot wrote:
> > On Fri, 2 Nov 2018 at 16:36, Dietmar Eggemann  
> > wrote:
> >>
> >> On 10/26/18 6:11 PM, Vincent Guittot wrote:
>
> [...]
>
> >> Thinking about this new approach on a big.LITTLE platform:
> >>
> >> CPU Capacities big: 1024 LITTLE: 512, performance CPUfreq governor
> >>
> >> A 50% (runtime/period) task on a big CPU will become an always running
> >> task on the little CPU. The utilization signal of the task and the
> >> cfs_rq of the little CPU converges to 1024.
> >>
> >> With contrib scaling the utilization signal of the 50% task converges to
> >> 512 on the little CPU, even it is always running on it, and so does the
> >> one of the cfs_rq.
> >>
> >> Two 25% tasks on a big CPU will become two 50% tasks on a little CPU.
> >> The utilization signal of the tasks converges to 512 and the one of the
> >> cfs_rq of the little CPU converges to 1024.
> >>
> >> With contrib scaling the utilization signal of the 25% tasks converges
> >> to 256 on the little CPU, even they run each 50% on it, and the one of
> >> the cfs_rq converges to 512.
> >>
> >> So what do we consider system-wide invariance? I thought that e.g. a 25%
> >> task should have a utilization value of 256 no matter on which CPU it is
> >> running?
> >>
> >> In both cases, the little CPU is not going idle whereas the big CPU does.
> >
> > IMO, the key point here is that there is no idle time. As soon as
> > there is no idle time, you don't know if a task has enough compute
> > capacity so you can't make difference between the 50% running task or
> > an always running task on the little core.
>
> Agreed. My '2 25% tasks on a 512 cpu' was a special example in the sense
> that the tasks would stay invariant since they are not restricted by the
> cpu capacity yet. '2 35% tasks' would also have 256 utilization each
> with contrib scaling so that's not invariant either.
>
> Could we say that in the overutilized case with contrib scaling each of
> the n tasks get cpu_cap/n utilization where with time scaling they get
> 1024/n utilization? Even though there is no value in this information
> because of the over-utilized state.
>
> > That's also interesting to noticed that the task will reach the always
> > running state after more than 600ms on little core with utilization
> > starting from 0.
> >
> > Then considering the system-wide invariance, the task are not really
> > invariant. If we take a 50% running task that run 40ms in a period of
> > 80ms, the max utilization of the task will be 721 on the big core and
> > 512 on the little core.
>
> Agreed, the utilization of the task on the big CPU oscillates between
> 721 and 321 so the average is still ~512.
>
> > Then, if you take a 39ms running task instead, the utilization on the
> > big core will reach 709 but it will be 507 on little core. So your
> > utilization depends on the current capacity.
>
> OK, but the average should be ~ 507 on big as well. There is idle time

I don't know about the average, the utilization varies between 709-292
on big core and between 507-486 in little core

> now even on the little CPU. But yeah, with longer period value, there
> are quite big amplitudes.
>
> > With the new proposal, the max utilization will be 709 on big and
> > little cores for the 39ms running task. For the 40ms running task, the
> > utilization will be 721 on big core. then if the task moves on the
> > little, it will reach the value 721 after 80ms,  then 900 after more
> > than 160ms and 1000 after 320ms
>
> We consider max values here? In this case, agreed. So this is a reminder

Yes, we consider max value as it's what is mainly used especially with
util_est feature

> that even if the average utilization of a task compared to the CPU
> capacity would mark the system as non-overutilized (39ms/80ms on a 512
> CPU), the utilization of that task looks different because of the
> oscillation which is pretty noticeable with long periods.
>
> The important bit for EAS is that it only uses utilization in the
> non-overutilized case. Here, utilization signals should look the same
> between the two approaches, not considering tasks with long periods like
> the 39/80ms example above.
> There are also some advantages for EAS with time scaling: (1) faster
> overutilization detection when a big task runs on a little CPU, (2)
> higher (initial) task utilization value when this task migrates from
> little to big CPU.
>
> We should run our EAS task placement tests with your time scaling patches.

Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page

2018-11-07 Thread Liran Alon




> On 7 Nov 2018, at 14:47, Paolo Bonzini  wrote:
> 
> On 07/11/2018 13:10, Alexander Potapenko wrote:
>> This appears to be a real bug in KVM.
>> Please see a simplified reproducer attached.
> 
> Thanks, I agree it's a reael bug.  The basic issue is that the
> kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE
> ioctl, aka 0x4080aebf.
> 
> One way to fix it would be to just change kmalloc to kzalloc when
> allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl
> is wrong and should be rejected.  And the case where a shadow VMCS has
> to be loaded is even more wrong, and we have to fix it anyway, so I
> don't really like the idea of papering over the bug in the allocation.
> 
> I'll test this patch and submit it formally:
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c645f777b425..c546f0b1f3e0 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct
> kvm_vcpu *vcpu,
>   if (ret)
>   return ret;
> 
> - /* Empty 'VMXON' state is permitted */
> - if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
> + /* Empty 'VMXON' state is permitted.  A partial VMCS12 is not.  */
> + if (kvm_state->size == sizeof(kvm_state))
>   return 0;
> 
> + if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE)
> + return -EINVAL;
> +

I don’t think that this test is sufficient to fully resolve issue.
What if malicious userspace supplies valid size but pages containing 
nested_state->vmcs12 is unmapped?
This will result in vmx_set_nested_state() still calling set_current_vmptr() 
but failing on copy_from_user()
which still leaks cached_vmcs12 on next VMPTRLD of guest.

Therefore, I think that the correct patch should be to change 
vmx_set_nested_state() to
first gather all relevant information from userspace and validate it,
and only then start applying it to KVM’s internal vCPU state.

>   if (kvm_state->vmx.vmcs_pa != -1ull) {
>   if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
>   !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
> @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
> *vcpu,
>   }
> 
>   vmcs12 = get_vmcs12(vcpu);
> + BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE);

Why put this BUILD_BUG_ON() specifically here?
There are many places which assumes cached_vmcs12 is of size VMCS12_SIZE.
(Such as nested_release_vmcs12() and handle_vmptrld()).

>   if (copy_from_user(vmcs12, user_kvm_nested_state->data, 
> sizeof(*vmcs12)))
>   return -EFAULT;
> 
> @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
> *vcpu,
>   if (nested_cpu_has_shadow_vmcs(vmcs12) &&
>   vmcs12->vmcs_link_pointer != -1ull) {
>   struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
> - if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
> + if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE)
>   return -EINVAL;
> 
>   if (copy_from_user(shadow_vmcs12,
> 
> Paolo

-Liran

Re: [PATCH] irq/timings: Fix model validity

2018-11-07 Thread Peter Zijlstra

On Wed, Nov 07, 2018 at 11:52:31AM +0100, Daniel Lezcano wrote:
> > @@ -146,11 +152,38 @@ static void irqs_update(struct irqt_stat *irqs, u64 
> > ts)
> >  */
> > diff = interval - irqs->avg;
> >  
> > +   /*
> > +* Online average algorithm:
> > +*
> > +*  new_average = average + ((value - average) / count)
> > +*
> > +* The variance computation depends on the new average
> > +* to be computed here first.
> > +*
> > +*/
> > +   irqs->avg = irqs->avg + (diff >> IRQ_TIMINGS_SHIFT);
> > +
> > +   /*
> > +* Online variance algorithm:
> > +*
> > +*  new_variance = variance + (value - average) x (value - new_average)
> > +*
> > +* Warning: irqs->avg is updated with the line above, hence
> > +* 'interval - irqs->avg' is no longer equal to 'diff'
> > +*/
> > +   irqs->variance = irqs->variance + (diff * (interval - irqs->avg));
> > +
> > /*
> >  * Increment the number of samples.
> >  */
> > irqs->nr_samples++;

FWIW, I'm confused on this. The normal (Welford's) online algorithm
does:

count++;
delta = value - mean;
mean += delta / count;
M2 += delta * (value - mean);

But the above uses:

mean += delta / 32;

Which, for count >> 32, over-estimates the mean adjustment. But worse,
it significantly under-estimates the mean during training.

How is the computed variance still correct with this? I can not find any
comments that clarifies this. I'm thinking that since the mean will
slowly wander towards it's actual location (assuming an actual standard
distribution input) the resulting variance will be far too large, since
the (value - mean) term will be much larger than 'expected'.

> > @@ -158,16 +191,12 @@ static void irqs_update(struct irqt_stat *irqs, u64 
> > ts)
> >  * more than 32 and dividing by 32 instead of 31 is enough
> >  * precise.
> >  */
> > +   variance = irqs->variance >> IRQ_TIMINGS_SHIFT;

Worse; variance is actually (as the comment states):

s^2 = M2 / (count -1)

But instead you compute:

s^2 = M2 / 32;

Which is again much larger than the actual result; assuming count >> 32.

So you compute a variance that is inflated in two different ways.

I'm not seeing how this thing works reliably.

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Peter Zijlstra

On Mon, Oct 29, 2018 at 06:32:57PM +, Patrick Bellasi wrote:
> +/**
> + * uclamp_group_get: increase the reference count for a clamp group
> + * @uc_se: the utilization clamp data for the task
> + * @clamp_id: the clamp index affected by the task
> + * @clamp_value: the new clamp value for the task
> + *
> + * Each time a task changes its utilization clamp value, for a specified 
> clamp
> + * index, we need to find an available clamp group which can be used to track
> + * this new clamp value. The corresponding clamp group index will be used to
> + * reference count the corresponding clamp value while the task is enqueued 
> on
> + * a CPU.
> + */
> +static void uclamp_group_get(struct uclamp_se *uc_se, unsigned int clamp_id,
> +  unsigned int clamp_value)
> +{
> + union uclamp_map *uc_maps = _maps[clamp_id][0];
> + unsigned int prev_group_id = uc_se->group_id;
> + union uclamp_map uc_map_old, uc_map_new;
> + unsigned int free_group_id;
> + unsigned int group_id;
> + unsigned long res;
> +
> +retry:
> +
> + free_group_id = UCLAMP_GROUPS;
> + for (group_id = 0; group_id < UCLAMP_GROUPS; ++group_id) {
> + uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> + if (free_group_id == UCLAMP_GROUPS && !uc_map_old.se_count)
> + free_group_id = group_id;
> + if (uc_map_old.value == clamp_value)
> + break;
> + }
> + if (group_id >= UCLAMP_GROUPS) {
> +#ifdef CONFIG_SCHED_DEBUG
> +#define UCLAMP_MAPERR "clamp value [%u] mapping to clamp group failed\n"
> + if (unlikely(free_group_id == UCLAMP_GROUPS)) {
> + pr_err_ratelimited(UCLAMP_MAPERR, clamp_value);
> + return;
> + }
> +#endif
> + group_id = free_group_id;
> + uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> + }

You forgot to check for refcount overflow here ;-)

And I'm not really a fan of hiding that error in a define like you keep
doing.

What's wrong with something like:

if (SCHED_WARN(free_group_id == UCLAMP_GROUPS))
return;

and

> + uc_map_new.se_count = uc_map_old.se_count + 1;

if (SCHED_WARN(!new.se_count))
new.se_count = -1;

> + uc_map_new.value = clamp_value;
> + res = atomic_long_cmpxchg(_maps[group_id].adata,
> +   uc_map_old.data, uc_map_new.data);
> + if (res != uc_map_old.data)
> + goto retry;
> +
> + /* Update SE's clamp values and attach it to new clamp group */
> + uc_se->value = clamp_value;
> + uc_se->group_id = group_id;
> +
> + /* Release the previous clamp group */
> + if (uc_se->mapped)
> + uclamp_group_put(clamp_id, prev_group_id);
> + uc_se->mapped = true;
> +}

[PATCH trivial] microblaze: Typo s/use use/use/

2018-11-07 Thread Geert Uytterhoeven

Signed-off-by: Geert Uytterhoeven 
---
 arch/microblaze/include/asm/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/microblaze/include/asm/pgtable.h 
b/arch/microblaze/include/asm/pgtable.h
index f64ebb9c9a413535..bdfb2b3182b04c3b 100644
--- a/arch/microblaze/include/asm/pgtable.h
+++ b/arch/microblaze/include/asm/pgtable.h
@@ -200,7 +200,7 @@ static inline pte_t pte_mkspecial(pte_t pte){ 
return pte; }
  * is cleared in the TLB miss handler before the TLB entry is loaded.
  * - All other bits of the PTE are loaded into TLBLO without
  *  * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
- * software PTE bits.  We actually use use bits 21, 24, 25, and
+ * software PTE bits.  We actually use bits 21, 24, 25, and
  * 30 respectively for the software bits: ACCESSED, DIRTY, RW, and
  * PRESENT.
  */
-- 
2.17.1

Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai  wrote:
> When queue_interrupt() is called from fuse_dev_do_write(),
> it came from userspace directly. Userspace may pass any
> request id, even the request's we have not interrupted
> (or even background's request). This patch adds sanity
> check to make kernel safe against that.

Okay, I understand this far.

> The problem is real interrupt may be queued and requeued
> by two tasks on two cpus. This case, the requeuer has not
> guarantees it sees FR_INTERRUPTED bit on its cpu (since
> we know nothing about the way userspace manages requests
> between its threads and whether it uses smp barriers).

This sounds like BS.   There's an explicit  smp_mb__after_atomic()
after the set_bit(FR_INTERRUPTED,...).  Which means FR_INTERRUPTED is
going to be visible on all CPU's after this, no need to fool around
with setting FR_INTERRUPTED again, etc...



>
> To eliminate this problem, queuer writes FR_INTERRUPTED
> bit again under fiq->waitq.lock, and this guarantees
> requeuer will see the bit, when checks it.
>
> I try to introduce solution, which does not affect on
> performance, and which does not force to take more
> locks. This is the reason, the below solution is worse:
>
>request_wait_answer()
>{
>  ...
>   +  spin_lock(>waitq.lock);
>  set_bit(FR_INTERRUPTED, >flags);
>   +  spin_unlock(>waitq.lock);
>  ...
>}
>
> Also, it does not look a better idea to extend fuse_dev_do_read()
> with the fix, since it's already a big function:
>
>fuse_dev_do_read()
>{
>  ...
>  if (test_bit(FR_INTERRUPTED, >flags)) {
>   +  /* Comment */
>   +  barrier();
>   +  set_bit(FR_INTERRUPTED, >flags);
>  queue_interrupt(fiq, req);
>  }
>  ...
>}
>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dev.c |   26 +-
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 315d395d5c02..3bfc5ed61c9a 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct 
> fuse_req *req)
> fuse_put_request(fc, req);
>  }
>
> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>  {
> bool kill = false;
>
> if (test_bit(FR_FINISHED, >flags))
> -   return;
> +   return 0;
> spin_lock(>waitq.lock);
> +   /* Check for we've sent request to interrupt this req */
> +   if (unlikely(!test_bit(FR_INTERRUPTED, >flags))) {
> +   spin_unlock(>waitq.lock);
> +   return -EINVAL;
> +   }
> +   /*
> +* Interrupt may be queued from fuse_dev_do_read(), and
> +* later requeued on other cpu by fuse_dev_do_write().
> +* To make FR_INTERRUPTED bit visible for the requeuer
> +* (under fiq->waitq.lock) we write it once again.
> +*/
> +   barrier();
> +   __set_bit(FR_INTERRUPTED, >flags);
> +
> if (list_empty(>intr_entry)) {
> list_add_tail(>intr_entry, >interrupts);
> /*
> @@ -492,7 +506,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, 
> struct fuse_req *req)
> if (test_bit(FR_FINISHED, >flags)) {
> list_del_init(>intr_entry);
> spin_unlock(>waitq.lock);
> -   return;
> +   return 0;
> }
> wake_up_locked(>waitq);
> kill = true;
> @@ -500,6 +514,7 @@ static void queue_interrupt(struct fuse_iqueue *fiq, 
> struct fuse_req *req)
> spin_unlock(>waitq.lock);
> if (kill)
> kill_fasync(>fasync, SIGIO, POLL_IN);
> +   return (int)kill;
>  }
>
>  static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
> @@ -1959,8 +1974,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
> nbytes = -EINVAL;
> else if (oh.error == -ENOSYS)
> fc->no_interrupt = 1;
> -   else if (oh.error == -EAGAIN)
> -   queue_interrupt(>iq, req);
> +   else if (oh.error == -EAGAIN &&
> +queue_interrupt(>iq, req) < 0)
> +   nbytes = -EINVAL;
>
> fuse_put_request(fc, req);
> fuse_copy_finish(cs);
>

[PATCH] objtool: fix .cold. functions parent symbols search

2018-11-07 Thread Artem Savkov

The way it is currently done it is possible for read_symbols() to find the
same symbol as parent for ".cold" functions. This leads to a bunch of
complications such as func length being set to 0 and a segfault in
add_switch_table(). Fix by copying the search string instead of modifying
it in place.

Signed-off-by: Artem Savkov 
---
 tools/objtool/elf.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 6dbb9fae0f9d..781c8afb29b9 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -298,6 +298,7 @@ static int read_symbols(struct elf *elf)
/* Create parent/child links for any cold subfunctions */
list_for_each_entry(sec, >sections, list) {
list_for_each_entry(sym, >symbol_list, list) {
+   char *pname;
if (sym->type != STT_FUNC)
continue;
sym->pfunc = sym->cfunc = sym;
@@ -305,9 +306,9 @@ static int read_symbols(struct elf *elf)
if (!coldstr)
continue;
 
-   coldstr[0] = '\0';
-   pfunc = find_symbol_by_name(elf, sym->name);
-   coldstr[0] = '.';
+   pname = strndup(sym->name, coldstr - sym->name);
+   pfunc = find_symbol_by_name(elf, pname);
+   free(pname);
 
if (!pfunc) {
WARN("%s(): can't find parent function",
-- 
2.17.2

Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Andy Lutomirski




> On Nov 7, 2018, at 3:21 AM, Oleg Nesterov  wrote:
> 
>> On 11/07, Elvira Khabirova wrote:
>> 
>> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
>> has no reliable means to find out that the syscall was, in fact,
>> a compat syscall, and misidentifies it.
>> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> 
> Yes, this was discussed many times...
> 
> So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
> debugger can use PTRACE_GETEVENTMSG to get this info.

As I said before, I strongly object to the use of “compat” here.  Compat meant 
“not the kernel’s native syscall API — uses the 32-bit structure format 
instead”.  This does not have a sensible meaning to user code, especially in 
the case where the tracer is 32-bit.

> 
>> Secondly, ptracers also have to support a lot of arch-specific code for
>> obtaining information about the tracee. For some architectures, this
>> requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
>> argument and return value.
> 
> I am not sure about this change... I won't really argue, but imo this
> needs a separate patch.

Why?  Having a single struct that the tracer can read to get the full state is 
extremely helpful.

Also, we really want it to work for seccomp events as well as PTRACE_SYSCALL, 
and the event info trick doesn’t make sense for seccomp events.

> 
>> +#define PT_IN_SYSCALL_STOP0x0004/* task is in a syscall-stop */
> ...
>> -static inline int ptrace_report_syscall(struct pt_regs *regs)
>> +static inline int ptrace_report_syscall(struct pt_regs *regs,
>> +unsigned long message)
>> {
>>int ptrace = current->ptrace;
>> 
>>if (!(ptrace & PT_PTRACED))
>>return 0;
>> +current->ptrace |= PT_IN_SYSCALL_STOP;
>> 
>> +current->ptrace_message = message;
>>ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
>> 
>>/*
>> @@ -76,6 +79,7 @@ static inline int ptrace_report_syscall(struct pt_regs 
>> *regs)
>>current->exit_code = 0;
>>}
>> 
>> +current->ptrace &= ~PT_IN_SYSCALL_STOP;
>>return fatal_signal_pending(current);
> ...
> 
>> +case PTRACE_GET_SYSCALL_INFO:
>> +if (child->ptrace & PT_IN_SYSCALL_STOP)
>> +ret = ptrace_get_syscall(child, datavp);
>> +break;
> 
> Why? If debugger uses PTRACE_O_TRACESYSGOOD it can know if the tracee reported
> syscall entry/exit or not. PTRACE_GET_SYSCALL_INFO is pointless if not, but
> nothing bad can happen.
> 
> 

I think it’s considerably nicer to the user to avoid reporting garbage if the 
user misused the API.  (And Elvira got this right in the patch — I just missed 
it.)

>

Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.

2018-11-07 Thread Petr Mladek

On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote:
> How to use this API:
> 
>   (1) Call get_printk_buffer() and acquire "struct printk_buffer *".
> 
>   (2) Rewrite printk() calls in the following way. The "ptr" is
>   "struct printk_buffer *" obtained in step (1).
> 
>   printk(fmt, ...) => printk_buffered(ptr, fmt, ...)
>   vprintk(fmt, args)   => vprintk_buffered(ptr, fmt, args)
>   pr_emerg(fmt, ...)   => bpr_emerg(ptr, fmt, ...)
>   pr_alert(fmt, ...)   => bpr_alert(ptr, fmt, ...)
>   pr_crit(fmt, ...)=> bpr_crit(ptr, fmt, ...)
>   pr_err(fmt, ...) => bpr_err(ptr, fmt, ...)
>   pr_warning(fmt, ...) => bpr_warning(ptr, fmt, ...)
>   pr_warn(fmt, ...)=> bpr_warn(ptr, fmt, ...)
>   pr_notice(fmt, ...)  => bpr_notice(ptr, fmt, ...)
>   pr_info(fmt, ...)=> bpr_info(ptr, fmt, ...)
>   pr_cont(fmt, ...)=> bpr_cont(ptr, fmt, ...)

I am looking at the sample conversions. We actually won't need
bpr_cont(). We will use buffer_printk() instead.

Well, I think about renaming buffer_printk() to bprintk() or
define it as a wrapper at least.

Best Regards,
Petr

Re: [PATCH] ACPI / PMIC: xpower: fix IOSF_MBI dependency

2018-11-07 Thread Hans de Goede


Hi,

On 07-11-18 13:22, Rafael J. Wysocki wrote:

On Fri, Nov 2, 2018 at 12:07 PM Arnd Bergmann  wrote:


We still get a link failure with IOSF_MBI=m when the xpower driver
is built-in:

drivers/acpi/pmic/intel_pmic_xpower.o: In function 
`intel_xpower_pmic_update_power':
intel_pmic_xpower.c:(.text+0x4f2): undefined reference to 
`iosf_mbi_block_punit_i2c_access'
intel_pmic_xpower.c:(.text+0x5e2): undefined reference to 
`iosf_mbi_unblock_punit_i2c_access'

This makes the dependency stronger, so we can only build when IOSF_MBI
is built-in.

Fixes: 6a9b593d4b6f ("ACPI / PMIC: xpower: Add depends on IOSF_MBI to Kconfig 
entry")
Signed-off-by: Arnd Bergmann 
---
  drivers/acpi/Kconfig | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 18851e7eedd5..31a3c4a03f61 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -514,7 +514,7 @@ config CRC_PMIC_OPREGION

  config XPOWER_PMIC_OPREGION
 bool "ACPI operation region support for XPower AXP288 PMIC"
-   depends on MFD_AXP20X_I2C && IOSF_MBI
+   depends on MFD_AXP20X_I2C && IOSF_MBI=y
 help
   This config adds ACPI operation region support for XPower AXP288 
PMIC.

--


At this point I'm inclined to apply the patch as is as a short-term
fix and improvements can be made on top of it.

Any objections?


Not from me, go for it :)

Regards,

Hans

Re: [PATCH 1/6] fuse: Kill fasync only if interrupt is queued in queue_interrupt()

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai  wrote:
> We should sent signal only in case of interrupt is really queued.
> Not a real problem, but this makes the code clearer and intuitive.
>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dev.c |6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index fb2530ed84b3..7705f75c77a3 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -468,6 +468,8 @@ static void request_end(struct fuse_conn *fc, struct 
> fuse_req *req)
>
>  static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>  {
> +   bool kill = false;
> +
> spin_lock(>waitq.lock);
> if (test_bit(FR_FINISHED, >flags)) {
> spin_unlock(>waitq.lock);
> @@ -476,9 +478,11 @@ static void queue_interrupt(struct fuse_iqueue *fiq, 
> struct fuse_req *req)
> if (list_empty(>intr_entry)) {
> list_add_tail(>intr_entry, >interrupts);
> wake_up_locked(>waitq);
> +   kill = true;
> }
> spin_unlock(>waitq.lock);
> -   kill_fasync(>fasync, SIGIO, POLL_IN);
> +   if (kill)
> +   kill_fasync(>fasync, SIGIO, POLL_IN);

All other cases just do the kill_fasync() inside the fiq->waitq.lock
locked region.  That seems the simpler and more readable solution to
this.

Thanks,
Miklos

Re: [PATCH] arm64: dts: meson-gx: Add hdmi_5v regulator as hdmi tx supply

2018-11-07 Thread Neil Armstrong

On 05/11/2018 17:21, Neil Armstrong wrote:
> The hdmi_5v regulator must be enabled to provide power to the physical HDMI
> PHY and enables the HDMI 5V presence loopback for the monitor.
> 
> Fixes: b409f625a6d5 ("ARM64: dts: meson-gx: Add HDMI_5V regulator on selected 
> boards")
> Signed-off-by: Neil Armstrong 
> ---
>  arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi  | 1 +
>  arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts   | 1 +
>  arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dts | 1 +
>  arch/arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts | 1 +
>  arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts| 1 +
>  5 files changed, 5 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi 
> b/arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi
> index 765247b..eb8eec8 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-gx-p23x-q20x.dtsi
> @@ -125,6 +125,7 @@
>   status = "okay";
>   pinctrl-0 = <_hpd_pins>, <_i2c_pins>;
>   pinctrl-names = "default";
> + phy-supply = <_5v>;
>  };
>  
>  _tx_tmds_port {
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts 
> b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts
> index d32cf38..69c28c4 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-khadas-vim.dts
> @@ -78,6 +78,7 @@
>   status = "okay";
>   pinctrl-0 = <_hpd_pins>, <_i2c_pins>;
>   pinctrl-names = "default";
> + phy-supply = <_5v>;
>  };
>  
>  _tx_tmds_port {
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dts 
> b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dts
> index 90a56af..86a7be8 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dts
> @@ -155,6 +155,7 @@
>   status = "okay";
>   pinctrl-0 = <_hpd_pins>, <_i2c_pins>;
>   pinctrl-names = "default";
> + phy-supply = <_5v>;
>  };
>  
>  _tx_tmds_port {
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts 
> b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts
> index 5896e8a..8a39d95 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-gxl-s905x-p212.dts
> @@ -51,6 +51,7 @@
>   status = "okay";
>   pinctrl-0 = <_hpd_pins>, <_i2c_pins>;
>   pinctrl-names = "default";
> + phy-supply = <_5v>;
>  };
>  
>  _tx_tmds_port {
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts 
> b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
> index 313f88f..50d2be1 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
> +++ b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
> @@ -271,6 +271,7 @@
>   status = "okay";
>   pinctrl-0 = <_hpd_pins>, <_i2c_pins>;
>   pinctrl-names = "default";
> + phy-supply = <_5v>;
>  };
>  
>  _tx_tmds_port {
> 


Oops, it should be hdmi_supply...

I'll resend..

neil

Re: [PATCH v3] staging: olpc_dcon: olpc_dcon_xo_1.c: Switch to the gpio descriptor interface

2018-11-07 Thread Nishad Kamdar

On Wed, Nov 07, 2018 at 12:36:52PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Nov 06, 2018 at 01:13:19PM +0530, Nishad Kamdar wrote:
> > Use the gpiod interface instead of the deprecated old non-descriptor
> > interface in olpc_dcon_xo_1.c.
> > ---
> > Changes in v3:
> >  - Resolve a few compilation errors.
> > Changes in v2:
> >  - Resolve a few compilation errors.
> >  - Add a level of indirection to read and write gpios.
> > Signed-off-by: Nishad Kamdar 
> 
> The signed-off-by has to be above the --- line otherwise it will get
> stripped off by git when the patch is applied :(
> 
Ok, I'll change it.

Thanks for the review.

regards,
Nishad

[PATCH v4] staging: olpc_dcon: olpc_dcon_xo_1.c: Switch to the gpio descriptor interface

2018-11-07 Thread Nishad Kamdar

Use the gpiod interface instead of the deprecated old non-descriptor
interface in olpc_dcon_xo_1.c.

Signed-off-by: Nishad Kamdar 
---
Changes in v4:
 - Move changelog after signed-off line.
Changes in v3:
 - Resolve a few compilation errors.
Changes in v2:
 - Resolve a few compilation errors.
 - Add a level of indirection to read and write gpios.
---
 drivers/staging/olpc_dcon/olpc_dcon_xo_1.c | 90 +++---
 1 file changed, 47 insertions(+), 43 deletions(-)

diff --git a/drivers/staging/olpc_dcon/olpc_dcon_xo_1.c 
b/drivers/staging/olpc_dcon/olpc_dcon_xo_1.c
index ff145d493e1b..80b8d4153414 100644
--- a/drivers/staging/olpc_dcon/olpc_dcon_xo_1.c
+++ b/drivers/staging/olpc_dcon/olpc_dcon_xo_1.c
@@ -11,35 +11,51 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include 
-#include 
+#include 
 #include 
+#include 
 #include 
 
 #include "olpc_dcon.h"
 
+enum dcon_gpios {
+   OLPC_DCON_STAT0,
+   OLPC_DCON_STAT1,
+   OLPC_DCON_IRQ,
+   OLPC_DCON_LOAD,
+   OLPC_DCON_BLANK,
+};
+
+struct dcon_gpio {
+   const char *name;
+   unsigned long flags;
+};
+
+static const struct dcon_gpio gpios_asis[] = {
+   [OLPC_DCON_STAT0] = { .name = "dcon_stat0", .flags = GPIOD_ASIS },
+   [OLPC_DCON_STAT1] = { .name = "dcon_stat1", .flags = GPIOD_ASIS },
+   [OLPC_DCON_IRQ] = { .name = "dcon_irq", .flags = GPIOD_ASIS },
+   [OLPC_DCON_LOAD] = { .name = "dcon_load", .flags = GPIOD_ASIS },
+   [OLPC_DCON_BLANK] = { .name = "dcon_blank", .flags = GPIOD_ASIS },
+};
+
+struct gpio_desc *gpios[5];
+
 static int dcon_init_xo_1(struct dcon_priv *dcon)
 {
unsigned char lob;
-
-   if (gpio_request(OLPC_GPIO_DCON_STAT0, "OLPC-DCON")) {
-   pr_err("failed to request STAT0 GPIO\n");
-   return -EIO;
-   }
-   if (gpio_request(OLPC_GPIO_DCON_STAT1, "OLPC-DCON")) {
-   pr_err("failed to request STAT1 GPIO\n");
-   goto err_gp_stat1;
-   }
-   if (gpio_request(OLPC_GPIO_DCON_IRQ, "OLPC-DCON")) {
-   pr_err("failed to request IRQ GPIO\n");
-   goto err_gp_irq;
-   }
-   if (gpio_request(OLPC_GPIO_DCON_LOAD, "OLPC-DCON")) {
-   pr_err("failed to request LOAD GPIO\n");
-   goto err_gp_load;
-   }
-   if (gpio_request(OLPC_GPIO_DCON_BLANK, "OLPC-DCON")) {
-   pr_err("failed to request BLANK GPIO\n");
-   goto err_gp_blank;
+   int ret, i;
+   struct dcon_gpio *pin = _asis[0];
+
+   for (i = 0; i < ARRAY_SIZE(gpios_asis); i++) {
+   gpios[i] = devm_gpiod_get(>client->dev, pin[i].name,
+ pin[i].flags);
+   if (IS_ERR(gpios[i])) {
+   ret = PTR_ERR(gpios[i]);
+   pr_err("failed to request %s GPIO: %d\n", pin[i].name,
+  ret);
+   return ret;
+   }
}
 
/* Turn off the event enable for GPIO7 just to be safe */
@@ -61,12 +77,12 @@ static int dcon_init_xo_1(struct dcon_priv *dcon)
dcon->pending_src = dcon->curr_src;
 
/* Set the directions for the GPIO pins */
-   gpio_direction_input(OLPC_GPIO_DCON_STAT0);
-   gpio_direction_input(OLPC_GPIO_DCON_STAT1);
-   gpio_direction_input(OLPC_GPIO_DCON_IRQ);
-   gpio_direction_input(OLPC_GPIO_DCON_BLANK);
-   gpio_direction_output(OLPC_GPIO_DCON_LOAD,
- dcon->curr_src == DCON_SOURCE_CPU);
+   gpiod_direction_input(gpios[OLPC_DCON_STAT0]);
+   gpiod_direction_input(gpios[OLPC_DCON_STAT1]);
+   gpiod_direction_input(gpios[OLPC_DCON_IRQ]);
+   gpiod_direction_input(gpios[OLPC_DCON_BLANK]);
+   gpiod_direction_output(gpios[OLPC_DCON_LOAD],
+  dcon->curr_src == DCON_SOURCE_CPU);
 
/* Set up the interrupt mappings */
 
@@ -84,7 +100,7 @@ static int dcon_init_xo_1(struct dcon_priv *dcon)
/* Register the interrupt handler */
if (request_irq(DCON_IRQ, _interrupt, 0, "DCON", dcon)) {
pr_err("failed to request DCON's irq\n");
-   goto err_req_irq;
+   return -EIO;
}
 
/* Clear INV_EN for GPIO7 (DCONIRQ) */
@@ -125,18 +141,6 @@ static int dcon_init_xo_1(struct dcon_priv *dcon)
cs5535_gpio_set(OLPC_GPIO_DCON_BLANK, GPIO_EVENTS_ENABLE);
 
return 0;
-
-err_req_irq:
-   gpio_free(OLPC_GPIO_DCON_BLANK);
-err_gp_blank:
-   gpio_free(OLPC_GPIO_DCON_LOAD);
-err_gp_load:
-   gpio_free(OLPC_GPIO_DCON_IRQ);
-err_gp_irq:
-   gpio_free(OLPC_GPIO_DCON_STAT1);
-err_gp_stat1:
-   gpio_free(OLPC_GPIO_DCON_STAT0);
-   return -EIO;
 }
 
 static void dcon_wiggle_xo_1(void)
@@ -180,13 +184,13 @@ static void dcon_wiggle_xo_1(void)
 
 static void dcon_set_dconload_1(int val)
 {
-   gpio_set_value(OLPC_GPIO_DCON_LOAD, val);
+   gpiod_set_value(gpios[OLPC_DCON_LOAD], val);
 }
 
 static int

Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.

2018-11-07 Thread Tetsuo Handa

On 2018/11/07 19:21, Petr Mladek wrote:
> On Tue 2018-11-06 23:35:02, Sergey Senozhatsky wrote:
>>> Since we want to remove "struct cont" eventually, we will try to remove
>>> both "implicit printk() users who are expecting KERN_CONT behavior" and
>>> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
>>> this API is recommended.
>>
>> - The printk-fallback sounds like a hint that the existing 'cont' handling
>>   better stay in the kernel. I don't see how the existing 'cont' is
>>   significantly worse than
>>  bpr_warn(NULL, ...)->printk() // no 'cont' support
>>   I don't see why would we want to do it, sorry. I don't see "it takes 16
>>   printk-buffers to make a thing go right" as a sure thing.
> 
> I see it the following way:
> 
>+ mixed cont lines are very rare but they happen
> 
>+ 16 buffers are more than 1 so it could only be better [*]
> 
>+ the printk_buffer() code is self-contained and does not
>  complicate the logic of the classic printk() code [**]
> 
> 
> [*] A missing put_printk_buffer() might cause that we would get
> out of buffers. But the same problem is with locks,
> disabled preemption, disabled interrupts, seq_buffer,
> alloc/free. Such problems happen but they are rare.
> 
> Also I do not expect that the same buffer would be shared
> between many functions. Therefore it should be easy
> to use it correctly.

Since we can allocate printk() buffer upon dup_task_struct()
and free it upon free_task_struct(), we can have enough printk()
buffers for task context. Also, since total number of exceptional
contexts (e.g. interrupts, oops) is finite, we can have enough
printk() buffers for exceptional contexts.

Is it possible to identify all locations which should use their own
printk() buffers (e.g. interrupt handlers, oops handlers) ? If yes,
despite Linus's objection, automatically switching printk() buffers
(like memalloc_nofs_save()/memalloc_nofs_restore() instead of
https://lkml.kernel.org/r/201709021512.djg00503.jhosoffmftv...@i-love.sakura.ne.jp
 )
will be easiest and least error prone.

[PATCH] regulator: bd718x7: Change next state after poweroff to ready

2018-11-07 Thread Matti Vaittinen

BD71837 and BD71847 have a HW functionality which leave power
rails OFF after powerof state:
- if they have been controlled by SW.
- if state transition from poweroff is done to SNVS

BD71837 can after reset transition from power-off to SNVS or
READY state depending on reset reason. By default only wathcdog
reset changes state from poweroff to ready. Change PMIC
configuration to always transition to READY in order to avoid
crucial power rails being OFF after reset.

If SNVS is required the crucial power rails should not be
controlled by SW - eg corresponding regulator control register
should have SEL bit kept zero. Currently the driver assumes all
regulators to be controlled by SW so it sets all SEL bits to 1.

Signed-off-by: Matti Vaittinen 
---
 drivers/regulator/bd718x7-regulator.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/regulator/bd718x7-regulator.c 
b/drivers/regulator/bd718x7-regulator.c
index 3a47e0372e77..63669f9038cc 100644
--- a/drivers/regulator/bd718x7-regulator.c
+++ b/drivers/regulator/bd718x7-regulator.c
@@ -1053,6 +1053,29 @@ static int bd718xx_probe(struct platform_device *pdev)
BD718XX_REG_REGLOCK);
}
 
+   /* At poweroff transition PMIC HW disables EN bit for regulators but
+* leaves SEL bit untouched. So if state transition from POWEROFF
+* is done to SNVS - then all power rails controlled by SW (having
+* SEL bit set) stay disabled as EN is cleared. This may result boot
+* failure if any crucial systems are powered by these rails.
+*
+* Change the next stage from poweroff to be READY instead of SNVS
+* for all reset types because OTP loading at READY will clear SEL
+* bit allowing HW defaults for power rails to be used
+*/
+   err = regmap_update_bits(mfd->regmap, BD718XX_REG_TRANS_COND1,
+BD718XX_ON_REQ_POWEROFF_MASK |
+BD718XX_SWRESET_POWEROFF_MASK |
+BD718XX_WDOG_POWEROFF_MASK |
+BD718XX_KEY_L_POWEROFF_MASK,
+BD718XX_POWOFF_TO_RDY);
+   if (err) {
+   dev_err(>dev, "Failed to change reset target\n");
+   goto err;
+   } else {
+   dev_dbg(>dev, "Changed all resets from SVNS to READY\n");
+   }
+
for (i = 0; i < pmic_regulators[mfd->chip_type].r_amount; i++) {
 
const struct regulator_desc *desc;
-- 
2.14.3

Re: [PATCH v6 1/3] printk: Add line-buffered printk() API.

2018-11-07 Thread Petr Mladek

On Fri 2018-11-02 22:31:55, Tetsuo Handa wrote:
> Sometimes we want to print a whole line without being disturbed by
> concurrent printk() from interrupts and/or other threads, for printk()
> which does not end with '\n' can be disturbed.
> 
> Since mixed printk() output makes it hard to interpret, this patch
> introduces API for line-buffered printk() output (so that we can make
> sure that printk() ends with '\n').
> 
> Since functions introduced by this patch are merely wrapping
> printk()/vprintk() calls in order to minimize possibility of using
> "struct cont", it is safe to replace printk()/vprintk() with this API.
> Since we want to remove "struct cont" eventually, we will try to remove
> both "implicit printk() users who are expecting KERN_CONT behavior" and
> "explicit pr_cont()/printk(KERN_CONT) users". Therefore, converting to
> this API is recommended.

[...]

> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index cf3eccf..92af345 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -173,6 +174,20 @@ int printk_emit(int facility, int level,
>  
>  asmlinkage __printf(1, 2) __cold
>  int printk(const char *fmt, ...);
> +struct printk_buffer *get_printk_buffer(void);
> +bool flush_printk_buffer(struct printk_buffer *ptr);
> +__printf(2, 3)
> +int printk_buffered(struct printk_buffer *ptr, const char *fmt, ...);
> +__printf(2, 0)
> +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list 
> args);
> +/*
> + * In order to avoid accidentally reusing "ptr" after 
> put_printk_buffer("ptr"),
> + * put_printk_buffer() is defined as a macro which explicitly resets "ptr" to
> + * NULL.
> + */
> +void __put_printk_buffer(struct printk_buffer *ptr);
> +#define put_printk_buffer(ptr)   \
> + do { __put_printk_buffer(ptr); ptr = NULL; } while (0)

I have mixed feelings about this. It is clever but it complicates the
code. I wonder why we need to be more paranoid than kfree() and
friends. Especially that the reuse of printk buffer is much less
dangerous than reusing freed pointers.

Please, remove it unless it gets more supporters.


> --- /dev/null
> +++ b/kernel/printk/printk_buffered.c
> +int vprintk_buffered(struct printk_buffer *ptr, const char *fmt, va_list 
> args)
> +{
> + va_list tmp_args;
> + int r;
> + int pos;
> +
> + if (!ptr)
> + return vprintk(fmt, args);
> + /*
> +  * Skip KERN_CONT here based on an assumption that KERN_CONT will be
> +  * given via "fmt" argument when KERN_CONT is given.
> +  */
> + pos = (printk_get_level(fmt) == 'c') ? 2 : 0;
> + while (true) {
> + va_copy(tmp_args, args);
> + r = vsnprintf(ptr->buf + ptr->len, sizeof(ptr->buf) - ptr->len,
> +   fmt + pos, tmp_args);
> + va_end(tmp_args);
> + if (likely(r + ptr->len < sizeof(ptr->buf)))
> + break;
> + if (!flush_printk_buffer(ptr))
> + return vprintk(fmt, args);
> + }
> + ptr->len += r;
> + /* Flush already completed lines if any. */
> + for (pos = ptr->len - 1; pos >= 0; pos--) {
> + if (ptr->buf[pos] != '\n')
> + continue;
> + ptr->buf[pos++] = '\0';
> + printk("%s\n", ptr->buf);
> + ptr->len -= pos;
> + memmove(ptr->buf, ptr->buf + pos, ptr->len);
> + /* This '\0' will be overwritten by next vsnprintf() above. */
> + ptr->buf[ptr->len] = '\0';

I am not against explicitly adding '\0' this way. Just note that there
is always the trailing '\0' in the original string, see below.

I mean that we could get the same result with memmove(,, ptr->len+1);


> + break;
> + }
> + return r;
> +}
> +
> +/**
> + * flush_printk_buffer - Flush incomplete line in printk_buffer.
> + *
> + * @ptr: Pointer to "struct printk_buffer". It can be NULL.
> + *
> + * Returns true if flushed something, false otherwise.
> + *
> + * Flush if @ptr contains partial data. But usually there is no need to call
> + * this function because @ptr is flushed by put_printk_buffer().
> + */
> +bool flush_printk_buffer(struct printk_buffer *ptr)
> +{
> + if (!ptr || !ptr->len)
> + return false;
> + /*
> +  * vprintk_buffered() keeps 0 <= ptr->len < sizeof(ptr->buf) true.
> +  * But ptr->buf[ptr->len] != '\0' if this function is called due to
> +  * vsnprintf() + ptr->len >= sizeof(ptr->buf).
> +  */
> + ptr->buf[ptr->len] = '\0';

This is not necessary. Note that vsnprintf() always adds the trailing
'\0' even when the string is shrunken.


> + printk("%s", ptr->buf);
> + ptr->len = 0;
> + return true;
> +}
> +EXPORT_SYMBOL(flush_printk_buffer);
> +
> +/**
> + * __put_printk_buffer - Release printk_buffer.
> + *
> + * @ptr: Pointer to "struct printk_buffer". It can be NULL.
> + *
> + * Returns nothing.
> + *
>

[PATCH v2] PCI: imx: Add imx6sx suspend/resume support

2018-11-07 Thread Leonard Crestez

Enable PCI suspend/resume support on imx6sx socs. This is similar to
imx7d with a few differences:

* The PM_Turn_Off bit is exposed through an IOMUX GPR, like all other
pcie control bits on 6sx.
* The pcie_inbound_axi clk needs to be turned off in suspend. On resume
it is restored via resume -> deassert_core_reset -> enable_ref_clk.

Most of the resume logic is shared with the initial reset after probe.

Signed-off-by: Leonard Crestez 

---
Changes since v1:
* Use a switch statement in imx6_pcie_pm_turnoff. The DT-based turnoff
path is still an if statement.
* Did not split imx6_pcie_clk_disable or call it from other paths, this
would bring complications and is somewhat unrelated.
* See v1 comments: https://lore.kernel.org/patchwork/patch/996806/

 drivers/pci/controller/dwc/pci-imx6.c   | 44 ++---
 include/linux/mfd/syscon/imx6q-iomuxc-gpr.h |  1 +
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-imx6.c 
b/drivers/pci/controller/dwc/pci-imx6.c
index 2cbef2d7c207..54625569d0bc 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -771,41 +771,75 @@ static void imx6_pcie_ltssm_disable(struct device *dev)
}
 }
 
 static void imx6_pcie_pm_turnoff(struct imx6_pcie *imx6_pcie)
 {
-   reset_control_assert(imx6_pcie->turnoff_reset);
-   reset_control_deassert(imx6_pcie->turnoff_reset);
+   struct device *dev = imx6_pcie->pci->dev;
+
+   /* Some variants have a turnoff reset in DT */
+   if (imx6_pcie->turnoff_reset) {
+   reset_control_assert(imx6_pcie->turnoff_reset);
+   reset_control_deassert(imx6_pcie->turnoff_reset);
+   goto pm_turnoff_sleep;
+   }
+
+   /* Others poke directly at IOMUXC registers */
+   switch (imx6_pcie->variant) {
+   case IMX6SX:
+   regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12,
+   IMX6SX_GPR12_PCIE_PM_TURN_OFF,
+   IMX6SX_GPR12_PCIE_PM_TURN_OFF);
+   regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12,
+   IMX6SX_GPR12_PCIE_PM_TURN_OFF, 0);
+   break;
+   default:
+   dev_err(dev, "PME_Turn_Off not implemented\n");
+   return;
+   }
 
/*
 * Components with an upstream port must respond to
 * PME_Turn_Off with PME_TO_Ack but we can't check.
 *
 * The standard recommends a 1-10ms timeout after which to
 * proceed anyway as if acks were received.
 */
+pm_turnoff_sleep:
usleep_range(1000, 1);
 }
 
 static void imx6_pcie_clk_disable(struct imx6_pcie *imx6_pcie)
 {
clk_disable_unprepare(imx6_pcie->pcie);
clk_disable_unprepare(imx6_pcie->pcie_phy);
clk_disable_unprepare(imx6_pcie->pcie_bus);
 
-   if (imx6_pcie->variant == IMX7D) {
+   switch (imx6_pcie->variant) {
+   case IMX6SX:
+   clk_disable_unprepare(imx6_pcie->pcie_inbound_axi);
+   break;
+   case IMX7D:
regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12,
   IMX7D_GPR12_PCIE_PHY_REFCLK_SEL,
   IMX7D_GPR12_PCIE_PHY_REFCLK_SEL);
+   break;
+   default:
+   break;
}
 }
 
+static inline bool imx6_pcie_supports_suspend(struct imx6_pcie *imx6_pcie)
+{
+   return (imx6_pcie->variant == IMX7D ||
+   imx6_pcie->variant == IMX6SX);
+}
+
 static int imx6_pcie_suspend_noirq(struct device *dev)
 {
struct imx6_pcie *imx6_pcie = dev_get_drvdata(dev);
 
-   if (imx6_pcie->variant != IMX7D)
+   if (!imx6_pcie_supports_suspend(imx6_pcie))
return 0;
 
imx6_pcie_pm_turnoff(imx6_pcie);
imx6_pcie_clk_disable(imx6_pcie);
imx6_pcie_ltssm_disable(dev);
@@ -817,11 +851,11 @@ static int imx6_pcie_resume_noirq(struct device *dev)
 {
int ret;
struct imx6_pcie *imx6_pcie = dev_get_drvdata(dev);
struct pcie_port *pp = _pcie->pci->pp;
 
-   if (imx6_pcie->variant != IMX7D)
+   if (!imx6_pcie_supports_suspend(imx6_pcie))
return 0;
 
imx6_pcie_assert_core_reset(imx6_pcie);
imx6_pcie_init_phy(imx6_pcie);
imx6_pcie_deassert_core_reset(imx6_pcie);
diff --git a/include/linux/mfd/syscon/imx6q-iomuxc-gpr.h 
b/include/linux/mfd/syscon/imx6q-iomuxc-gpr.h
index 6c1ad160ed87..c1b25f5e386d 100644
--- a/include/linux/mfd/syscon/imx6q-iomuxc-gpr.h
+++ b/include/linux/mfd/syscon/imx6q-iomuxc-gpr.h
@@ -438,10 +438,11 @@
 #define IMX6SX_GPR5_DISP_MUX_DCIC1_LCDIF1  (0x0 << 1)
 #define IMX6SX_GPR5_DISP_MUX_DCIC1_LVDS(0x1 << 1)
 #define IMX6SX_GPR5_DISP_MUX_DCIC1_MASK(0x1 << 1)
 
 #define IMX6SX_GPR12_PCIE_TEST_POWERDOWN   BIT(30)
+#define IMX6SX_GPR12_PCIE_PM_TURN_OFF

Re: [PATCH 2/3] mm: Use line-buffered printk() for show_free_areas().

2018-11-07 Thread Petr Mladek

On Fri 2018-11-02 22:31:56, Tetsuo Handa wrote:
> syzbot is sometimes getting mixed output like below due to concurrent
> printk(). Mitigate such output by using line-buffered printk() API.
> 
>   Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB 
>   syz-executor0: page allocation failure: order:0, 
> mode:0x484020(GFP_ATOMIC|__GFP_COMP), nodemask=(null)
>   (U) 
>   syz-executor0 cpuset=
>   2*64kB 
>   syz0
>   (U) 
>mems_allowed=0
>   1*128kB 
>   CPU: 0 PID: 7592 Comm: syz-executor0 Not tainted 4.19.0-rc6+ #118
>   (U) 
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
>   1*256kB (U) 
>   Call Trace:
>   0*512kB 
>
>   1*1024kB 
>__dump_stack lib/dump_stack.c:77 [inline]
>dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
>   (U) 
>   1*2048kB 
>warn_alloc.cold.119+0xb7/0x1bd mm/page_alloc.c:3426
>   (M) 
> 
> Signed-off-by: Tetsuo Handa 
> Cc: Andrew Morton 
> ---
>  mm/page_alloc.c | 32 +---
>  1 file changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a919ba5..4411d5a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4694,10 +4694,10 @@ unsigned long nr_free_pagecache_pages(void)
>   return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
>  }
>  
> -static inline void show_node(struct zone *zone)
> +static inline void show_node(struct zone *zone, struct printk_buffer *buf)
>  {
>   if (IS_ENABLED(CONFIG_NUMA))
> - printk("Node %d ", zone_to_nid(zone));
> + printk_buffered(buf, "Node %d ", zone_to_nid(zone));

The conversion looks fine to me. I just think about renaming
printk_buffered to bprintk().

Best Regards,
Petr

Re: [PATCH trivial] Documentation: ABI: led-trigger-pattern: Fix typos

2018-11-07 Thread Pavel Machek

On Wed 2018-11-07 14:45:24, Geert Uytterhoeven wrote:
>   - Spelling s/brigntess/brightness/,
>   - Double "use".
> 
> Signed-off-by: Geert Uytterhoeven 

Acked-by: Pavel Machek 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature

Re: [PATCH 4/6] fuse: Implement fuse_attr_version_inc()

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:43 AM, Kirill Tkhai  wrote:
> For cleanup purpose archive repeating pattern:
>
> spin_lock(>lock);
> fi->attr_version = ++fc->attr_version;
> spin_unlock(>lock);
>
> into separate function.
>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dir.c|8 ++--
>  fs/fuse/file.c   |   16 
>  fs/fuse/fuse_i.h |   11 +++
>  fs/fuse/inode.c  |4 +---
>  4 files changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 794b9f48fb8e..35f3b3d1e044 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -674,9 +674,7 @@ static int fuse_unlink(struct inode *dir, struct dentry 
> *entry)
> struct fuse_inode *fi = get_fuse_inode(inode);
>
> spin_lock(>lock);
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> /*
>  * If i_nlink == 0 then unlink doesn't make sense, yet this 
> can
>  * happen if userspace filesystem is careless.  It would be
> @@ -830,9 +828,7 @@ static int fuse_link(struct dentry *entry, struct inode 
> *newdir,
> struct fuse_inode *fi = get_fuse_inode(inode);
>
> spin_lock(>lock);
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> inc_nlink(inode);
> spin_unlock(>lock);
> fuse_invalidate_attr(inode);
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 48e2889ba6a6..1164d3580c62 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -186,9 +186,7 @@ void fuse_finish_open(struct inode *inode, struct file 
> *file)
> struct fuse_inode *fi = get_fuse_inode(inode);
>
> spin_lock(>lock);
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> i_size_write(inode, 0);
> spin_unlock(>lock);
> fuse_invalidate_attr(inode);
> @@ -604,9 +602,7 @@ static void fuse_aio_complete(struct fuse_io_priv *io, 
> int err, ssize_t pos)
> struct fuse_inode *fi = get_fuse_inode(inode);
>
> spin_lock(>lock);
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> spin_unlock(>lock);
> }
>
> @@ -685,9 +681,7 @@ static void fuse_read_update_size(struct inode *inode, 
> loff_t size,
> spin_lock(>lock);
> if (attr_ver == fi->attr_version && size < inode->i_size &&
> !test_bit(FUSE_I_SIZE_UNSTABLE, >state)) {
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> i_size_write(inode, size);
> }
> spin_unlock(>lock);
> @@ -1006,9 +1000,7 @@ bool fuse_write_update_size(struct inode *inode, loff_t 
> pos)
> bool ret = false;
>
> spin_lock(>lock);
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +   fi->attr_version = fuse_attr_version_inc_return(fc);
> if (pos > inode->i_size) {
> i_size_write(inode, pos);
> ret = true;
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 075da40499b8..f7442bcecbb0 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -775,6 +775,17 @@ static inline int invalid_nodeid(u64 nodeid)
> return !nodeid || nodeid == FUSE_ROOT_ID;
>  }
>
> +static inline u64 fuse_attr_version_inc_return(struct fuse_conn *fc)
> +{
> +   u64 attr_version;
> +
> +   spin_lock(>lock);
> +   attr_version = ++fc->attr_version;
> +   spin_unlock(>lock);
> +
> +   return attr_version;
> +}

atomic64_t?

And move this patch one up, 3/6 is complicated as is without having to
add extra locking around fc->attr_version.



> +
>  /** Device operations */
>  extern const struct file_operations fuse_dev_operations;
>
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index f62d45686b42..5f488b019cd9 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -167,9 +167,7 @@ void fuse_change_attributes_common(struct inode *inode, 
> struct fuse_attr *attr,
>
> lockdep_assert_held(>lock);
>
> -   spin_lock(>lock);
> -   fi->attr_version = ++fc->attr_version;
> -   spin_unlock(>lock);
> +

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Peter Zijlstra

On Wed, Nov 07, 2018 at 01:57:38PM +, Patrick Bellasi wrote:
> On 07-Nov 14:16, Peter Zijlstra wrote:

> > Please write cmpxchg loops in the form:
> > 
> > atomic_long_t *ptr = _maps[clamp_id][group_id].adata;
> > union uclamp_map old, new;
> > 
> > old.data = atomic_long_read(ptr);
> > do {
> > new.data = old.data;
> > new.se_cound--;
> > } while (!atomic_long_try_cmpxchg(ptr, , new.data));
> > 
> > 
> > (same for all the others of course)
> 
> Ok, I did that to save some indentation, but actually it's most
> commonly used in a while loop... will update in v6.
> 
> Out of curiosity, apart from code consistency, is that required also
> specifically for any possible compiler related (mis)behavior ?

No; it is just the 'normal' form my brain likes :-)

And the try_cmpxchg() thing is slightly more efficient on x86 vs the
traditional form:

while (cmpxchg(ptr, old, new) != old)

Re: [PATCH v2] slimbus: ngd: QCOM_QMI_HELPERS has to be selected

2018-11-07 Thread Srinivas Kandagatla





On 07/11/18 13:59, Greg KH wrote:

Really?  I do this and then I get this build error on x86:

drivers/slimbus/qcom-ngd-ctrl.c: In function ‘of_qcom_slim_ngd_register’:
drivers/slimbus/qcom-ngd-ctrl.c:1333:63: warning: dereferencing ‘void *’ pointer
   data = of_match_node(qcom_slim_ngd_dt_match, parent->of_node)->data;
^~
drivers/slimbus/qcom-ngd-ctrl.c:1333:63: error: request for member ‘data’ in 
something not a structure or union


So I can't take this, something else must be wrong here...

That is fine!
Yes, there seems to be one more issue here, on non DT platforms 
of_match_node is set to be NULL, which is why we are seeing this error I 
guess!


I will fix this up, do some test and send both the fixes together.

thanks,
srini

Re: [PATCH RFC 1/3] mmc: sdhci: add support for using external DMA devices

2018-11-07 Thread Adrian Hunter

On 5/11/18 5:16 AM, Chunyan Zhang wrote:
> Some standard SD host controller can support both external dma
> controllers as well as ADMA in which the controller acts as
> DMA master.
> 
> Currently the generic SDHCI code supports ADMA/SDMA integrated into
> the host controller but does not have any support for external DMA
> controllers implemented using dmaengine meaning that custom code is
> needed for any systems that use a generic DMA controller with SDHCI.

I have no object to supporting external DMA, but there are a few comments below.

> 
> Signed-off-by: Chunyan Zhang 
> ---
>  drivers/mmc/host/Kconfig |  13 +
>  drivers/mmc/host/sdhci.c | 137 
> ++-
>  drivers/mmc/host/sdhci.h |  12 +
>  3 files changed, 161 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> index 1b58739..c4745d8 100644
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -977,3 +977,16 @@ config MMC_SDHCI_OMAP
> If you have a controller with this interface, say Y or M here.
>  
> If unsure, say N.
> +
> +config MMC_SDHCI_EXTDMA
> +bool "Support external DMA in standard SD host controller"
> + depends on MMC_SDHCI
> + depends on DMA_ENGINE
> + help
> +   This is an option for using external DMA device via dmaengine
> +   framework.
> +
> +   If you have a controller which supports using external DMA device
> +   for data transfer, can say Y.
> +
> +   If unsure, say N.
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index 99bdae5..ffb1d2b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -14,6 +14,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1309,6 +1310,128 @@ static void sdhci_del_timer(struct sdhci_host *host, 
> struct mmc_request *mrq)
>   del_timer(>timer);
>  }
>  
> +#if IS_ENABLED(CONFIG_MMC_SDHCI_EXTDMA)
> +static int sdhci_extdma_init_chan(struct sdhci_host *host)

I would prefer "external_dma" to extdma e.g.
CONFIG_MMC_SDHCI_EXTERNAL_DMA
sdhci_external_dma_init
sdhci_external_dma_channel
sdhci_external_dma_setup
etc


> +{
> + int ret = 0;
> + struct mmc_host *mmc = host->mmc;
> + struct sdhci_extdma *dma = >extdma;
> +
> + dma->tx_chan = dma_request_chan(mmc->parent, "tx");
> + if (IS_ERR(dma->tx_chan)) {
> + ret = PTR_ERR(dma->tx_chan);
> + dma->tx_chan = NULL;
> + pr_warn("Failed to request TX DMA channel.\n");
> + return ret;
> + }
> +
> + dma->rx_chan = dma_request_chan(mmc->parent, "rx");
> + if (IS_ERR(dma->rx_chan)) {
> + ret = PTR_ERR(dma->rx_chan);
> + if (ret == -EPROBE_DEFER && dma->tx_chan)
> + dma_release_channel(dma->tx_chan);
> +
> + dma->rx_chan = NULL;
> + pr_warn("Failed to request RX DMA channel.\n");
> + }

I guess the channels need to be released in sdhci_cleanup_host() and
sdhci_remove_host()

> +
> + return ret;
> +}
> +
> +static inline struct dma_chan *
> +sdhci_extdma_get_chan(struct sdhci_extdma *dma, struct mmc_data *data)
> +{
> + return data->flags & MMC_DATA_WRITE ? dma->tx_chan : dma->rx_chan;
> +}
> +
> +static int sdhci_extdma_setup(struct sdhci_host *host, struct mmc_command 
> *cmd)
> +{
> + int ret = 0, i;
> + struct dma_async_tx_descriptor *desc;
> + struct mmc_data *data = cmd->data;
> + struct dma_chan *chan;
> + struct dma_slave_config cfg;
> +
> + if (!host->mapbase)
> + return -EINVAL;
> +
> + cfg.src_addr = host->mapbase + SDHCI_BUFFER;
> + cfg.dst_addr = host->mapbase + SDHCI_BUFFER;
> + cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
> + cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
> + cfg.src_maxburst = data->blksz / 4;
> + cfg.dst_maxburst = data->blksz / 4;
> +
> + /* Sanity check: all the SG entries must be aligned by block size. */
> + for (i = 0; i < data->sg_len; i++) {
> + if ((data->sg + i)->length % data->blksz)
> + return -EINVAL;
> + }
> +
> + chan = sdhci_extdma_get_chan(>extdma, data);
> +
> + ret = dmaengine_slave_config(chan, );
> + if (ret)
> + return ret;
> +
> + desc = dmaengine_prep_slave_sg(chan, data->sg, data->sg_len,
> +mmc_get_dma_dir(data),
> +DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> + if (!desc)
> + return -EINVAL;
> +
> + desc->callback = NULL;
> + desc->callback_param = NULL;
> +
> + dmaengine_submit(desc);

Doesn't the DMA need to be cleaned up somewhere if there are transfer errors?

> +
> + return 0;
> +}
> +
> +static void sdhci_extdma_prepare_data(struct sdhci_host *host,
> +   struct mmc_command *cmd)
>

Re: [PATCH trivial] microblaze: Typo s/use use/use/

2018-11-07 Thread Michal Simek

On 07. 11. 18 14:47, Geert Uytterhoeven wrote:
> Signed-off-by: Geert Uytterhoeven 
> ---
>  arch/microblaze/include/asm/pgtable.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/microblaze/include/asm/pgtable.h 
> b/arch/microblaze/include/asm/pgtable.h
> index f64ebb9c9a413535..bdfb2b3182b04c3b 100644
> --- a/arch/microblaze/include/asm/pgtable.h
> +++ b/arch/microblaze/include/asm/pgtable.h
> @@ -200,7 +200,7 @@ static inline pte_t pte_mkspecial(pte_t pte)  { 
> return pte; }
>   * is cleared in the TLB miss handler before the TLB entry is loaded.
>   * - All other bits of the PTE are loaded into TLBLO without
>   *  * modification, leaving us only the bits 20, 21, 24, 25, 26, 30 for
> - * software PTE bits.  We actually use use bits 21, 24, 25, and
> + * software PTE bits.  We actually use bits 21, 24, 25, and
>   * 30 respectively for the software bits: ACCESSED, DIRTY, RW, and
>   * PRESENT.
>   */
> 

Applied.
Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP SoCs




signature.asc
Description: OpenPGP digital signature

Re: Target bindeb-pkg no longer install DTBs

2018-11-07 Thread Rob Herring

On Wed, Nov 7, 2018 at 12:10 AM Nuno Gonçalves  wrote:
>
> Hi,
>
> Since 37c8a5fafa3bb7dcdd51774be353be6cb2912b86 [kbuild: consolidate
> Devicetree dtb build rules], the target bindeb-pkg no longer installs
> DTBs in the .deb package.

I see the problem and am testing a fix. Will post it soon.

Rob

Re: [PATCH 6/6] device property: Remove struct property_set

2018-11-07 Thread Heikki Krogerus

On Tue, Nov 06, 2018 at 04:46:41PM +0200, Andy Shevchenko wrote:
> On Mon, Nov 05, 2018 at 05:59:28PM +0300, Heikki Krogerus wrote:
> > Replacing struct property_set with the software nodes that
> > were just introduced.
> > 
> > The API and functionality for adding properties to devices
> > remains the same, however, the goal is to convert the
> > drivers to use the API for software nodes when the device
> > has no real firmware node, and use the old API only when
> > "extra" build-in properties are needed.
> 
> It might be slightly easier to review if you preserve the ordering of
> functions, i.e. device_add_properties() and device_remove_properties().

True, that change does not belong to this series.

thanks,

-- 
heikki

Re: [PATCH 3/4] of/property: Introduce of_fwnode_name()

2018-11-07 Thread Heikki Krogerus

On Tue, Nov 06, 2018 at 12:13:30PM -0600, Rob Herring wrote:
> On Tue, Nov 6, 2018 at 9:53 AM Andy Shevchenko
>  wrote:
> >
> > On Tue, Nov 06, 2018 at 05:05:03PM +0200, Heikki Krogerus wrote:
> >
> > > Maybe it would be best to just read the "name" device property in
> > > fwnode_name() and not have of_fwnode_name at all.
> >
> > If it's a mandatory property or somehow its presence is guaranteed, it 
> > would work.
> 
> It is currently, but after removing the name ptr, my current plan is
> to remove the 'name' property too for FDT. On real OpenFirmware, it is
> a real property so it will remain for sure in some cases.

OK. I think we'll use the full_name after all.

thanks,

-- 
heikki

[PATCH v12 5/5] clk: imx: Add clock driver for i.MX8MQ CCM

2018-11-07 Thread Abel Vesa

Add driver for the Clock Control Module found on i.MX8MQ.

Signed-off-by: Anson Huang 
Signed-off-by: Bai Ping 
Signed-off-by: Lucas Stach 
Signed-off-by: Abel Vesa 
Reviewed-by: Sascha Hauer 
---
 drivers/clk/imx/Makefile |   1 +
 drivers/clk/imx/clk-imx8mq.c | 589 +++
 drivers/clk/imx/clk.h|  36 +++
 3 files changed, 626 insertions(+)
 create mode 100644 drivers/clk/imx/clk-imx8mq.c

diff --git a/drivers/clk/imx/Makefile b/drivers/clk/imx/Makefile
index 237444b..6952f05 100644
--- a/drivers/clk/imx/Makefile
+++ b/drivers/clk/imx/Makefile
@@ -29,4 +29,5 @@ obj-$(CONFIG_SOC_IMX6SLL) += clk-imx6sll.o
 obj-$(CONFIG_SOC_IMX6SX) += clk-imx6sx.o
 obj-$(CONFIG_SOC_IMX6UL) += clk-imx6ul.o
 obj-$(CONFIG_SOC_IMX7D)  += clk-imx7d.o
+obj-$(CONFIG_SOC_IMX8MQ) += clk-imx8mq.o
 obj-$(CONFIG_SOC_VF610)  += clk-vf610.o
diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
new file mode 100644
index 000..26b57f4
--- /dev/null
+++ b/drivers/clk/imx/clk-imx8mq.c
@@ -0,0 +1,589 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2018 NXP.
+ * Copyright (C) 2017 Pengutronix, Lucas Stach 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "clk.h"
+
+static u32 share_count_sai1;
+static u32 share_count_sai2;
+static u32 share_count_sai3;
+static u32 share_count_sai4;
+static u32 share_count_sai5;
+static u32 share_count_sai6;
+static u32 share_count_dcss;
+static u32 share_count_nand;
+
+static struct clk *clks[IMX8MQ_CLK_END];
+
+static const char *pll_ref_sels[] = { "osc_25m", "osc_27m", "dummy", "dummy", 
};
+static const char *arm_pll_bypass_sels[] = {"arm_pll", "arm_pll_ref_sel", };
+static const char *gpu_pll_bypass_sels[] = {"gpu_pll", "gpu_pll_ref_sel", };
+static const char *vpu_pll_bypass_sels[] = {"vpu_pll", "vpu_pll_ref_sel", };
+static const char *audio_pll1_bypass_sels[] = {"audio_pll1", 
"audio_pll1_ref_sel", };
+static const char *audio_pll2_bypass_sels[] = {"audio_pll2", 
"audio_pll2_ref_sel", };
+static const char *video_pll1_bypass_sels[] = {"video_pll1", 
"video_pll1_ref_sel", };
+
+static const char *sys1_pll1_out_sels[] = {"sys1_pll1", "sys1_pll1_ref_sel", };
+static const char *sys2_pll1_out_sels[] = {"sys2_pll1", "sys1_pll1_ref_sel", };
+static const char *sys3_pll1_out_sels[] = {"sys3_pll1", "sys3_pll1_ref_sel", };
+static const char *dram_pll1_out_sels[] = {"dram_pll1", "dram_pll1_ref_sel", };
+
+static const char *sys1_pll2_out_sels[] = {"sys1_pll2_div", 
"sys1_pll1_ref_sel", };
+static const char *sys2_pll2_out_sels[] = {"sys2_pll2_div", 
"sys2_pll1_ref_sel", };
+static const char *sys3_pll2_out_sels[] = {"sys3_pll2_div", 
"sys2_pll1_ref_sel", };
+static const char *dram_pll2_out_sels[] = {"dram_pll2_div", 
"dram_pll1_ref_sel", };
+
+/* CCM ROOT */
+static const char *imx8mq_a53_sels[] = {"osc_25m", "arm_pll_out", 
"sys2_pll_500m", "sys2_pll_1000m",
+   "sys1_pll_800m", "sys1_pll_400m", 
"audio_pll1_out", "sys3_pll2_out", };
+
+static const char *imx8mq_vpu_sels[] = {"osc_25m", "arm_pll_out", 
"sys2_pll_500m", "sys2_pll_1000m",
+   "sys1_pll_800m", "sys1_pll_400m", 
"audio_pll1_out", "vpu_pll_out", };
+
+static const char *imx8mq_gpu_core_sels[] = {"osc_25m", "gpu_pll_out", 
"sys1_pll_800m", "sys3_pll2_out",
+"sys2_pll_1000m", 
"audio_pll1_out", "video_pll1_out", "audio_pll2_out", };
+
+static const char *imx8mq_gpu_shader_sels[] = {"osc_25m", "gpu_pll_out", 
"sys1_pll_800m", "sys3_pll2_out",
+  "sys2_pll_1000m", 
"audio_pll1_out", "video_pll1_out", "audio_pll2_out", };
+
+static const char *imx8mq_main_axi_sels[] = {"osc_25m", "sys2_pll_333m", 
"sys1_pll_800m", "sys2_pll_250m",
+"sys2_pll_1000m", 
"audio_pll1_out", "video_pll1_out", "sys1_pll_100m",};
+
+static const char *imx8mq_enet_axi_sels[] = {"osc_25m", "sys1_pll_266m", 
"sys1_pll_800m", "sys2_pll_250m",
+"sys2_pll_200m", "audio_pll1_out", 
"video_pll1_out", "sys3_pll2_out", };
+
+static const char *imx8mq_nand_usdhc_sels[] = {"osc_25m", "sys1_pll_266m", 
"sys1_pll_800m", "sys2_pll_200m",
+  "sys1_pll_133m", 
"sys3_pll2_out", "sys2_pll_250m", "audio_pll1_out", };
+
+static const char *imx8mq_vpu_bus_sels[] = {"osc_25m", "sys1_pll_800m", 
"vpu_pll_out", "audio_pll2_out", "sys3_pll2_out", "sys2_pll_1000m", 
"sys2_pll_200m", "sys1_pll_100m", };
+
+static const char *imx8mq_disp_axi_sels[] = {"osc_25m", "sys2_pll_125m", 
"sys1_pll_800m", "sys3_pll2_out", "sys1_pll_400m", "audio_pll2_out", 
"clk_ext1", "clk_ext4", };
+
+static const char *imx8mq_disp_apb_sels[] = {"osc_25m", "sys2_pll_125m", 
"sys1_pll_800m", "sys3_pll2_out",
+"sys1_pll_40m", "audio_pll2_out", 
"clk_ext1", "clk_ext3",

[PATCH v12 1/5] dt-bindings: add binding for i.MX8MQ CCM

2018-11-07 Thread Abel Vesa

From: Lucas Stach 

This adds the binding for the i.MX8MQ Clock Controller Module.

Signed-off-by: Lucas Stach 
Signed-off-by: Abel Vesa 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/clock/imx8mq-clock.txt |  20 ++
 include/dt-bindings/clock/imx8mq-clock.h   | 395 +
 2 files changed, 415 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/clock/imx8mq-clock.txt
 create mode 100644 include/dt-bindings/clock/imx8mq-clock.h

diff --git a/Documentation/devicetree/bindings/clock/imx8mq-clock.txt 
b/Documentation/devicetree/bindings/clock/imx8mq-clock.txt
new file mode 100644
index 000..52de826
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx8mq-clock.txt
@@ -0,0 +1,20 @@
+* Clock bindings for NXP i.MX8M Quad
+
+Required properties:
+- compatible: Should be "fsl,imx8mq-ccm"
+- reg: Address and length of the register set
+- #clock-cells: Should be <1>
+- clocks: list of clock specifiers, must contain an entry for each required
+  entry in clock-names
+- clock-names: should include the following entries:
+- "ckil"
+- "osc_25m"
+- "osc_27m"
+- "clk_ext1"
+- "clk_ext2"
+- "clk_ext3"
+- "clk_ext4"
+
+The clock consumer should specify the desired clock by having the clock
+ID in its "clocks" phandle cell.  See include/dt-bindings/clock/imx8mq-clock.h
+for the full list of i.MX8M Quad clock IDs.
diff --git a/include/dt-bindings/clock/imx8mq-clock.h 
b/include/dt-bindings/clock/imx8mq-clock.h
new file mode 100644
index 000..b53be41
--- /dev/null
+++ b/include/dt-bindings/clock/imx8mq-clock.h
@@ -0,0 +1,395 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2016 Freescale Semiconductor, Inc.
+ * Copyright 2017 NXP
+ */
+
+#ifndef __DT_BINDINGS_CLOCK_IMX8MQ_H
+#define __DT_BINDINGS_CLOCK_IMX8MQ_H
+
+#define IMX8MQ_CLK_DUMMY   0
+#define IMX8MQ_CLK_32K 1
+#define IMX8MQ_CLK_25M 2
+#define IMX8MQ_CLK_27M 3
+#define IMX8MQ_CLK_EXT14
+#define IMX8MQ_CLK_EXT25
+#define IMX8MQ_CLK_EXT36
+#define IMX8MQ_CLK_EXT47
+
+/* ANAMIX PLL clocks */
+/* FRAC PLLs */
+/* ARM PLL */
+#define IMX8MQ_ARM_PLL_REF_SEL 8
+#define IMX8MQ_ARM_PLL_REF_DIV 9
+#define IMX8MQ_ARM_PLL 10
+#define IMX8MQ_ARM_PLL_BYPASS  11
+#define IMX8MQ_ARM_PLL_OUT 12
+
+/* GPU PLL */
+#define IMX8MQ_GPU_PLL_REF_SEL 13
+#define IMX8MQ_GPU_PLL_REF_DIV 14
+#define IMX8MQ_GPU_PLL 15
+#define IMX8MQ_GPU_PLL_BYPASS  16
+#define IMX8MQ_GPU_PLL_OUT 17
+
+/* VPU PLL */
+#define IMX8MQ_VPU_PLL_REF_SEL 18
+#define IMX8MQ_VPU_PLL_REF_DIV 19
+#define IMX8MQ_VPU_PLL 20
+#define IMX8MQ_VPU_PLL_BYPASS  21
+#define IMX8MQ_VPU_PLL_OUT 22
+
+/* AUDIO PLL1 */
+#define IMX8MQ_AUDIO_PLL1_REF_SEL  23
+#define IMX8MQ_AUDIO_PLL1_REF_DIV  24
+#define IMX8MQ_AUDIO_PLL1  25
+#define IMX8MQ_AUDIO_PLL1_BYPASS   26
+#define IMX8MQ_AUDIO_PLL1_OUT  27
+
+/* AUDIO PLL2 */
+#define IMX8MQ_AUDIO_PLL2_REF_SEL  28
+#define IMX8MQ_AUDIO_PLL2_REF_DIV  29
+#define IMX8MQ_AUDIO_PLL2  30
+#define IMX8MQ_AUDIO_PLL2_BYPASS   31
+#define IMX8MQ_AUDIO_PLL2_OUT  32
+
+/* VIDEO PLL1 */
+#define IMX8MQ_VIDEO_PLL1_REF_SEL  33
+#define IMX8MQ_VIDEO_PLL1_REF_DIV  34
+#define IMX8MQ_VIDEO_PLL1  35
+#define IMX8MQ_VIDEO_PLL1_BYPASS   36
+#define IMX8MQ_VIDEO_PLL1_OUT  37
+
+/* SYS1 PLL */
+#define IMX8MQ_SYS1_PLL1_REF_SEL   38
+#define IMX8MQ_SYS1_PLL1_REF_DIV   39
+#define IMX8MQ_SYS1_PLL1   40
+#define IMX8MQ_SYS1_PLL1_OUT   41
+#define IMX8MQ_SYS1_PLL1_OUT_DIV   42
+#define IMX8MQ_SYS1_PLL2   43
+#define IMX8MQ_SYS1_PLL2_DIV   44
+#define IMX8MQ_SYS1_PLL2_OUT   45
+
+/* SYS2 PLL */
+#define IMX8MQ_SYS2_PLL1_REF_SEL   46
+#define IMX8MQ_SYS2_PLL1_REF_DIV   47
+#define IMX8MQ_SYS2_PLL1   48
+#define IMX8MQ_SYS2_PLL1_OUT   49
+#define IMX8MQ_SYS2_PLL1_OUT_DIV   50
+#define IMX8MQ_SYS2_PLL2   51
+#define IMX8MQ_SYS2_PLL2_DIV   52
+#define IMX8MQ_SYS2_PLL2_OUT   53
+
+/* SYS3 PLL */
+#define IMX8MQ_SYS3_PLL1_REF_SEL   54
+#define IMX8MQ_SYS3_PLL1_REF_DIV   55
+#define IMX8MQ_SYS3_PLL1   56
+#define IMX8MQ_SYS3_PLL1_OUT   57
+#define IMX8MQ_SYS3_PLL1_OUT_DIV   58
+#define IMX8MQ_SYS3_PLL2   59
+#define IMX8MQ_SYS3_PLL2_DIV   60
+#define IMX8MQ_SYS3_PLL2_OUT   61
+
+/* DRAM PLL */
+#define IMX8MQ_DRAM_PLL1_REF_SEL   62
+#define IMX8MQ_DRAM_PLL1_REF_DIV   63
+#define IMX8MQ_DRAM_PLL1   64
+#define IMX8MQ_DRAM_PLL1_OUT   65
+#define IMX8MQ_DRAM_PLL1_OUT_DIV   66
+#define IMX8MQ_DRAM_PLL2

[PATCH v12 0/5] Add i.MX8MQ clock driver

2018-11-07 Thread Abel Vesa

So this is all cleaned up now. The switch from clk to clk_hw registration
is done only for the newly added clock types because changing the older
ones will imply a bigger change. I will spend some time on that, but this
can't be delayed by that since this is needed in order to boot up.

Here is a link to the 11th version:
https://lkml.org/lkml/2018/10/10/292

Changes since v11:
 * changed the authorship of the CCM driver
 since it has changed drastically from the original version.
 * changed the CCM driver to platform driver
 * changed all new clock types to clk_hw based registration
 * fixed the all other comments from Stephen

Abel Vesa (2):
  clk: imx: Add imx composite clock
  clk: imx: Add clock driver for i.MX8MQ CCM

Lucas Stach (3):
  dt-bindings: add binding for i.MX8MQ CCM
  clk: imx: add fractional PLL output clock
  clk: imx: Add SCCG PLL type

 .../devicetree/bindings/clock/imx8mq-clock.txt |  20 +
 drivers/clk/imx/Makefile   |   6 +-
 drivers/clk/imx/clk-composite-8m.c | 178 +++
 drivers/clk/imx/clk-frac-pll.c | 221 
 drivers/clk/imx/clk-imx8mq.c   | 589 +
 drivers/clk/imx/clk-sccg-pll.c | 256 +
 drivers/clk/imx/clk.h  |  64 +++
 include/dt-bindings/clock/imx8mq-clock.h   | 395 ++
 8 files changed, 1728 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/clock/imx8mq-clock.txt
 create mode 100644 drivers/clk/imx/clk-composite-8m.c
 create mode 100644 drivers/clk/imx/clk-frac-pll.c
 create mode 100644 drivers/clk/imx/clk-imx8mq.c
 create mode 100644 drivers/clk/imx/clk-sccg-pll.c
 create mode 100644 include/dt-bindings/clock/imx8mq-clock.h

-- 
2.7.4

[PATCH v12 3/5] clk: imx: Add SCCG PLL type

2018-11-07 Thread Abel Vesa

From: Lucas Stach 

The SCCG is a new PLL type introduced on i.MX8.

The description of this SCCG clock can be found here:

https://www.nxp.com/docs/en/reference-manual/IMX8MDQLQRM.pdf#page=834

Signed-off-by: Lucas Stach 
Signed-off-by: Abel Vesa 
Reviewed-by: Sascha Hauer 
---
 drivers/clk/imx/Makefile   |   3 +-
 drivers/clk/imx/clk-sccg-pll.c | 256 +
 drivers/clk/imx/clk.h  |   9 ++
 3 files changed, 267 insertions(+), 1 deletion(-)
 create mode 100644 drivers/clk/imx/clk-sccg-pll.c

diff --git a/drivers/clk/imx/Makefile b/drivers/clk/imx/Makefile
index 4893c1f..b87513c 100644
--- a/drivers/clk/imx/Makefile
+++ b/drivers/clk/imx/Makefile
@@ -12,7 +12,8 @@ obj-y += \
clk-pllv1.o \
clk-pllv2.o \
clk-pllv3.o \
-   clk-pfd.o
+   clk-pfd.o \
+   clk-sccg-pll.o
 
 obj-$(CONFIG_SOC_IMX1)   += clk-imx1.o
 obj-$(CONFIG_SOC_IMX21)  += clk-imx21.o
diff --git a/drivers/clk/imx/clk-sccg-pll.c b/drivers/clk/imx/clk-sccg-pll.c
new file mode 100644
index 000..4666b96
--- /dev/null
+++ b/drivers/clk/imx/clk-sccg-pll.c
@@ -0,0 +1,256 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/*
+ * Copyright 2018 NXP.
+ *
+ * This driver supports the SCCG plls found in the imx8m SOCs
+ *
+ * Documentation for this SCCG pll can be found at:
+ *   https://www.nxp.com/docs/en/reference-manual/IMX8MDQLQRM.pdf#page=834
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "clk.h"
+
+/* PLL CFGs */
+#define PLL_CFG0   0x0
+#define PLL_CFG1   0x4
+#define PLL_CFG2   0x8
+
+#define PLL_DIVF1_MASK GENMASK(18, 13)
+#define PLL_DIVF2_MASK GENMASK(12, 7)
+#define PLL_DIVR1_MASK GENMASK(27, 25)
+#define PLL_DIVR2_MASK GENMASK(24, 19)
+#define PLL_REF_MASK   GENMASK(2, 0)
+
+#define PLL_LOCK_MASK  BIT(31)
+#define PLL_PD_MASKBIT(7)
+
+#define OSC_25M2500
+#define OSC_27M2700
+
+#define PLL_SCCG_LOCK_TIMEOUT  70
+
+struct clk_sccg_pll {
+   struct clk_hw   hw;
+   void __iomem*base;
+};
+
+#define to_clk_sccg_pll(_hw) container_of(_hw, struct clk_sccg_pll, hw)
+
+static int clk_pll_wait_lock(struct clk_sccg_pll *pll)
+{
+   u32 val;
+
+   return readl_poll_timeout(pll->base, val, val & PLL_LOCK_MASK, 0,
+   PLL_SCCG_LOCK_TIMEOUT);
+}
+
+static int clk_pll1_is_prepared(struct clk_hw *hw)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   return (val & PLL_PD_MASK) ? 0 : 1;
+}
+
+static unsigned long clk_pll1_recalc_rate(struct clk_hw *hw,
+unsigned long parent_rate)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val, divf;
+
+   val = readl_relaxed(pll->base + PLL_CFG2);
+   divf = FIELD_GET(PLL_DIVF1_MASK, val);
+
+   return parent_rate * 2 * (divf + 1);
+}
+
+static long clk_pll1_round_rate(struct clk_hw *hw, unsigned long rate,
+  unsigned long *prate)
+{
+   unsigned long parent_rate = *prate;
+   u32 div;
+
+   if (!parent_rate)
+   return 0;
+
+   div = rate / (parent_rate * 2);
+
+   return parent_rate * div * 2;
+}
+
+static int clk_pll1_set_rate(struct clk_hw *hw, unsigned long rate,
+   unsigned long parent_rate)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val;
+   u32 divf;
+
+   if (!parent_rate)
+   return -EINVAL;
+
+   divf = rate / (parent_rate * 2);
+
+   val = readl_relaxed(pll->base + PLL_CFG2);
+   val &= ~PLL_DIVF1_MASK;
+   val |= FIELD_PREP(PLL_DIVF1_MASK, divf - 1);
+   writel_relaxed(val, pll->base + PLL_CFG2);
+
+   return clk_pll_wait_lock(pll);
+}
+
+static int clk_pll1_prepare(struct clk_hw *hw)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   val &= ~PLL_PD_MASK;
+   writel_relaxed(val, pll->base + PLL_CFG0);
+
+   return clk_pll_wait_lock(pll);
+}
+
+static void clk_pll1_unprepare(struct clk_hw *hw)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   val |= PLL_PD_MASK;
+   writel_relaxed(val, pll->base + PLL_CFG0);
+
+}
+
+static unsigned long clk_pll2_recalc_rate(struct clk_hw *hw,
+unsigned long parent_rate)
+{
+   struct clk_sccg_pll *pll = to_clk_sccg_pll(hw);
+   u32 val, ref, divr1, divf1, divr2, divf2;
+   u64 temp64;
+
+   val = readl_relaxed(pll->base + PLL_CFG0);
+   switch (FIELD_GET(PLL_REF_MASK, val)) {
+   case 0:
+   ref = OSC_25M;
+   break;
+   case 1:
+   ref = OSC_27M;
+

Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page

2018-11-07 Thread Liran Alon




> On 7 Nov 2018, at 14:10, Alexander Potapenko  wrote:
> 
> On Wed, Nov 7, 2018 at 2:38 AM syzbot
>  wrote:
>> 
>> Hello,
>> 
>> syzbot found the following crash on:
>> 
>> HEAD commit:88b95ef4c780 kmsan: use MSan assembly instrumentation
>> git tree:   
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_google_kmsan.git_master=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=A5G9_UvV5TpBoJitbGWS2VXmfVMXFUgggq64QM-6nqc=
>> console output: 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_log.txt-3Fx-3D12505e3340=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=ZGVw04dRMjYdKZf4amN1yl3IheljZZq6V9h3mO9U-vM=
>> kernel config:  
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_.config-3Fx-3D8df5fc509a1b351b=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=OUIhmSMzBSMhswtebZqyLnc6SkAagVPBGr0EmCLL2Fg=
>> dashboard link: 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_bug-3Fextid-3Dded1696f6b50b615b630=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=jeCou6OWbHHIf190FBGsLr1wGsDvBWlpKnfNMmqGIqI=
>> compiler:   clang version 8.0.0 (trunk 343298)
>> syz repro:  
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.syz-3Fx-3D15ce62f540=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=PVi1m-uNP3XRO4XbDZJicGiqXdVmOPFDxCP20jmXuAs=
>> C reproducer:   
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__syzkaller.appspot.com_x_repro.c-3Fx-3D174efca340=DwIFaQ=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0=NpjK8vgD4UmRGNWTNe6YFA-AJTZp9VlD0oMFvKFV25I=XzvJtd3__2LEBvhAi4F6RTLbxV9mELkY07bMDSDLRMc=
>> 
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+ded1696f6b50b615b...@syzkaller.appspotmail.com
>> 
>> ==
>> BUG: KMSAN: kernel-infoleak in __copy_to_user include/linux/uaccess.h:121
>> [inline]
>> BUG: KMSAN: kernel-infoleak in __kvm_write_guest_page
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline]
>> BUG: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page+0x39a/0x510
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
>> CPU: 0 PID: 7918 Comm: syz-executor542 Not tainted 4.19.0+ #77
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x32d/0x480 lib/dump_stack.c:113
>>  kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:911
>>  kmsan_internal_check_memory+0x34c/0x430 mm/kmsan/kmsan.c:991
>>  kmsan_copy_to_user+0x85/0xe0 mm/kmsan/kmsan_hooks.c:552
>>  __copy_to_user include/linux/uaccess.h:121 [inline]
>>  __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849
>> [inline]
>>  kvm_vcpu_write_guest_page+0x39a/0x510
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
>>  nested_release_vmcs12 arch/x86/kvm/vmx.c:8441 [inline]
>>  handle_vmptrld+0x2384/0x26b0 arch/x86/kvm/vmx.c:8907
>>  vmx_handle_exit+0x1e81/0xbac0 arch/x86/kvm/vmx.c:10128
>>  vcpu_enter_guest arch/x86/kvm/x86.c:7667 [inline]
>>  vcpu_run arch/x86/kvm/x86.c:7730 [inline]
>>  kvm_arch_vcpu_ioctl_run+0xac32/0x11d80 arch/x86/kvm/x86.c:7930
>>  kvm_vcpu_ioctl+0xfb1/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2590
>>  do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46
>>  ksys_ioctl fs/ioctl.c:702 [inline]
>>  __do_sys_ioctl fs/ioctl.c:709 [inline]
>>  __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707
>>  __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707
>>  do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
>>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
>> RIP: 0033:0x44b6e9
>> Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
>> ff 0f 83 2b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00
>> RSP: 002b:7f096b292ce8 EFLAGS: 0206 ORIG_RAX: 0010
>> RAX: ffda RBX: 006e3c48 RCX: 0044b6e9
>> RDX:  RSI: ae80 RDI: 0005
>> RBP: 006e3c40 R08:  R09: 
>> R10:  R11: 0206 R12: 006e3c4c
>> R13: 7ffd978aeb2f R14: 7f096b2939c0 R15: 006e3d4c
>> 
>> Uninit was created at:
>>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:252 [inline]
>>  kmsan_internal_poison_shadow+0xc8/0x1e0 mm/kmsan/kmsan.c:177
>>  kmsan_kmalloc+0x98/0x110

Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages

2018-11-07 Thread Balbir Singh

On Wed, Nov 07, 2018 at 08:35:48AM +0100, Michal Hocko wrote:
> On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> > > From: Michal Hocko 
> > > 
> > > Page state checks are racy. Under a heavy memory workload (e.g. stress
> > > -m 200 -t 2h) it is quite easy to hit a race window when the page is
> > > allocated but its state is not fully populated yet. A debugging patch to
> > > dump the struct page state shows
> > > : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> > > : [  476.582103] page:ea0437fb count:1 mapcount:1 
> > > mapping:880e05239841 index:0x7f26e5000 compound_mapcount: 1
> > > : [  476.592645] flags: 
> > > 0x5fc0090034(uptodate|lru|active|head|swapbacked)
> > > 
> > > Note that the state has been checked for both PageLRU and PageSwapBacked
> > > already. Closing this race completely would require some sort of retry
> > > logic. This can be tricky and error prone (think of potential endless
> > > or long taking loops).
> > > 
> > > Workaround this problem for movable zones at least. Such a zone should
> > > only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> > > has_unmovable_pages more robust") has told us that this is not strictly
> > > true though. Bootmem pages should be marked reserved though so we can
> > > move the original check after the PageReserved check. Pages from other
> > > zones are still prone to races but we even do not pretend that memory
> > > hotremove works for those so pre-mature failure doesn't hurt that much.
> > > 
> > > Reported-and-tested-by: Baoquan He 
> > > Acked-by: Baoquan He 
> > > Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> > > Signed-off-by: Michal Hocko 
> > > ---
> > >
> > > Hi,
> > > this has been reported [1] and we have tried multiple things to address
> > > the issue. The only reliable way was to reintroduce the movable zone
> > > check into has_unmovable_pages. This time it should be safe also for
> > > the bug originally fixed by 15c30bc09085.
> > > 
> > > [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
> > >  mm/page_alloc.c | 8 
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 863d46da6586..c6d900ee4982 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct 
> > > page *page, int count,
> > >   if (PageReserved(page))
> > >   goto unmovable;
> > >  
> > > + /*
> > > +  * If the zone is movable and we have ruled out all reserved
> > > +  * pages then it should be reasonably safe to assume the rest
> > > +  * is movable.
> > > +  */
> > > + if (zone_idx(zone) == ZONE_MOVABLE)
> > > + continue;
> > > +
> > >   /*
> > 
> > 
> > There is a WARN_ON() in case of failure at the end of the routine,
> > is that triggered when we hit the bug? If we're adding this patch,
> > the WARN_ON needs to go as well.
> 
> No the warning should stay in case we encounter reserved pages in zone
> movable.
>

Fair enough!
 
> > The check seems to be quite aggressive and in a loop that iterates
> > pages, but has nothing to do with the page, did you mean to make
> > the check
> > 
> > zone_idx(page_zone(page)) == ZONE_MOVABLE
> 
> Does it make any difference? Can we actually encounter a page from a
> different zone here?
> 

Just to avoid page state related issues, do we want to go ahead
with the migration if zone_idx(page_zone(page)) != ZONE_MOVABLE.

> > it also skips all checks for pinned pages and other checks
> 
> Yes, this is intentional and the comment tries to explain why. I wish we
> could be add a more specific checks for movable pages - e.g. detect long
> term pins that would prevent migration - but we do not have any facility
> for that. Please note that the worst case of a false positive is a
> repeated migration failure and user has a way to break out of migration
> by a signal.
>

Basically isolate_pages() will fail as opposed to hotplug failing upfront.
The basic assertion this patch makes is that all ZONE_MOVABLE pages that
are not reserved are hotpluggable.

Balbir Singh.

Re: [PATCH v1 1/2] bus: mc-bus: Add support for mapping shareable portals

2018-11-07 Thread Laurentiu Tudor

Hi Roy,
On 30.10.2018 22:30, Roy Pledge wrote:
> Starting with v5 of NXP QBMan devices the hardware supports using
> regular cacheable/shareable memory as the backing store for the
> portals.
> 
> This patch adds support for the new portal mode by switching to
> use the DPRC get object region v2 command which returns both
> a base address and offset for the portal memory. The new portal
> region is identified as shareable through the addition of a new
> flag.
> 
> Signed-off-by: Roy Pledge 
> ---
>   drivers/bus/fsl-mc/dprc.c   |  3 ++-
>   drivers/bus/fsl-mc/fsl-mc-bus.c | 14 --
>   drivers/bus/fsl-mc/fsl-mc-private.h | 17 ++---
>   3 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/bus/fsl-mc/dprc.c b/drivers/bus/fsl-mc/dprc.c
> index 1c3f621..bde856d 100644
> --- a/drivers/bus/fsl-mc/dprc.c
> +++ b/drivers/bus/fsl-mc/dprc.c
> @@ -461,8 +461,9 @@ int dprc_get_obj_region(struct fsl_mc_io *mc_io,
>   
>   /* retrieve response parameters */
>   rsp_params = (struct dprc_rsp_get_obj_region *)cmd.params;
> - region_desc->base_offset = le64_to_cpu(rsp_params->base_addr);
> + region_desc->base_offset = le64_to_cpu(rsp_params->base_offset);
>   region_desc->size = le32_to_cpu(rsp_params->size);
> + region_desc->base_address = le64_to_cpu(rsp_params->base_addr);
>   
>   return 0;
>   }
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index f0404c6..25ad422 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -487,10 +487,18 @@ static int fsl_mc_device_get_mmio_regions(struct 
> fsl_mc_device *mc_dev,
>   "dprc_get_obj_region() failed: %d\n", error);
>   goto error_cleanup_regions;
>   }
> -
> - error = translate_mc_addr(mc_dev, mc_region_type,
> + /* Older MC only returned region offset and no base address
> +  * If base address is in the region_desc use it otherwise
> +  * revert to old mechanism
> +  */
> + if (region_desc.base_address)
> + regions[i].start = region_desc.base_address +
> + region_desc.base_offset;
> + else
> + error = translate_mc_addr(mc_dev, mc_region_type,
> region_desc.base_offset,
> [i].start);
> +
>   if (error < 0) {
>   dev_err(parent_dev,
>   "Invalid MC offset: %#x (for %s.%d\'s region 
> %d)\n",
> @@ -504,6 +512,8 @@ static int fsl_mc_device_get_mmio_regions(struct 
> fsl_mc_device *mc_dev,
>   regions[i].flags = IORESOURCE_IO;
>   if (region_desc.flags & DPRC_REGION_CACHEABLE)
>   regions[i].flags |= IORESOURCE_CACHEABLE;
> + if (region_desc.flags & DPRC_REGION_SHAREABLE)
> + regions[i].flags |= IORESOURCE_MEM;
>   }
>   
>   mc_dev->regions = regions;
> diff --git a/drivers/bus/fsl-mc/fsl-mc-private.h 
> b/drivers/bus/fsl-mc/fsl-mc-private.h
> index ea11b4f..28e40d1 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-private.h
> +++ b/drivers/bus/fsl-mc/fsl-mc-private.h
> @@ -79,9 +79,11 @@ int dpmcp_reset(struct fsl_mc_io *mc_io,
>   
>   /* DPRC command versioning */
>   #define DPRC_CMD_BASE_VERSION   1
> +#define DPRC_CMD_2ND_VERSION 2
>   #define DPRC_CMD_ID_OFFSET  4
>   
>   #define DPRC_CMD(id)(((id) << DPRC_CMD_ID_OFFSET) | 
> DPRC_CMD_BASE_VERSION)
> +#define DPRC_CMD_V2(id)  (((id) << DPRC_CMD_ID_OFFSET) | 
> DPRC_CMD_2ND_VERSION)
>   
>   /* DPRC command IDs */
>   #define DPRC_CMDID_CLOSEDPRC_CMD(0x800)
> @@ -99,7 +101,7 @@ int dpmcp_reset(struct fsl_mc_io *mc_io,
>   #define DPRC_CMDID_GET_CONT_ID  DPRC_CMD(0x830)
>   #define DPRC_CMDID_GET_OBJ_COUNTDPRC_CMD(0x159)
>   #define DPRC_CMDID_GET_OBJ  DPRC_CMD(0x15A)
> -#define DPRC_CMDID_GET_OBJ_REG  DPRC_CMD(0x15E)
> +#define DPRC_CMDID_GET_OBJ_REG  DPRC_CMD_V2(0x15E)

I see you're bumping this command's version to v2. Will we still be 
compatible with older MC firmware versions?

---
Best Regards, Laurentiu

Re: [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai  wrote:
> We take global fiq->waitq.lock every time, when we are
> in this function, but interrupted requests are just small
> subset of all requests. This patch optimizes request_end()
> and makes it to take the lock when it's really needed.
>
> queue_interrupt() needs small change for that. After req
> is linked to interrupt list, we do smp_mb() and check
> for FR_FINISHED again. In case of FR_FINISHED bit has
> appeared, we remove req and leave the function:
>
> CPU 0CPU 1
> queue_interrupt()request_end()
>
>   spin_lock(>waitq.lock)
>
>
>   list_add_tail(>intr_entry, >interrupts)
> test_and_set_bit(FR_FINISHED, >flags)
>   smp_mb()  implied test_and_set_bit()>
>   if (test_bit(FR_FINISHED, >flags))  if 
> (!list_empty(>intr_entry))
>
> list_del_init(>intr_entry)  
> spin_lock(>waitq.lock)
>  
> list_del_init(>intr_entry)
>
> Check the change is visible in perf report:
>
> 1)Firstly mount fusexmp_fh:
>   $fuse/example/.libs/fusexmp_fh mnt
>
> 2)Run test doing
> futimes(fd, tv1);
> futimes(fd, tv2);
>   in many threads endlessly.
>
> 3)perf record -g (all the system load)
>
> Without the patch in request_end() we spend 62.58% of do_write() time:
> (= 12.58 / 20.10 * 100%)
>
>55,22% entry_SYSCALL_64
>  20,10% do_writev
>   ...
>   18,08% fuse_dev_do_write
>12,58% request_end
> 10,08% __wake_up_common_lock
> 1,97% queued_spin_lock_slowpath
>1,31% fuse_copy_args
>1,04% fuse_copy_one
>0,85% queued_spin_lock_slowpath
>
> With the patch, the perf report becomes better, and only 58.16%
> of do_write() time we spend in request_end():
>
>54,15% entry_SYSCALL_64
>  18,24% do_writev
>   ...
>   16,25% fuse_dev_do_write
>10,61% request_end
> 10,25% __wake_up_common_lock
>1,34% fuse_copy_args
>1,06% fuse_copy_one
>0,88% queued_spin_lock_slowpath
>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dev.c |   30 ++
>  1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 7705f75c77a3..391498e680ec 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -427,10 +427,16 @@ static void request_end(struct fuse_conn *fc, struct 
> fuse_req *req)
>
> if (test_and_set_bit(FR_FINISHED, >flags))
> goto put_request;
> -
> -   spin_lock(>waitq.lock);
> -   list_del_init(>intr_entry);
> -   spin_unlock(>waitq.lock);
> +   /*
> +* test_and_set_bit() implies smp_mb() between bit
> +* changing and below intr_entry check. Pairs with
> +* smp_mb() from queue_interrupt().
> +*/
> +   if (!list_empty(>intr_entry)) {
> +   spin_lock(>waitq.lock);
> +   list_del_init(>intr_entry);
> +   spin_unlock(>waitq.lock);
> +   }
> WARN_ON(test_bit(FR_PENDING, >flags));
> WARN_ON(test_bit(FR_SENT, >flags));
> if (test_bit(FR_BACKGROUND, >flags)) {
> @@ -470,13 +476,21 @@ static void queue_interrupt(struct fuse_iqueue *fiq, 
> struct fuse_req *req)
>  {
> bool kill = false;
>
> -   spin_lock(>waitq.lock);
> -   if (test_bit(FR_FINISHED, >flags)) {
> -   spin_unlock(>waitq.lock);
> +   if (test_bit(FR_FINISHED, >flags))

The only case this test would make sense if this was a performance
sensitive path, which it's not.

> return;
> -   }
> +   spin_lock(>waitq.lock);
> if (list_empty(>intr_entry)) {
> list_add_tail(>intr_entry, >interrupts);
> +   /*
> +* Pairs with smp_mb() implied by test_and_set_bit()
> +* from request_end().
> +*/
> +   smp_mb();
> +   if (test_bit(FR_FINISHED, >flags)) {
> +   list_del_init(>intr_entry);
> +   spin_unlock(>waitq.lock);
> +   return;
> +   }
> wake_up_locked(>waitq);
> kill = true;
> }
>

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Peter Zijlstra

On Mon, Oct 29, 2018 at 06:32:57PM +, Patrick Bellasi wrote:

> +static void uclamp_group_put(unsigned int clamp_id, unsigned int group_id)
>  {
> + union uclamp_map *uc_maps = _maps[clamp_id][0];
> + union uclamp_map uc_map_old, uc_map_new;
> + long res;
> +
> +retry:
> +
> + uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> + uc_map_new = uc_map_old;
> + uc_map_new.se_count -= 1;
> + res = atomic_long_cmpxchg(_maps[group_id].adata,
> +   uc_map_old.data, uc_map_new.data);
> + if (res != uc_map_old.data)
> + goto retry;
> +}

Please write cmpxchg loops in the form:

atomic_long_t *ptr = _maps[clamp_id][group_id].adata;
union uclamp_map old, new;

old.data = atomic_long_read(ptr);
do {
new.data = old.data;
new.se_cound--;
} while (!atomic_long_try_cmpxchg(ptr, , new.data));


(same for all the others of course)

Re: [PATCH 4/6] fuse: Check for FR_SENT bit in fuse_dev_do_write()

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai  wrote:
> It's not possible to have answer to a request,
> before the request is actually sent. Add sanity
> check for that.

It's checking for the impossible.  That sometimes makes sense as a
WARN_ON() or in special cases a BUG_ON().


>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dev.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 739968ee8b0c..c603f1ebf0fd 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -1947,7 +1947,7 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
> goto err_unlock_pq;
>
> req = request_find(fpq, oh.unique & ~FUSE_INT_REQ_BIT);
> -   if (!req)
> +   if (!req || !test_bit(FR_SENT, >flags))
> goto err_unlock_pq;
>
> /* Is it an interrupt reply ID? */
>

Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page

2018-11-07 Thread Paolo Bonzini

On 07/11/2018 13:58, Liran Alon wrote:
> 
> 
>> On 7 Nov 2018, at 14:47, Paolo Bonzini  wrote:
>>
>> On 07/11/2018 13:10, Alexander Potapenko wrote:
>>> This appears to be a real bug in KVM.
>>> Please see a simplified reproducer attached.
>>
>> Thanks, I agree it's a reael bug.  The basic issue is that the
>> kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE
>> ioctl, aka 0x4080aebf.
>>
>> One way to fix it would be to just change kmalloc to kzalloc when
>> allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl
>> is wrong and should be rejected.  And the case where a shadow VMCS has
>> to be loaded is even more wrong, and we have to fix it anyway, so I
>> don't really like the idea of papering over the bug in the allocation.
>>
>> I'll test this patch and submit it formally:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c645f777b425..c546f0b1f3e0 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct
>> kvm_vcpu *vcpu,
>>  if (ret)
>>  return ret;
>>
>> -/* Empty 'VMXON' state is permitted */
>> -if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
>> +/* Empty 'VMXON' state is permitted.  A partial VMCS12 is not.  */
>> +if (kvm_state->size == sizeof(kvm_state))
>>  return 0;
>>
>> +if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE)
>> +return -EINVAL;
>> +
> 
> I don’t think that this test is sufficient to fully resolve issue.
> What if malicious userspace supplies valid size but pages containing 
> nested_state->vmcs12 is unmapped?
> This will result in vmx_set_nested_state() still calling set_current_vmptr() 
> but failing on copy_from_user()
> which still leaks cached_vmcs12 on next VMPTRLD of guest.

Makes sense; since SET_NESTED_STATE is not a fast path, we can just
memdup_user and pass a kernel pointer to vmx_set_nested_state.

> Therefore, I think that the correct patch should be to change 
> vmx_set_nested_state() to
> first gather all relevant information from userspace and validate it,
> and only then start applying it to KVM’s internal vCPU state.
> 
>>  if (kvm_state->vmx.vmcs_pa != -1ull) {
>>  if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
>>  !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
>> @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
>> *vcpu,
>>  }
>>
>>  vmcs12 = get_vmcs12(vcpu);
>> +BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE);
> 
> Why put this BUILD_BUG_ON() specifically here?
> There are many places which assumes cached_vmcs12 is of size VMCS12_SIZE.
> (Such as nested_release_vmcs12() and handle_vmptrld()).

Unlike those places, here the copy has sizeof(*vmcs12) bytes and an
overflow would cause a userspace write to kernel memory.  Though, that
means there is still a possibility of leaking kernel data when
nested_release_vmcs12 is called.  So it also makes sense to use
VMCS12_SIZE for the memory copies, and kzalloc.

Thanks,

Paolo

>>  if (copy_from_user(vmcs12, user_kvm_nested_state->data, 
>> sizeof(*vmcs12)))
>>  return -EFAULT;
>>
>> @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
>> *vcpu,
>>  if (nested_cpu_has_shadow_vmcs(vmcs12) &&
>>  vmcs12->vmcs_link_pointer != -1ull) {
>>  struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
>> -if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
>> +if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE)
>>  return -EINVAL;
>>
>>  if (copy_from_user(shadow_vmcs12,
>>
>> Paolo
> 
> -Liran
> 
>

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Peter Zijlstra

On Mon, Oct 29, 2018 at 06:32:57PM +, Patrick Bellasi wrote:
> +/**
> + * uclamp_group_get: increase the reference count for a clamp group
> + * @uc_se: the utilization clamp data for the task
> + * @clamp_id: the clamp index affected by the task
> + * @clamp_value: the new clamp value for the task
> + *
> + * Each time a task changes its utilization clamp value, for a specified 
> clamp
> + * index, we need to find an available clamp group which can be used to track
> + * this new clamp value. The corresponding clamp group index will be used to
> + * reference count the corresponding clamp value while the task is enqueued 
> on
> + * a CPU.
> + */
> +static void uclamp_group_get(struct uclamp_se *uc_se, unsigned int clamp_id,
> +  unsigned int clamp_value)
> +{
> + union uclamp_map *uc_maps = _maps[clamp_id][0];
> + unsigned int prev_group_id = uc_se->group_id;
> + union uclamp_map uc_map_old, uc_map_new;
> + unsigned int free_group_id;
> + unsigned int group_id;
> + unsigned long res;
> +
> +retry:
> +
> + free_group_id = UCLAMP_GROUPS;
> + for (group_id = 0; group_id < UCLAMP_GROUPS; ++group_id) {
> + uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> + if (free_group_id == UCLAMP_GROUPS && !uc_map_old.se_count)
> + free_group_id = group_id;
> + if (uc_map_old.value == clamp_value)
> + break;
> + }
> + if (group_id >= UCLAMP_GROUPS) {
> +#ifdef CONFIG_SCHED_DEBUG
> +#define UCLAMP_MAPERR "clamp value [%u] mapping to clamp group failed\n"
> + if (unlikely(free_group_id == UCLAMP_GROUPS)) {
> + pr_err_ratelimited(UCLAMP_MAPERR, clamp_value);
> + return;
> + }
> +#endif

Can you please put in a comment, either here or on top, on why this can
not in fact happen? And we're always guaranteed a free group.

> + group_id = free_group_id;
> + uc_map_old.data = atomic_long_read(_maps[group_id].adata);
> + }
> +
> + uc_map_new.se_count = uc_map_old.se_count + 1;
> + uc_map_new.value = clamp_value;
> + res = atomic_long_cmpxchg(_maps[group_id].adata,
> +   uc_map_old.data, uc_map_new.data);
> + if (res != uc_map_old.data)
> + goto retry;
> +
> + /* Update SE's clamp values and attach it to new clamp group */
> + uc_se->value = clamp_value;
> + uc_se->group_id = group_id;
> +
> + /* Release the previous clamp group */
> + if (uc_se->mapped)
> + uclamp_group_put(clamp_id, prev_group_id);
> + uc_se->mapped = true;
> +}

Re: [PATCH 1/6] fuse: Change argument of fuse_flush_writepages()

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:43 AM, Kirill Tkhai  wrote:
> Next patches introduce fuse_inode::lock, which will be used
> in __releases() and __acquires() instead of fc->lock.
> This patch makes this function to use argument fuse_inode
> instead of inode as preparation for that.

This patch seems like perfectly useless, just change fc->lock to
fi->lock in the next patch and be done with that.


>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dir.c|2 +-
>  fs/fuse/file.c   |8 
>  fs/fuse/fuse_i.h |2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 47395b0c3b35..c71f7e9ee0f7 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -1360,7 +1360,7 @@ static void __fuse_release_nowrite(struct inode *inode)
>
> BUG_ON(fi->writectr != FUSE_NOWRITE);
> fi->writectr = 0;
> -   fuse_flush_writepages(inode);
> +   fuse_flush_writepages(fi);
>  }
>
>  void fuse_release_nowrite(struct inode *inode)
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index cc2121b37bf5..d5bd29610875 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1522,12 +1522,12 @@ __acquires(fc->lock)
>   *
>   * Called with fc->lock
>   */
> -void fuse_flush_writepages(struct inode *inode)
> +void fuse_flush_writepages(struct fuse_inode *fi)
>  __releases(fc->lock)
>  __acquires(fc->lock)
>  {
> +   struct inode *inode = >inode;
> struct fuse_conn *fc = get_fuse_conn(inode);
> -   struct fuse_inode *fi = get_fuse_inode(inode);
> size_t crop = i_size_read(inode);
> struct fuse_req *req;
>
> @@ -1670,7 +1670,7 @@ static int fuse_writepage_locked(struct page *page)
> spin_lock(>lock);
> list_add(>writepages_entry, >writepages);
> list_add_tail(>list, >queued_writes);
> -   fuse_flush_writepages(inode);
> +   fuse_flush_writepages(fi);
> spin_unlock(>lock);
>
> end_page_writeback(page);
> @@ -1728,7 +1728,7 @@ static void fuse_writepages_send(struct 
> fuse_fill_wb_data *data)
> req->ff = fuse_file_get(data->ff);
> spin_lock(>lock);
> list_add_tail(>list, >queued_writes);
> -   fuse_flush_writepages(inode);
> +   fuse_flush_writepages(fi);
> spin_unlock(>lock);
>
> for (i = 0; i < num_pages; i++)
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index e9f712e81c7d..38bd7ca1908a 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -995,7 +995,7 @@ void fuse_update_ctime(struct inode *inode);
>
>  int fuse_update_attributes(struct inode *inode, struct file *file);
>
> -void fuse_flush_writepages(struct inode *inode);
> +void fuse_flush_writepages(struct fuse_inode *fi);
>
>  void fuse_set_nowrite(struct inode *inode);
>  void fuse_release_nowrite(struct inode *inode);
>

Re: [PATCH v2] slimbus: ngd: QCOM_QMI_HELPERS has to be selected

2018-11-07 Thread Greg KH

On Mon, Oct 15, 2018 at 09:44:41PM +0200, Niklas Cassel wrote:
> QCOM_QMI_HELPERS is a hidden kconfig, so the proper usage is
> to select it, not depend upon it.
> 
> Because of this change, we now also need to depend on the same
> Kconfigs as QCOM_QMI_HELPERS depends on.
> 
> Signed-off-by: Niklas Cassel 
> Acked-by: Srinivas Kandagatla 
> ---
> Hello Greg, Srini,
> 
> I'm sorry for this.
> (Although I'm a bit curious why 0-day test bot didn't catch this.)
> 
> Considering that I've just changed QCOM_QMI_HELPERS in:
> ccfb464cd106 ("soc: qcom: Allow COMPILE_TEST of qcom SoC Kconfigs")
> which is currently in linux-next, and that this Kconfig should
> depend on the same Kconfigs as QCOM_QMI_HELPERS depends on,
> I chose to have this Kconfig match the QCOM_QMI_HELPERS that is
> currently in linux-next.

Really?  I do this and then I get this build error on x86:

drivers/slimbus/qcom-ngd-ctrl.c: In function ‘of_qcom_slim_ngd_register’:
drivers/slimbus/qcom-ngd-ctrl.c:1333:63: warning: dereferencing ‘void *’ pointer
  data = of_match_node(qcom_slim_ngd_dt_match, parent->of_node)->data;
   ^~
drivers/slimbus/qcom-ngd-ctrl.c:1333:63: error: request for member ‘data’ in 
something not a structure or union


So I can't take this, something else must be wrong here...

thanks,

greg k-h

Re: Question: perf dso support for /proc/kallsyms

2018-11-07 Thread Jiri Olsa

On Fri, Nov 02, 2018 at 10:55:16AM +0800, leo@linaro.org wrote:
> Hi all,
> 
> Now I found that if use the command 'perf script' for Arm CoreSight trace
> data, it fails to parse kernel symbols if we don't specify kernel vmlinux
> file.   So when we don't specify kernel symbol files then perf tool will
> roll back to use /proc/kallsyms for kernel symbols parsing, as result it will
> run into below flow:

yep, AFAIK if there's no vmlinux found we fallback to /proc/kallsyms

> 
>   thread__find_addr_map(thread, cpumode, MAP__FUNCTION, address, );
>   map__load(al.map);
>   dso__data_read_offset(al.map->dso, machine, offset, buffer, size);
> `-> data_read_offset()

so what is the actual error you see in the perf script?
unresolved samples? could you please describe your config
and workload?

thanks,
jirka

> 
> I can observe the function data_read_offset() returns failure, this is caused
> by checking the offset sanity "if (offset > dso->data.file_size)"  (I pasted
> the whole function code at below in case you want to get more context for it),
> but if perf use "/proc/kallsyms" to load kernel symbols, the variable
> 'dso->data.file_size' will be set to zero thus the sanity checking always
> thinks the offset is out of the file size bound.
> 
> Now I still don't understand how the dso/map support "/proc/kallsyms" and
> have no idea to fix this issue, though I spent some time to look into it.
> 
> Could you give some suggestion for this?  Or even better if you have fixing
> for this, I am glad to test at my side.
> 
> static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
> u64 offset, u8 *data, ssize_t size)
> {
> if (data_file_size(dso, machine))
> return -1;
> 
> /* Check the offset sanity. */
> if (offset > dso->data.file_size)
> return -1;
> 
> if (offset + size < offset)
> return -1;
> 
> return cached_read(dso, machine, offset, data, size);
> }
> 
> Thanks,
> Leo Yan

Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups

2018-11-07 Thread Patrick Bellasi

On 07-Nov 13:19, Peter Zijlstra wrote:
> On Mon, Oct 29, 2018 at 06:32:57PM +, Patrick Bellasi wrote:
> > +struct uclamp_se {
> > +   unsigned int value  : SCHED_CAPACITY_SHIFT + 1;
> > +   unsigned int group_id   : order_base_2(UCLAMP_GROUPS);
> 
> Are you sure about ob2() ? seems weird we'll end up with 0 for
> UCLAMP_GROUPS==1.

So, we have:

   #define UCLAMP_GROUPS (CONFIG_UCLAMP_GROUPS_COUNT + 1)

because one clamp group is always reserved for defaults.
Thus, ob2(in) is always called with n>=2.

... should be safe no ?

However, will check better again on v6 for possible corner-cases...

> > +   unsigned int mapped : 1;
> > +};

-- 
#include 

Patrick Bellasi

Re: [PATCH 5/6] fuse: Protect fuse_inode::nlookup with fuse_inode::lock

2018-11-07 Thread Miklos Szeredi

On Tue, Nov 6, 2018 at 10:44 AM, Kirill Tkhai  wrote:
> This continues previous patch and introduces the same
> protection for nlookup field. It goes as separate patch
> since it's separate logic change (sadly, but it looks
> impossible to split previous patch more then in this way).

Well, the way you can split things up better is to just add the fine
grained lock first, which you can do in multiple patches, since all of
it is going to be a no-op.  And when that is done, remove the coarse
grained lock in yet another patch, which hopefully will result in
better performance and no other change.

>
> Signed-off-by: Kirill Tkhai 
> ---
>  fs/fuse/dir.c |4 ++--
>  fs/fuse/inode.c   |4 ++--
>  fs/fuse/readdir.c |4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 35f3b3d1e044..ac8519285327 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -222,9 +222,9 @@ static int fuse_dentry_revalidate(struct dentry *entry, 
> unsigned int flags)
> fuse_queue_forget(fc, forget, outarg.nodeid, 
> 1);
> goto invalid;
> }
> -   spin_lock(>lock);
> +   spin_lock(>lock);
> fi->nlookup++;
> -   spin_unlock(>lock);
> +   spin_unlock(>lock);
> }
> kfree(forget);
> if (ret == -ENOMEM)
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 5f488b019cd9..b8092d49a4b2 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -327,9 +327,9 @@ struct inode *fuse_iget(struct super_block *sb, u64 
> nodeid,
> }
>
> fi = get_fuse_inode(inode);
> -   spin_lock(>lock);
> +   spin_lock(>lock);
> fi->nlookup++;
> -   spin_unlock(>lock);
> +   spin_unlock(>lock);
> fuse_change_attributes(inode, attr, attr_valid, attr_version);
>
> return inode;
> diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
> index ab18b78f4755..574d03f8a573 100644
> --- a/fs/fuse/readdir.c
> +++ b/fs/fuse/readdir.c
> @@ -213,9 +213,9 @@ static int fuse_direntplus_link(struct file *file,
> }
>
> fi = get_fuse_inode(inode);
> -   spin_lock(>lock);
> +   spin_lock(>lock);
> fi->nlookup++;
> -   spin_unlock(>lock);
> +   spin_unlock(>lock);
>
> forget_all_cached_acls(inode);
> fuse_change_attributes(inode, >attr,
>

Re: [PATCH 6/6] fuse: Verify userspace asks to requeue interrupt that we really sent

2018-11-07 Thread Kirill Tkhai

On 07.11.2018 16:55, Miklos Szeredi wrote:
> On Tue, Nov 6, 2018 at 10:31 AM, Kirill Tkhai  wrote:
>> When queue_interrupt() is called from fuse_dev_do_write(),
>> it came from userspace directly. Userspace may pass any
>> request id, even the request's we have not interrupted
>> (or even background's request). This patch adds sanity
>> check to make kernel safe against that.
> 
> Okay, I understand this far.
> 
>> The problem is real interrupt may be queued and requeued
>> by two tasks on two cpus. This case, the requeuer has not
>> guarantees it sees FR_INTERRUPTED bit on its cpu (since
>> we know nothing about the way userspace manages requests
>> between its threads and whether it uses smp barriers).
> 
> This sounds like BS. There's an explicit  smp_mb__after_atomic()
> after the set_bit(FR_INTERRUPTED,...).  Which means FR_INTERRUPTED is
> going to be visible on all CPU's after this, no need to fool around
> with setting FR_INTERRUPTED again, etc...

Hm, but how does it make the bit visible on all CPUS?

The problem is that smp_mb_xxx() barrier on a cpu has a sense
only in pair with the appropriate barrier on the second cpu.
There is no guarantee for visibility, if second cpu does not
have a barrier:

  CPU 1  CPU2CPU3   
CPU4CPU5

  set FR_INTERRUPTED set FR_SENT
   
  test FR_SENT (== 0)test FR_INTERRUPTED (==1)
 list_add[>intr_entry]  read[req by intr_entry]
 
 goto userspace
 write in userspace buffer

read from userspace buffer  

write to userspace buffer

 read from userspace buffer

 enter kernel

 

 test FR_INTERRUPTED <- Not visible

The sequence:

set_bit(FR_INTERRUPTED, ...)
smp_mb__after_atomic();
test_bit(FR_SENT, >flags)

just guarantees the expected order on CPU2, which uses ,
but CPU5 does not have any guarantees.


This 5 CPUs picture is a scheme, which illustrates the possible way userspace
may manage interrupts. Tags  show places, where
we not have barriers yet, but where we may introduce them. But even in case
of we introduce them, there is no a way, that such the barriers help against 
CPU4.
So, this is the reason we have to set FR_INTERRUPTED bit again to make it 
visible
under the spinlock on CPU5.

Thanks,
Kirill

>>
>> To eliminate this problem, queuer writes FR_INTERRUPTED
>> bit again under fiq->waitq.lock, and this guarantees
>> requeuer will see the bit, when checks it.
>>
>> I try to introduce solution, which does not affect on
>> performance, and which does not force to take more
>> locks. This is the reason, the below solution is worse:
>>
>>request_wait_answer()
>>{
>>  ...
>>   +  spin_lock(>waitq.lock);
>>  set_bit(FR_INTERRUPTED, >flags);
>>   +  spin_unlock(>waitq.lock);
>>  ...
>>}
>>
>> Also, it does not look a better idea to extend fuse_dev_do_read()
>> with the fix, since it's already a big function:
>>
>>fuse_dev_do_read()
>>{
>>  ...
>>  if (test_bit(FR_INTERRUPTED, >flags)) {
>>   +  /* Comment */
>>   +  barrier();
>>   +  set_bit(FR_INTERRUPTED, >flags);
>>  queue_interrupt(fiq, req);
>>  }
>>  ...
>>}
>>
>> Signed-off-by: Kirill Tkhai 
>> ---
>>  fs/fuse/dev.c |   26 +-
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>> index 315d395d5c02..3bfc5ed61c9a 100644
>> --- a/fs/fuse/dev.c
>> +++ b/fs/fuse/dev.c
>> @@ -475,13 +475,27 @@ static void request_end(struct fuse_conn *fc, struct 
>> fuse_req *req)
>> fuse_put_request(fc, req);
>>  }
>>
>> -static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>> +static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
>>  {
>> bool kill = false;
>>
>> if (test_bit(FR_FINISHED, >flags))
>> -   return;
>> +   return 0;
>> spin_lock(>waitq.lock);
>> +   /* Check for we've sent request to interrupt this req */
>> +   if (unlikely(!test_bit(FR_INTERRUPTED, >flags))) {
>> +   spin_unlock(>waitq.lock);
>> +   return

Re: [PATCH] perf script: add newline after uregs output

2018-11-07 Thread Arnaldo Carvalho de Melo

Em Wed, Nov 07, 2018 at 10:37:05AM +0100, Milian Wolff escreveu:
> This change makes it much easier to easily distinguish
> between consecutive samples by keeping the empty line
> between them, like we see when we do not enable uregs
> output.

Thanks, applied.

- Arnaldo

< 1 2 3 4 5 6 7 8 9 10 >

501 - 600 of 1460 matches

Mail list logo