Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page

2017-03-17 Thread Tom Lendacky

On 3/16/2017 3:04 PM, Tom Lendacky wrote:

On 3/7/2017 8:59 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base protection.
This will ensure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky 
---



diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem
*__ioremap_caller(resource_size_t phys_addr,
pcm = new_pcm;
}

+   /*
+* If the page being mapped is in memory and SEV is active then
+* make sure the memory encryption attribute is enabled in the
+* resulting mapping.
+*/
prot = PAGE_KERNEL_IO;
+   if (sev_active() && page_is_mem(pfn))


Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...


diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);

+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long nr_pages)
+{
+struct resource res;
+unsigned long pfn, end_pfn;
+u64 orig_end;
+int ret = -1;
+
+res.start = (u64) start_pfn << PAGE_SHIFT;
+res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+orig_end = res.end;
+while ((res.start < res.end) &&
+(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+end_pfn = (res.end + 1) >> PAGE_SHIFT;
+if (end_pfn > pfn)
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+if (ret)
+break;
+res.start = res.end + 1;
+res.end = orig_end;
+}
+return ret;
+}


So the relevant difference between this one and walk_system_ram_range()
is this:

-ret = (*func)(pfn, end_pfn - pfn, arg);
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...


I'll take a look at what it takes to consolidate these with a pre-patch.
Then I'll add the new support.


It looks pretty straightforward to combine walk_iomem_res_desc() and
walk_system_ram_res(). The walk_system_ram_range() function would fit
easily into this, also, except for the fact that the callback function
takes unsigned longs vs the u64s of the other functions.  Is it worth
modifying all of the callers of walk_system_ram_range() (which are only
about 8 locations) to change the callback functions to accept u64s in
order to consolidate the walk_system_ram_range() function, too?

Thanks,
Tom



Thanks,
Tom
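
As a rough illustration of the consolidation being discussed, here is a minimal user-space sketch, not kernel code. Every name in it (struct sketch_range, walk_ranges(), has_desc_cb(), pfn_adapter_cb(), count_pages(), SKETCH_PAGE_SHIFT) is hypothetical. It assumes a flat table in place of the resource tree and shows one possible way to keep unsigned long pfn callbacks unchanged: a small adapter that converts the u64 byte range to pfns, using the same rounding as the quoted walk_mem_range(). The thread itself does not settle on this approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Hypothetical stand-in for struct resource; not the kernel definition. */
struct sketch_range {
	uint64_t start;	/* first byte, inclusive */
	uint64_t end;	/* last byte, inclusive */
	int desc;	/* 0 plays the role of IORES_DESC_NONE */
};

typedef int (*range_cb)(uint64_t start, uint64_t end, int desc, void *arg);

/*
 * Common workhorse: call cb for every table entry overlapping [start, end],
 * clipped to that window, and stop at the first non-zero return value.
 * Returns -1 if no entry overlapped at all.
 */
int walk_ranges(const struct sketch_range *tbl, size_t n,
		uint64_t start, uint64_t end, range_cb cb, void *arg)
{
	int ret = -1;

	assert(cb != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].end < start || tbl[i].start > end)
			continue;
		uint64_t s = tbl[i].start > start ? tbl[i].start : start;
		uint64_t e = tbl[i].end < end ? tbl[i].end : end;

		ret = cb(s, e, tbl[i].desc, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Callback doing the "desc != IORES_DESC_NONE" style comparison. */
int has_desc_cb(uint64_t start, uint64_t end, int desc, void *arg)
{
	(void)start; (void)end; (void)arg;
	return desc != 0;
}

/* Adapter so that pfn-based callbacks taking unsigned long stay as they are. */
struct pfn_adapter {
	int (*func)(unsigned long pfn, unsigned long nr_pages, void *arg);
	void *arg;
};

int pfn_adapter_cb(uint64_t start, uint64_t end, int desc, void *arg)
{
	struct pfn_adapter *a = arg;
	/* round start up and end down to whole pages */
	uint64_t pfn = (start + (1ULL << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
	uint64_t end_pfn = (end + 1) >> SKETCH_PAGE_SHIFT;

	(void)desc;
	if (end_pfn > pfn)
		return a->func((unsigned long)pfn,
			       (unsigned long)(end_pfn - pfn), a->arg);
	return 0;
}

/* Example pfn callback: accumulates the number of pages visited. */
int count_pages(unsigned long pfn, unsigned long nr_pages, void *arg)
{
	(void)pfn;
	*(unsigned long *)arg += nr_pages;
	return 0;
}
```

With an adapter like this the existing callers would not need to change their callback prototypes, at the cost of one extra indirection per visited range. Whether that is preferable to converting the roughly 8 callers is exactly the open question above.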





Re: [PATCH 4/4] crypto: s5p-sss - Use mutex instead of spinlock

2017-03-17 Thread Bartlomiej Zolnierkiewicz

Hi,

On Friday, March 17, 2017 04:49:22 PM Krzysztof Kozlowski wrote:
> Driver uses threaded interrupt handler so there is no real need for
> using spinlocks for synchronization.  Mutexes would do fine and are
> friendlier for overall system preemptiveness and real-time behavior.

Are you sure that this conversion is safe?  This driver also uses
a tasklet and tasklets run in the interrupt context.

> @@ -667,18 +666,17 @@ static void s5p_tasklet_cb(unsigned long data)
>   struct s5p_aes_dev *dev = (struct s5p_aes_dev *)data;
>   struct crypto_async_request *async_req, *backlog;
>   struct s5p_aes_reqctx *reqctx;
> - unsigned long flags;
>  
> - spin_lock_irqsave(&dev->lock, flags);
> + mutex_lock(&dev->lock);
>   backlog   = crypto_get_backlog(&dev->queue);
>   async_req = crypto_dequeue_request(&dev->queue);
>  
>   if (!async_req) {
>   dev->busy = false;
> - spin_unlock_irqrestore(&dev->lock, flags);
> + mutex_unlock(&dev->lock);
>   return;
>   }
> - spin_unlock_irqrestore(&dev->lock, flags);
> + mutex_unlock(&dev->lock);

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics



[PATCH 1/4] crypto: s5p-sss - Close possible race for completed requests

2017-03-17 Thread Krzysztof Kozlowski
Driver is capable of handling only one request at a time and it stores
it in its state container struct s5p_aes_dev.  This stored request must be
protected between concurrent invocations (e.g. completing current
request and scheduling new one).  Combination of lock and "busy" field
is used for that purpose.

When "busy" field is true, the driver will not accept new request thus
it will not overwrite currently handled data.

However commit 28b62b145868 ("crypto: s5p-sss - Fix spinlock recursion
on LRW(AES)") moved some of the write to "busy" field out of a lock
protected critical section.  This might lead to potential race between
completing current request and scheduling a new one.  Effectively the
request completion might try to operate on new crypto request.

Cc:  # v4.10.x
Fixes: 28b62b145868 ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)")
Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index 1b9da3dc799b..6c620487e9c2 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -287,7 +287,6 @@ static void s5p_sg_done(struct s5p_aes_dev *dev)
 static void s5p_aes_complete(struct s5p_aes_dev *dev, int err)
 {
dev->req->base.complete(&dev->req->base, err);
-   dev->busy = false;
 }
 
 static void s5p_unset_outdata(struct s5p_aes_dev *dev)
@@ -462,7 +461,7 @@ static irqreturn_t s5p_aes_interrupt(int irq, void *dev_id)
spin_unlock_irqrestore(&dev->lock, flags);
 
s5p_aes_complete(dev, 0);
-   dev->busy = true;
+   /* Device is still busy */
tasklet_schedule(&dev->tasklet);
} else {
/*
@@ -483,6 +482,7 @@ static irqreturn_t s5p_aes_interrupt(int irq, void *dev_id)
 
 error:
s5p_sg_done(dev);
+   dev->busy = false;
spin_unlock_irqrestore(&dev->lock, flags);
s5p_aes_complete(dev, err);
 
@@ -634,6 +634,7 @@ static void s5p_aes_crypt_start(struct s5p_aes_dev *dev, 
unsigned long mode)
 
 indata_error:
s5p_sg_done(dev);
+   dev->busy = false;
spin_unlock_irqrestore(&dev->lock, flags);
s5p_aes_complete(dev, err);
 }
-- 
2.9.3
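
The invariant this patch restores (every write to "busy" happens inside the lock-protected section) can be illustrated with a small user-space sketch. It is not driver code: struct sketch_dev, sketch_submit() and sketch_finish() are hypothetical names, and a C11 atomic_flag spin loop stands in for the driver's spinlock. Taking a snapshot of the request pointer before unlocking is a choice made for this sketch only and is not something the patch does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state container, loosely modelled on the description above. */
struct sketch_dev {
	atomic_flag lock;	/* stand-in for the spinlock */
	bool busy;		/* true while a request is being handled */
	int *req;		/* currently handled request, NULL if none */
};

static void sketch_lock(struct sketch_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void sketch_unlock(struct sketch_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Accept a new request only if no other request is in flight. */
bool sketch_submit(struct sketch_dev *d, int *req)
{
	bool started = false;

	assert(req != NULL);
	sketch_lock(d);
	if (!d->busy) {
		d->busy = true;
		d->req = req;
		started = true;
	}
	sketch_unlock(d);
	return started;
}

/*
 * Finish path: "busy" is cleared while the lock is still held, so a
 * concurrent sketch_submit() either sees the old request as fully owned
 * or sees the device as idle, never a half-updated state.
 */
int *sketch_finish(struct sketch_dev *d)
{
	int *done;

	sketch_lock(d);
	done = d->req;
	d->req = NULL;
	d->busy = false;
	sketch_unlock(d);
	return done;	/* the caller would run the completion callback on this */
}
```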



[PATCH 2/4] crypto: s5p-sss - Remove unused variant field from state container

2017-03-17 Thread Krzysztof Kozlowski
The driver uses type of device (variant) only during probe so there is
no need to store it for later.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index 6c620487e9c2..35ea84b7d775 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -190,8 +190,6 @@ struct s5p_aes_dev {
struct crypto_queue queue;
bool busy;
spinlock_t  lock;
-
-   struct samsung_aes_variant  *variant;
 };
 
 static struct s5p_aes_dev *s5p_dev;
@@ -852,7 +850,6 @@ static int s5p_aes_probe(struct platform_device *pdev)
}
 
pdata->busy = false;
-   pdata->variant = variant;
pdata->dev = dev;
platform_set_drvdata(pdev, pdata);
s5p_dev = pdata;
-- 
2.9.3



Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Alexander Potapenko
On Fri, Mar 17, 2017 at 1:31 PM, Alexander Potapenko  wrote:
> On Fri, Mar 17, 2017 at 1:08 PM, Peter Zijlstra  wrote:
>> On Thu, Mar 16, 2017 at 05:15:19PM -0700, Michael Davidson wrote:
>>> Replace a variable length array in a struct by allocating
>>> the memory for the entire struct in a char array on the stack.
>>>
>>> Signed-off-by: Michael Davidson 
>>> ---
>>>  drivers/md/raid10.c | 9 -
>>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index 063c43d83b72..158ebdff782c 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev 
>>> *mddev,
>>>   /* Use sync reads to get the blocks from somewhere else */
>>>   int sectors = r10_bio->sectors;
>>>   struct r10conf *conf = mddev->private;
>>> - struct {
>>> - struct r10bio r10_bio;
>>> - struct r10dev devs[conf->copies];
>>> - } on_stack;
>>> - struct r10bio *r10b = &on_stack.r10_bio;
>>> + char on_stack_r10_bio[sizeof(struct r10bio) +
>>> +   conf->copies * sizeof(struct r10dev)]
>>> +   __aligned(__alignof__(struct r10bio));
>>> + struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio;
>>>   int slot = 0;
>>>   int idx = 0;
>>>   struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;
>>
>>
>> That's disgusting. Why not fix LLVM to support this?
>
> IIUC there's only a handful of VLAIS instances in LLVM code, why not
Sorry, "kernel code", not "LLVM code".
> just drop them for the sake of better code portability?
> (To quote Linus, "this feature is an abomination":
> https://lkml.org/lkml/2013/9/23/500)
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
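
For readers unfamiliar with the construct under discussion, here is a minimal user-space sketch of the replacement technique from the patch: a char array on the stack, sized for the struct header plus its trailing entries and aligned like the struct, then accessed through a struct pointer. The types sketch_bio and sketch_dev and the function sketch_sum() are hypothetical. `__attribute__((aligned))` and `__alignof__` are GNU C extensions accepted by both gcc and clang. Accessing a char buffer through a struct pointer is technically at odds with strict aliasing; the kernel is built with -fno-strict-aliasing, and the sketch assumes the same leniency.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_dev {
	int devnum;
};

struct sketch_bio {
	int sectors;
	struct sketch_dev devs[];	/* flexible array member */
};

/* Fill `copies` trailing entries in an on-stack buffer and sum them up. */
int sketch_sum(int copies)
{
	assert(copies > 0);

	/*
	 * Variable-length char array instead of a variable-length array
	 * inside a struct (VLAIS): header plus `copies` entries, aligned
	 * like the struct it is going to hold.
	 */
	char buf[sizeof(struct sketch_bio) + copies * sizeof(struct sketch_dev)]
		__attribute__((aligned(__alignof__(struct sketch_bio))));
	struct sketch_bio *b = (struct sketch_bio *)buf;
	int sum = 0;

	memset(buf, 0, sizeof(buf));
	b->sectors = copies;
	for (int i = 0; i < copies; i++)
		b->devs[i].devnum = i + 1;
	for (int i = 0; i < copies; i++)
		sum += b->devs[i].devnum;
	return sum;
}
```

The stack footprint still grows with `copies`, so the concern about an unbounded on-stack size applies to this form just as it does to the VLAIS form.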


Re: [ANNOUNCE] /dev/random - a new approach (code for 4.11-rc1)

2017-03-17 Thread Jason A. Donenfeld
Hey Stephan,

Have you considered submitting this without so many options? For
example -- just unconditionally using ChaCha20 instead of the
configurable crypto API functions? And either unconditionally including
the FIPS140 compliance code, or just getting rid of it? And finally
just making this a part of the kernel
directly, instead of adding this as a standalone optional component?

Jason


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Peter Zijlstra
On Fri, Mar 17, 2017 at 08:05:16PM +0100, Dmitry Vyukov wrote:
> You can also find some reasons in the Why section of LLVM-Linux project:
> http://llvm.linuxfoundation.org/index.php/Main_Page

From that:

 - LLVM/Clang is a fast moving project with many things fixed quickly
   and features added.

So what's the deal with that 5 year old bug you want us to work around?

Also, clang doesn't support asm cc flags output and a few other
extensions last time I checked.



Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Peter Zijlstra
On Fri, Mar 17, 2017 at 08:26:42PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 17, 2017 at 08:05:16PM +0100, Dmitry Vyukov wrote:
> > You can also find some reasons in the Why section of LLVM-Linux project:
> > http://llvm.linuxfoundation.org/index.php/Main_Page
> 
> From that:
> 
>  - LLVM/Clang is a fast moving project with many things fixed quickly
>and features added.
> 
> So what's the deal with that 5 year old bug you want us to work around?
> 
> Also, clang doesn't support asm cc flags output and a few other
> extensions last time I checked.
> 

Another great one:

 - BSD License (some people prefer this license to the GPL)

Seems a very weak argument to make when talking about the Linux Kernel
which is very explicitly GPLv2 (and not later).


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Peter Zijlstra
On Fri, Mar 17, 2017 at 11:52:01AM -0700, Michael Davidson wrote:
> On Fri, Mar 17, 2017 at 5:44 AM, Peter Zijlstra  wrote:
> >
> > Be that as it may; what you construct above is disgusting. Surely the
> > code can be refactored to not look like dog vomit?
> >
> > Also; it's not immediately obvious conf->copies is 'small' and this
> > doesn't blow up the stack; I feel that deserves a comment somewhere.
> >
> 
> I agree that the code is horrible.
> 
> It is, in fact, exactly the same solution that was used to remove
> variable length arrays in structs from several of the crypto drivers a
> few years ago - see the definition of SHASH_DESC_ON_STACK() in
> "crypto/hash.h" - I did not, however, hide the horrors in a macro
> preferring to leave the implementation visible as a warning to whoever
> might touch the code next.
> 
> I believe that the actual stack usage is exactly the same as it was 
> previously.
> 
> I can certainly wrap this  up in a macro and add comments with
> appropriately dire warnings in it if you feel that is both necessary
> and sufficient.

We got away with ugly in the past, so we should get to do it again?


[PATCH] crypto: zip - Memory corruption in zip_clear_stats()

2017-03-17 Thread Dan Carpenter
There is a typo here.  It should be "stats" instead of "state".  The
impact is that we clear 224 bytes instead of 80 and we zero out memory
that we shouldn't.

Fixes: 09ae5d37e093 ("crypto: zip - Add Compression/Decompression statistics")
Signed-off-by: Dan Carpenter 

diff --git a/drivers/crypto/cavium/zip/zip_main.c b/drivers/crypto/cavium/zip/zip_main.c
index 0951e20b395b..6ff13d80d82e 100644
--- a/drivers/crypto/cavium/zip/zip_main.c
+++ b/drivers/crypto/cavium/zip/zip_main.c
@@ -530,7 +530,7 @@ static int zip_clear_stats(struct seq_file *s, void *unused)
for (index = 0; index < MAX_ZIP_DEVICES; index++) {
if (zip_dev[index]) {
memset(&zip_dev[index]->stats, 0,
-  sizeof(struct zip_state));
+  sizeof(struct zip_stats));
seq_printf(s, "Cleared stats for zip %d\n", index);
}
}
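
A common way to keep this class of typo from compiling into a wrong size is to derive the size from the object being cleared rather than from a type name. The sketch below is illustrative user-space code with hypothetical names (sketch_stats, sketch_device, sketch_clear_stats); the field sizes are arbitrary and are not taken from the zip driver.

```c
#include <assert.h>
#include <string.h>

struct sketch_stats {
	long counters[10];
};

struct sketch_device {
	struct sketch_stats stats;
	long canary;	/* lives right after stats; must survive a clear */
};

void sketch_clear_stats(struct sketch_device *d)
{
	assert(d != NULL);
	/*
	 * sizeof(d->stats) follows the member's real type, so it cannot
	 * silently drift to the size of a similarly named struct.
	 */
	memset(&d->stats, 0, sizeof(d->stats));
}
```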


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread hpa
On March 17, 2017 12:27:46 PM PDT, Peter Zijlstra  wrote:
>On Fri, Mar 17, 2017 at 11:52:01AM -0700, Michael Davidson wrote:
>> On Fri, Mar 17, 2017 at 5:44 AM, Peter Zijlstra wrote:
>> >
>> > Be that as it may; what you construct above is disgusting. Surely the
>> > code can be refactored to not look like dog vomit?
>> >
>> > Also; it's not immediately obvious conf->copies is 'small' and this
>> > doesn't blow up the stack; I feel that deserves a comment somewhere.
>> >
>>
>> I agree that the code is horrible.
>>
>> It is, in fact, exactly the same solution that was used to remove
>> variable length arrays in structs from several of the crypto drivers a
>> few years ago - see the definition of SHASH_DESC_ON_STACK() in
>> "crypto/hash.h" - I did not, however, hide the horrors in a macro
>> preferring to leave the implementation visible as a warning to whoever
>> might touch the code next.
>>
>> I believe that the actual stack usage is exactly the same as it was
>> previously.
>>
>> I can certainly wrap this  up in a macro and add comments with
>> appropriately dire warnings in it if you feel that is both necessary
>> and sufficient.
>
>We got away with ugly in the past, so we should get to do it again?

Seriously, you should have taken the hack the first time that this needs to be 
fixed.  Just because this is a fairly uncommon construct in the kernel doesn't 
mean it is not in userspace.

I would like to say this falls in the category of "fix your compiler this 
time".  Once is one thing, twice is unacceptable.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags

2017-03-17 Thread H. Peter Anvin
On 03/17/17 14:32, H. Peter Anvin wrote:
> 
> NAK.  Fix your compiler, or use a wrapper script or something.  It is
> absolutely *not* acceptable to disable this since future versions of
> clang *should* support that.
> 
> That being said, it might make sense to look for a key pattern like
> "(un|not )supported" on stderr in the try-run macro.  Is there really no
> -Wno- or -Werror= option to turn off this craziness?
> 

Well, guess what... I found it myself.

-W{no-,error=}ignored-optimization-argument

Either variant will make this sane.

-hpa




Re: [PATCH 4/4] crypto: s5p-sss - Use mutex instead of spinlock

2017-03-17 Thread Krzysztof Kozlowski
On Fri, Mar 17, 2017 at 06:28:29PM +0100, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi,
> 
> On Friday, March 17, 2017 04:49:22 PM Krzysztof Kozlowski wrote:
> > Driver uses threaded interrupt handler so there is no real need for
> > using spinlocks for synchronization.  Mutexes would do fine and are
> > friendlier for overall system preemptiveness and real-time behavior.
> 
> Are you sure that this conversion is safe?  This driver also uses
> a tasklet and tasklets run in the interrupt context.
>

Yes, you're right. This is not safe and patch should be dropped. Thanks
for spotting this.

Best regards,
Krzysztof



Re: [PATCH 1/4] crypto: s5p-sss - Close possible race for completed requests

2017-03-17 Thread Bartlomiej Zolnierkiewicz
On Friday, March 17, 2017 04:49:19 PM Krzysztof Kozlowski wrote:
> Driver is capable of handling only one request at a time and it stores
> it in its state container struct s5p_aes_dev.  This stored request must be
> protected between concurrent invocations (e.g. completing current
> request and scheduling new one).  Combination of lock and "busy" field
> is used for that purpose.
> 
> When "busy" field is true, the driver will not accept new request thus
> it will not overwrite currently handled data.
> 
> However commit 28b62b145868 ("crypto: s5p-sss - Fix spinlock recursion
> on LRW(AES)") moved some of the write to "busy" field out of a lock
> protected critical section.  This might lead to potential race between
> completing current request and scheduling a new one.  Effectively the
> request completion might try to operate on new crypto request.
> 
> Cc:  # v4.10.x
> Fixes: 28b62b145868 ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)")
> Signed-off-by: Krzysztof Kozlowski 

Reviewed-by: Bartlomiej Zolnierkiewicz 

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics



Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Dmitry Vyukov
On Fri, Mar 17, 2017 at 7:03 PM, Borislav Petkov  wrote:
> On Fri, Mar 17, 2017 at 01:32:00PM +0100, Alexander Potapenko wrote:
>> > IIUC there's only a handful of VLAIS instances in LLVM code, why not
>> Sorry, "kernel code", not "LLVM code".
>> > just drop them for the sake of better code portability?
>
> And what happens if someone else adds a variable thing like this
> somewhere else, builds with gcc, everything's fine and patch gets
> applied? Or something else llvm can't stomach.
>
> Does that mean there'll be the occasional, every-so-often whack-a-mole
> patchset from someone, fixing the kernel build with llvm yet again?


This problem is more general and is not specific to clang. It equally
applies to different versions of gcc, different arches and different
configs (namely, anything else than what a developer used for
testing). A known, reasonably well working solution to this problem is
a system of try bots that test patches before commit with different
compilers/configs/archs. We already have such system in the form of
0-day bots. It would be useful to extend it with clang as soon as
kernel builds.


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Borislav Petkov
On Fri, Mar 17, 2017 at 07:47:33PM +0100, Dmitry Vyukov wrote:
> This problem is more general and is not specific to clang. It equally
> applies to different versions of gcc, different arches and different
> configs (namely, anything else than what a developer used for
> testing).

I guess. We do carry a bunch of gcc workarounds along with the cc-*
macros in scripts/Kbuild.include.

> A known, reasonably well working solution to this problem is
> a system of try bots that test patches before commit with different
> compilers/configs/archs. We already have such system in the form of
> 0-day bots. It would be useful to extend it with clang as soon as
> kernel builds.

Has someone actually already talked to Fengguang about it?

Oh, and the stupid question: why the effort to build the kernel
with clang at all? Just because or are there some actual, palpable
advantages?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Dmitry Vyukov
On Fri, Mar 17, 2017 at 7:57 PM, Borislav Petkov  wrote:
> On Fri, Mar 17, 2017 at 07:47:33PM +0100, Dmitry Vyukov wrote:
>> This problem is more general and is not specific to clang. It equally
>> applies to different versions of gcc, different arches and different
>> configs (namely, anything else than what a developer used for
>> testing).
>
> I guess. We do carry a bunch of gcc workarounds along with the cc-*
> macros in scripts/Kbuild.include.
>
>> A known, reasonably well working solution to this problem is
>> a system of try bots that test patches before commit with different
>> compilers/configs/archs. We already have such system in the form of
>> 0-day bots. It would be useful to extend it with clang as soon as
>> kernel builds.
>
> Has someone actually already talked to Fengguang about it?

+Fengguang

> Oh, and the stupid question: why the effort to build the kernel
> with clang at all? Just because or are there some actual, palpable
> advantages?

On our side it is:
 - clang makes it possible to implement KMSAN (dynamic detection of
uses of uninit memory)
 - better code coverage for fuzzing
 - way simpler and faster development (e.g. we can port our user-space
hardening technologies -- CFI and SafeStack)

You can also find some reasons in the Why section of LLVM-Linux project:
http://llvm.linuxfoundation.org/index.php/Main_Page


Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0

2017-03-17 Thread Kevin Hilman
Herbert Xu  writes:

> On Thu, Mar 16, 2017 at 11:24:31AM -0700, Kevin Hilman wrote:
>> Hi Herbert,
>> 
>> Herbert Xu  writes:
>> 
>> > On Wed, Feb 22, 2017 at 07:55:24AM +0100, Heiner Kallweit wrote:
>> >> Expose clock CLKID_RNG0 which is needed for the HW random number 
>> >> generator.
>> >> 
>> >> Signed-off-by: Heiner Kallweit 
>> >
>> > All patches applied.  Thanks.
>> 
>> Actually, can you just apply [PATCH 4/4] to your tree?
>> 
>> The clock and DT patches need to go through their respective trees or
>> will otherwise have conflicts with other things going in via those
>> trees.
>
> It's too late now.  Please speak up sooner next time.  These
> patches were posted a month ago.

Sorry, I didn't realize you would be applying everything.  Also, I'm not
the original author, just the platform maintainer that noticed it and
now has to deal with the conflicts. :(

Most other driver maintainers are only applying patches that directly
apply to their subsystem and leave patches to other drivers (e.g. clk) and
platform-specific stuff (e.g. DT) to go in via their proper trees, so
that's what I was expecting to happen here too.

Kevin







Re: [PATCH 3/7] x86, LLVM: suppress clang warnings about unaligned accesses

2017-03-17 Thread hpa
On March 16, 2017 5:15:16 PM PDT, Michael Davidson  wrote:
>Suppress clang warnings about potential unaligned accesses
>to members in packed structs. This gets rid of almost 10,000
>warnings about accesses to the ring 0 stack pointer in the TSS.
>
>Signed-off-by: Michael Davidson 
>---
> arch/x86/Makefile | 5 +
> 1 file changed, 5 insertions(+)
>
>diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>index 894a8d18bf97..7f21703c475d 100644
>--- a/arch/x86/Makefile
>+++ b/arch/x86/Makefile
>@@ -128,6 +128,11 @@ endif
> KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
> endif
> 
>+ifeq ($(cc-name),clang)
>+# Suppress clang warnings about potential unaligned accesses.
>+KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
>+endif
>+
> ifdef CONFIG_X86_X32
>   x32_ld_ok := $(call try-run,\
>   /bin/echo -e '1: .quad 1b' | \

Why conditional on clang?
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
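
For context on what the warning is about, here is a small user-space sketch with hypothetical names (sketch_tss, sketch_read_sp0). In a packed struct a member can sit at an address that is not a multiple of its natural alignment. The compiler handles direct member access, but a plain pointer to such a member carries no hint that it may be misaligned, which is, as far as I understand it, what clang's address-of-packed-member warning flags. Copying the bytes out with memcpy() sidesteps the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the padding byte forces sp0 to offset 1. */
struct sketch_tss {
	uint8_t pad;
	uint64_t sp0;
} __attribute__((packed));

/* Read the member without ever forming a uint64_t pointer to it. */
uint64_t sketch_read_sp0(const struct sketch_tss *t)
{
	uint64_t v;

	assert(t != NULL);
	memcpy(&v, (const unsigned char *)t + offsetof(struct sketch_tss, sp0),
	       sizeof(v));
	return v;
}
```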


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Fengguang Wu

Hi Dmitry,

On Fri, Mar 17, 2017 at 08:05:16PM +0100, Dmitry Vyukov wrote:
> On Fri, Mar 17, 2017 at 7:57 PM, Borislav Petkov wrote:
>> On Fri, Mar 17, 2017 at 07:47:33PM +0100, Dmitry Vyukov wrote:
>>> This problem is more general and is not specific to clang. It equally
>>> applies to different versions of gcc, different arches and different
>>> configs (namely, anything else than what a developer used for
>>> testing).
>>
>> I guess. We do carry a bunch of gcc workarounds along with the cc-*
>> macros in scripts/Kbuild.include.
>>
>>> A known, reasonably well working solution to this problem is
>>> a system of try bots that test patches before commit with different
>>> compilers/configs/archs. We already have such system in the form of
>>> 0-day bots. It would be useful to extend it with clang as soon as
>>> kernel builds.
>>
>> Has someone actually already talked to Fengguang about it?
>
> +Fengguang


I've actually tried clang long time ago. It quickly fails the build
for vanilla kernel. So it really depends on when the various clang
build fix patches can be accepted into mainline kernel.

Thanks,
Fengguang


Re: [PATCH 1/1] ARM: dts: NSP: Add crypto (SPU) to dtsi

2017-03-17 Thread Florian Fainelli
On 03/06/2017 11:22 AM, Florian Fainelli wrote:
> On 02/28/2017 12:31 PM, Florian Fainelli wrote:
>> On 02/22/2017 01:22 PM, Steve Lin wrote:
>>> Adds crypto hardware (SPU) to Northstar Plus device tree file.
>>>
>>> Signed-off-by: Steve Lin 
>>
>> Applied, thanks!
> 
> And dropped, since there is a dependency on "ARM: dts: NSP: Add mailbox
> (PDC) to NSP" to be applied first.
> 
> Let's wait for the mailbox maintainer to chime in before I apply the
> following patches (in that order):
> 
> ARM: dts: NSP: Add mailbox (PDC) to NSP
> ARM: dts: NSP: Add crypto (SPU) to dtsi

Applied again.
-- 
Florian


[PATCH 3/4] crypto: s5p-sss - Document the struct s5p_aes_dev

2017-03-17 Thread Krzysztof Kozlowski
Add kernel-doc to s5p_aes_dev structure.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index 35ea84b7d775..7ac657f46d15 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -170,6 +170,32 @@ struct s5p_aes_ctx {
int keylen;
 };
 
+/**
+ * struct s5p_aes_dev - Crypto device state container
+ * @dev:   Associated device
+ * @clk:   Clock for accessing hardware
+ * @ioaddr:Mapped IO memory region
+ * @aes_ioaddr:Per-variant offset for AES block IO memory
+ * @irq_fc:Feed control interrupt line
+ * @req:   Crypto request currently handled by the device
+ * @ctx:   Configuration for currently handled crypto request
+ * @sg_src:Scatter list with source data for currently handled block
+ * in device.  This is DMA-mapped into device.
+ * @sg_dst:Scatter list with destination data for currently handled block
+ * in device. This is DMA-mapped into device.
+ * @sg_src_cpy:In case of unaligned access, copied scatter list
+ * with source data.
+ * @sg_dst_cpy:In case of unaligned access, copied scatter list
+ * with destination data.
+ * @tasklet:   New request scheduling job
+ * @queue: Crypto queue
+ * @busy:  Indicates whether the device is currently handling some request
+ * thus it uses some of the fields from this state, like:
+ * req, ctx, sg_src/dst (and copies).  This essentially
+ * protects against concurrent access to these fields.
+ * @lock:  Lock for protecting both access to device hardware registers
+ * and fields related to current request (including the busy field).
+ */
 struct s5p_aes_dev {
struct device   *dev;
struct clk  *clk;
@@ -182,7 +208,6 @@ struct s5p_aes_dev {
struct scatterlist  *sg_src;
struct scatterlist  *sg_dst;
 
-   /* In case of unaligned access: */
struct scatterlist  *sg_src_cpy;
struct scatterlist  *sg_dst_cpy;
 
-- 
2.9.3



[PATCH 4/4] crypto: s5p-sss - Use mutex instead of spinlock

2017-03-17 Thread Krzysztof Kozlowski
Driver uses threaded interrupt handler so there is no real need for
using spinlocks for synchronization.  Mutexes would do fine and are
friendlier for overall system preemptiveness and real-time behavior.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 35 ---
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index 7ac657f46d15..1893cf5dedc0 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -214,7 +215,7 @@ struct s5p_aes_dev {
struct tasklet_struct   tasklet;
struct crypto_queue queue;
bool busy;
-   spinlock_t  lock;
+   struct mutex lock;
 };
 
 static struct s5p_aes_dev *s5p_dev;
@@ -443,11 +444,10 @@ static irqreturn_t s5p_aes_interrupt(int irq, void 
*dev_id)
int err_dma_tx = 0;
int err_dma_rx = 0;
bool tx_end = false;
-   unsigned long flags;
uint32_t status;
int err;
 
-   spin_lock_irqsave(&dev->lock, flags);
+   mutex_lock(&dev->lock);
 
/*
 * Handle rx or tx interrupt. If there is still data (scatterlist did 
not
@@ -481,7 +481,7 @@ static irqreturn_t s5p_aes_interrupt(int irq, void *dev_id)
if (tx_end) {
s5p_sg_done(dev);
 
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
 
s5p_aes_complete(dev, 0);
/* Device is still busy */
@@ -498,7 +498,7 @@ static irqreturn_t s5p_aes_interrupt(int irq, void *dev_id)
if (err_dma_rx == 1)
s5p_set_dma_indata(dev, dev->sg_src);
 
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
}
 
return IRQ_HANDLED;
@@ -506,7 +506,7 @@ static irqreturn_t s5p_aes_interrupt(int irq, void *dev_id)
 error:
s5p_sg_done(dev);
dev->busy = false;
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
s5p_aes_complete(dev, err);
 
return IRQ_HANDLED;
@@ -599,7 +599,6 @@ static void s5p_aes_crypt_start(struct s5p_aes_dev *dev, 
unsigned long mode)
 {
struct ablkcipher_request *req = dev->req;
uint32_t aes_control;
-   unsigned long flags;
int err;
 
aes_control = SSS_AES_KEY_CHANGE_MODE;
@@ -625,7 +624,7 @@ static void s5p_aes_crypt_start(struct s5p_aes_dev *dev, 
unsigned long mode)
|  SSS_AES_BYTESWAP_KEY
|  SSS_AES_BYTESWAP_CNT;
 
-   spin_lock_irqsave(&dev->lock, flags);
+   mutex_lock(&dev->lock);
 
SSS_WRITE(dev, FCINTENCLR,
  SSS_FCINTENCLR_BTDMAINTENCLR | SSS_FCINTENCLR_BRDMAINTENCLR);
@@ -648,7 +647,7 @@ static void s5p_aes_crypt_start(struct s5p_aes_dev *dev, 
unsigned long mode)
SSS_WRITE(dev, FCINTENSET,
  SSS_FCINTENSET_BTDMAINTENSET | SSS_FCINTENSET_BRDMAINTENSET);
 
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
 
return;
 
@@ -658,7 +657,7 @@ static void s5p_aes_crypt_start(struct s5p_aes_dev *dev, 
unsigned long mode)
 indata_error:
s5p_sg_done(dev);
dev->busy = false;
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
s5p_aes_complete(dev, err);
 }
 
@@ -667,18 +666,17 @@ static void s5p_tasklet_cb(unsigned long data)
struct s5p_aes_dev *dev = (struct s5p_aes_dev *)data;
struct crypto_async_request *async_req, *backlog;
struct s5p_aes_reqctx *reqctx;
-   unsigned long flags;
 
-   spin_lock_irqsave(&dev->lock, flags);
+   mutex_lock(&dev->lock);
backlog   = crypto_get_backlog(&dev->queue);
async_req = crypto_dequeue_request(&dev->queue);
 
if (!async_req) {
dev->busy = false;
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
return;
}
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
 
if (backlog)
backlog->complete(backlog, -EINPROGRESS);
@@ -693,18 +691,17 @@ static void s5p_tasklet_cb(unsigned long data)
 static int s5p_aes_handle_req(struct s5p_aes_dev *dev,
  struct ablkcipher_request *req)
 {
-   unsigned long flags;
int err;
 
-   spin_lock_irqsave(&dev->lock, flags);
+   mutex_lock(&dev->lock);
err = ablkcipher_enqueue_request(&dev->queue, req);
if (dev->busy) {
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
goto exit;
}
dev->busy = true;
 
-   spin_unlock_irqrestore(&dev->lock, flags);
+   mutex_unlock(&dev->lock);
 
tasklet_schedule(&dev->tasklet);
 
@@ -856,7 +853,7 @@ static int s5p_aes_probe(struct 

Re: [RFC PATCH v2 08/32] x86: Use PAGE_KERNEL protection for ioremap of memory page

2017-03-17 Thread Tom Lendacky

On 3/17/2017 9:32 AM, Tom Lendacky wrote:

On 3/16/2017 3:04 PM, Tom Lendacky wrote:

On 3/7/2017 8:59 AM, Borislav Petkov wrote:

On Thu, Mar 02, 2017 at 10:13:32AM -0500, Brijesh Singh wrote:

From: Tom Lendacky 

In order for memory pages to be properly mapped when SEV is active, we
need to use the PAGE_KERNEL protection attribute as the base
protection.
This will ensure that memory mapping of, e.g. ACPI tables, receives the
proper mapping attributes.

Signed-off-by: Tom Lendacky 
---



diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c400ab5..481c999 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -151,7 +151,15 @@ static void __iomem
*__ioremap_caller(resource_size_t phys_addr,
pcm = new_pcm;
}

+   /*
+* If the page being mapped is in memory and SEV is active then
+* make sure the memory encryption attribute is enabled in the
+* resulting mapping.
+*/
prot = PAGE_KERNEL_IO;
+   if (sev_active() && page_is_mem(pfn))


Hmm, a resource tree walk per ioremap call. This could get expensive for
ioremap-heavy workloads.

__ioremap_caller() gets called here during boot 55 times so not a whole
lot but I wouldn't be surprised if there were some nasty use cases which
ioremap a lot.

...


diff --git a/kernel/resource.c b/kernel/resource.c
index 9b5f044..db56ba3 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -518,6 +518,46 @@ int __weak page_is_ram(unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(page_is_ram);

+/*
+ * This function returns true if the target memory is marked as
+ * IORESOURCE_MEM and IORESOURCE_BUSY and described as other than
+ * IORES_DESC_NONE (e.g. IORES_DESC_ACPI_TABLES).
+ */
+static int walk_mem_range(unsigned long start_pfn, unsigned long
nr_pages)
+{
+struct resource res;
+unsigned long pfn, end_pfn;
+u64 orig_end;
+int ret = -1;
+
+res.start = (u64) start_pfn << PAGE_SHIFT;
+res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+orig_end = res.end;
+while ((res.start < res.end) &&
+(find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
+end_pfn = (res.end + 1) >> PAGE_SHIFT;
+if (end_pfn > pfn)
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;
+if (ret)
+break;
+res.start = res.end + 1;
+res.end = orig_end;
+}
+return ret;
+}


So the relevant difference between this one and walk_system_ram_range()
is this:

-ret = (*func)(pfn, end_pfn - pfn, arg);
+ret = (res.desc != IORES_DESC_NONE) ? 1 : 0;

so it seems to me you can have your own *func() pointer which does that
IORES_DESC_NONE comparison. And then you can define your own workhorse
__walk_memory_range() which gets called by both walk_mem_range() and
walk_system_ram_range() instead of almost duplicating them.

And looking at walk_system_ram_res(), that one looks similar too except
the pfn computation. But AFAICT the pfn/end_pfn things are computed from
res.start and res.end so it looks to me like all those three functions
are crying for unification...


I'll take a look at what it takes to consolidate these with a pre-patch.
Then I'll add the new support.


It looks pretty straightforward to combine walk_iomem_res_desc() and
walk_system_ram_res(). The walk_system_ram_range() function would fit
easily into this, also, except for the fact that the callback function
takes unsigned longs vs the u64s of the other functions.  Is it worth
modifying all of the callers of walk_system_ram_range() (which are only
about 8 locations) to change the callback functions to accept u64s in
order to consolidate the walk_system_ram_range() function, too?


The more I dig, the more I find that the changes keep expanding. I'll
leave walk_system_ram_range() out of the consolidation for now.

Thanks,
Tom



Thanks,
Tom



Thanks,
Tom
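
The consolidation discussed above (one workhorse walker, with the
IORES_DESC_NONE comparison moved into a callback) could look roughly like
the userspace sketch below. This is not kernel code and not what was
eventually merged: struct region, walk_regions() and both callbacks are
made-up names, and a flat table stands in for the iomem resource tree. It
only illustrates how a "does this range have a descriptor" query and a
plain per-range walk can share one loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's resource descriptors. */
enum desc { DESC_NONE = 0, DESC_ACPI_TABLES = 1 };

struct region {
	uint64_t start;		/* first byte of the region */
	uint64_t end;		/* last byte of the region (inclusive) */
	enum desc desc;
};

typedef int (*walk_fn)(const struct region *r, void *arg);

/*
 * Shared workhorse: call func() on every table entry that overlaps
 * [start, end], clipped to that window. Stops at the first non-zero
 * return value and passes it back; returns -1 if nothing overlapped.
 */
static int walk_regions(const struct region *tbl, size_t n,
			uint64_t start, uint64_t end,
			walk_fn func, void *arg)
{
	int ret = -1;

	assert(start <= end);
	for (size_t i = 0; i < n; i++) {
		struct region clip = tbl[i];

		if (clip.end < start || clip.start > end)
			continue;
		if (clip.start < start)
			clip.start = start;
		if (clip.end > end)
			clip.end = end;
		ret = func(&clip, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Callback doing the "described as other than NONE" comparison. */
static int region_has_desc(const struct region *r, void *arg)
{
	(void)arg;
	return r->desc != DESC_NONE ? 1 : 0;
}

/* Callback that only accumulates byte counts and never stops the walk. */
static int region_count_bytes(const struct region *r, void *arg)
{
	*(uint64_t *)arg += r->end - r->start + 1;
	return 0;
}
```

With this shape, the only per-caller difference is the callback, which is
the point made in the review. The u64 versus unsigned long mismatch in the
walk_system_ram_range() callbacks is not modelled here.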





[PATCH 0/4] crypto: s5p-sss - Fix and minor improvements

2017-03-17 Thread Krzysztof Kozlowski
Hi,

I still did not fix the NULL pointer dereference reported by
Nathan Royce [1], but I got some other improvements.

Testing done on Odroid U3 (Exynos4412) with tcrypt and cryptsetup.

Best regards,
Krzysztof


[1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351149.html

Krzysztof Kozlowski (4):
  crypto: s5p-sss - Close possible race for completed requests
  crypto: s5p-sss - Remove unused variant field from state container
  crypto: s5p-sss - Document the struct s5p_aes_dev
  crypto: s5p-sss - Use mutex instead of spinlock

 drivers/crypto/s5p-sss.c | 70 +++-
 1 file changed, 45 insertions(+), 25 deletions(-)

-- 
2.9.3



Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Michael Davidson
On Fri, Mar 17, 2017 at 5:44 AM, Peter Zijlstra  wrote:
>
> Be that as it may; what you construct above is disgusting. Surely the
> code can be refactored to not look like dog vomit?
>
> Also; it's not immediately obvious conf->copies is 'small' and this
> doesn't blow up the stack; I feel that deserves a comment somewhere.
>

I agree that the code is horrible.

It is, in fact, exactly the same solution that was used to remove
variable length arrays in structs from several of the crypto drivers a
few years ago - see the definition of SHASH_DESC_ON_STACK() in
"crypto/hash.h" - I did not, however, hide the horrors in a macro
preferring to leave the implementation visible as a warning to whoever
might touch the code next.

I believe that the actual stack usage is exactly the same as it was previously.

I can certainly wrap this  up in a macro and add comments with
appropriately dire warnings in it if you feel that is both necessary
and sufficient.
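
For illustration only, wrapping the construct in a macro might look like
the sketch below. The types and names (struct io_hdr, struct dev_slot,
IO_HDR_ON_STACK) are invented stand-ins for struct r10bio / struct r10dev
and are not taken from the patch. The buffer is still a variable length
array, so the caller has to guarantee a small bound on n. The cast from a
char buffer also leans on the kernel being built with -fno-strict-aliasing;
treat that as an assumption if you try this elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for struct r10dev / struct r10bio. */
struct dev_slot {
	void *bio;
	int devnum;
};

struct io_hdr {
	int sectors;
	struct dev_slot devs[];	/* flexible array member, sized at run time */
};

/* Bytes needed for a header followed by n trailing slots. */
#define IO_HDR_SIZE(n) \
	(sizeof(struct io_hdr) + (n) * sizeof(struct dev_slot))

/*
 * Declare an aligned on-stack byte buffer plus a typed pointer into it.
 * WARNING: the buffer is a VLA; n must be small and bounded by the
 * caller, otherwise this can overflow the stack.
 */
#define IO_HDR_ON_STACK(name, n)					\
	char name##_buf[IO_HDR_SIZE(n)]					\
		__attribute__((aligned(__alignof__(struct io_hdr))));	\
	struct io_hdr *name = (struct io_hdr *)name##_buf

/* Fill n slots through the typed pointer and return the sum of devnum. */
static int sum_devnums(size_t n)
{
	assert(n > 0 && n <= 16);	/* arbitrary illustrative bound */

	IO_HDR_ON_STACK(hdr, n);
	int sum = 0;

	memset(hdr, 0, IO_HDR_SIZE(n));
	for (size_t i = 0; i < n; i++)
		hdr->devs[i].devnum = (int)i;
	for (size_t i = 0; i < n; i++)
		sum += hdr->devs[i].devnum;
	return sum;
}

/* Report whether the buffer honours the struct's alignment. */
static int on_stack_is_aligned(size_t n)
{
	assert(n > 0 && n <= 16);

	IO_HDR_ON_STACK(hdr, n);

	return ((uintptr_t)hdr % __alignof__(struct io_hdr)) == 0;
}
```

The stack footprint is the same as for the open-coded version; the macro
only gives the warning comment a single place to live.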


Re: [PATCH 0/7] LLVM: make x86_64 kernel build with clang.

2017-03-17 Thread Dmitry Vyukov
On Fri, Mar 17, 2017 at 1:15 AM, Michael Davidson  wrote:
> This patch set is sufficient to get the x86_64 kernel to build
> and boot correctly with clang-3.8 or greater.
>
> The resulting build still has about 300 warnings, very few of
> which appear to be significant. Most of them should be fixable
> with some minor code refactoring although a few of them, such
> as the complaints about implicit conversions between enumerated
> types may be candidates for just being disabled.

Thanks, Michael!
This will help us a lot with KMSAN (uninit use detector) and code coverage.


> Michael Davidson (7):
>   Makefile, LLVM: add -no-integrated-as to KBUILD_[AC]FLAGS
>   Makefile, x86, LLVM: disable unsupported optimization flags
>   x86, LLVM: suppress clang warnings about unaligned accesses
>   x86, boot, LLVM: #undef memcpy etc in string.c
>   x86, boot, LLVM: Use regparm=0 for memcpy and memset
>   md/raid10, LLVM: get rid of variable length array
>   crypto, x86, LLVM: aesni - fix token pasting
>
>  Makefile|  4 
>  arch/x86/Makefile   |  7 +++
>  arch/x86/boot/copy.S| 15 +--
>  arch/x86/boot/string.c  |  9 +
>  arch/x86/boot/string.h  | 13 +
>  arch/x86/crypto/aes_ctrby8_avx-x86_64.S |  7 ++-
>  drivers/md/raid10.c |  9 -
>  7 files changed, 52 insertions(+), 12 deletions(-)
>
> --
> 2.12.0.367.g23dc2f6d3c-goog
>


Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0

2017-03-17 Thread Herbert Xu
On Thu, Mar 16, 2017 at 11:24:31AM -0700, Kevin Hilman wrote:
> Hi Herbert,
> 
> Herbert Xu  writes:
> 
> > On Wed, Feb 22, 2017 at 07:55:24AM +0100, Heiner Kallweit wrote:
> >> Expose clock CLKID_RNG0 which is needed for the HW random number generator.
> >> 
> >> Signed-off-by: Heiner Kallweit 
> >
> > All patches applied.  Thanks.
> 
> Actually, can you just apply [PATCH 4/4] to your tree?
> 
> The clock and DT patches need to go through their respective trees or
> will otherwise have conflicts with other things going in via those
> trees.

It's too late now.  Please speak up sooner next time.  These
patches were posted a month ago.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH 0/7] crypto: caam - add Queue Interface (QI) support

2017-03-17 Thread Horia Geantă
RFC -> v1:
-rebased on latest cryptodev-2.6 tree
open-code tsk_cpus_allowed() - sync with commit
0c98d344fe5c "sched/core: Remove the tsk_cpus_allowed() wrapper"
-addressed Scott's feedback - removed most of the accessors
added in soc/qman (patch 4) and instead open-coded them in caam/qi


The patchset adds support for CAAM Queue Interface (QI), the additional
interface (besides job ring) available for submitting jobs to the engine
on platforms having DPAA (Datapath Acceleration Architecture).

Patches 1-4 are QMan dependencies.
During RFC stage, we agreed to go with them through the crypto tree:
https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg24105.html

Patch 5 adds a missing double inclusion guard in desc_constr.h.

Patch 6 adds the caam/qi job submission backend.

Patch 7 adds algorithms (ablkcipher and authenc) that run on top
of caam/qi. For now, their priority is set lower than caam/jr.

Thanks,
Horia

Horia Geantă (7):
  soc/qman: export volatile dequeue related structs
  soc/qman: add dedicated channel ID for CAAM
  soc/qman: export non-programmable FQD fields query
  soc/qman: add macros needed by caam/qi driver
  crypto: caam - avoid double inclusion in desc_constr.h
  crypto: caam - add Queue Interface (QI) backend support
  crypto: caam/qi - add ablkcipher and authenc algorithms

 drivers/crypto/caam/Kconfig|   20 +-
 drivers/crypto/caam/Makefile   |5 +
 drivers/crypto/caam/caamalg.c  |9 +-
 drivers/crypto/caam/caamalg_desc.c |   77 +-
 drivers/crypto/caam/caamalg_desc.h |   15 +-
 drivers/crypto/caam/caamalg_qi.c   | 2387 
 drivers/crypto/caam/ctrl.c |   58 +-
 drivers/crypto/caam/desc_constr.h  |5 +
 drivers/crypto/caam/intern.h   |   24 +
 drivers/crypto/caam/qi.c   |  805 
 drivers/crypto/caam/qi.h   |  201 +++
 drivers/crypto/caam/sg_sw_qm.h |  108 ++
 drivers/soc/fsl/qbman/qman.c   |4 +-
 drivers/soc/fsl/qbman/qman_ccsr.c  |6 +-
 drivers/soc/fsl/qbman/qman_priv.h  |   97 --
 include/soc/fsl/qman.h |  109 ++
 16 files changed, 3786 insertions(+), 144 deletions(-)
 create mode 100644 drivers/crypto/caam/caamalg_qi.c
 create mode 100644 drivers/crypto/caam/qi.c
 create mode 100644 drivers/crypto/caam/qi.h
 create mode 100644 drivers/crypto/caam/sg_sw_qm.h

-- 
2.12.0.264.gd6db3f216544



Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Peter Zijlstra
On Thu, Mar 16, 2017 at 05:15:19PM -0700, Michael Davidson wrote:
> Replace a variable length array in a struct by allocating
> the memory for the entire struct in a char array on the stack.
> 
> Signed-off-by: Michael Davidson 
> ---
>  drivers/md/raid10.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 063c43d83b72..158ebdff782c 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev 
> *mddev,
>   /* Use sync reads to get the blocks from somewhere else */
>   int sectors = r10_bio->sectors;
>   struct r10conf *conf = mddev->private;
> - struct {
> - struct r10bio r10_bio;
> - struct r10dev devs[conf->copies];
> - } on_stack;
> - struct r10bio *r10b = &on_stack.r10_bio;
> + char on_stack_r10_bio[sizeof(struct r10bio) +
> +   conf->copies * sizeof(struct r10dev)]
> +   __aligned(__alignof__(struct r10bio));
> + struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio;
>   int slot = 0;
>   int idx = 0;
>   struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;


That's disgusting. Why not fix LLVM to support this?


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Peter Zijlstra
On Fri, Mar 17, 2017 at 01:31:23PM +0100, Alexander Potapenko wrote:
> On Fri, Mar 17, 2017 at 1:08 PM, Peter Zijlstra  wrote:
> > On Thu, Mar 16, 2017 at 05:15:19PM -0700, Michael Davidson wrote:
> >> Replace a variable length array in a struct by allocating
> >> the memory for the entire struct in a char array on the stack.
> >>
> >> Signed-off-by: Michael Davidson 
> >> ---
> >>  drivers/md/raid10.c | 9 -
> >>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> >> index 063c43d83b72..158ebdff782c 100644
> >> --- a/drivers/md/raid10.c
> >> +++ b/drivers/md/raid10.c
> >> @@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev 
> >> *mddev,
> >>   /* Use sync reads to get the blocks from somewhere else */
> >>   int sectors = r10_bio->sectors;
> >>   struct r10conf *conf = mddev->private;
> >> - struct {
> >> - struct r10bio r10_bio;
> >> - struct r10dev devs[conf->copies];
> >> - } on_stack;
> >> - struct r10bio *r10b = &on_stack.r10_bio;
> >> + char on_stack_r10_bio[sizeof(struct r10bio) +
> >> +   conf->copies * sizeof(struct r10dev)]
> >> +   __aligned(__alignof__(struct r10bio));
> >> + struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio;
> >>   int slot = 0;
> >>   int idx = 0;
> >>   struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;
> >
> >
> > That's disgusting. Why not fix LLVM to support this?
> 
> IIUC there's only a handful of VLAIS instances in LLVM code, why not
> just drop them for the sake of better code portability?
> (To quote Linus, "this feature is an abomination":
> https://lkml.org/lkml/2013/9/23/500)

Be that as it may; what you construct above is disgusting. Surely the
code can be refactored to not look like dog vomit?

Also; it's not immediately obvious conf->copies is 'small' and this
doesn't blow up the stack; I feel that deserves a comment somewhere.




Re: [PATCH 5/7] x86, boot, LLVM: Use regparm=0 for memcpy and memset

2017-03-17 Thread Peter Zijlstra
On Thu, Mar 16, 2017 at 05:15:18PM -0700, Michael Davidson wrote:
> Use the standard regparm=0 calling convention for memcpy and
> memset when building with clang.
> 
> This is a work around for a long standing clang bug
> (see https://llvm.org/bugs/show_bug.cgi?id=3997) where
> clang always uses the standard regparm=0 calling convention
> for any implicit calls to memcpy and memset that it generates
> (eg for structure assignments and initialization) even if an
> alternate calling convention such as regparm=3 has been specified.

Seriously, fix LLVM already.


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Alexander Potapenko
On Fri, Mar 17, 2017 at 1:08 PM, Peter Zijlstra  wrote:
> On Thu, Mar 16, 2017 at 05:15:19PM -0700, Michael Davidson wrote:
>> Replace a variable length array in a struct by allocating
>> the memory for the entire struct in a char array on the stack.
>>
>> Signed-off-by: Michael Davidson 
>> ---
>>  drivers/md/raid10.c | 9 -
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 063c43d83b72..158ebdff782c 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev 
>> *mddev,
>>   /* Use sync reads to get the blocks from somewhere else */
>>   int sectors = r10_bio->sectors;
>>   struct r10conf *conf = mddev->private;
>> - struct {
>> - struct r10bio r10_bio;
>> - struct r10dev devs[conf->copies];
>> - } on_stack;
>> - struct r10bio *r10b = &on_stack.r10_bio;
>> + char on_stack_r10_bio[sizeof(struct r10bio) +
>> +   conf->copies * sizeof(struct r10dev)]
>> +   __aligned(__alignof__(struct r10bio));
>> + struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio;
>>   int slot = 0;
>>   int idx = 0;
>>   struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;
>
>
> That's disgusting. Why not fix LLVM to support this?

IIUC there's only a handful of VLAIS instances in LLVM code, why not
just drop them for the sake of better code portability?
(To quote Linus, "this feature is an abomination":
https://lkml.org/lkml/2013/9/23/500)

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg


[PATCH v2] dt-bindings: rng: clocks property on omap_rng not always mandatory

2017-03-17 Thread Thomas Petazzoni
Commit 52060836f79 ("dt-bindings: omap-rng: Document SafeXcel IP-76
device variant") updated the omap_rng Device Tree binding to add support
for the IP-76 variation of the IP. As part of this change, a "clocks"
property was added, but it is listed as "Required" without indicating
that it is actually only required for some compatible strings.

This commit fixes that, by explicitly stating that the clocks property
is only required with the inside-secure,safexcel-eip76 compatible
string.

Fixes: 52060836f79 ("dt-bindings: omap-rng: Document SafeXcel IP-76 device 
variant")
Cc: 
Signed-off-by: Thomas Petazzoni 
---
Changes since v1:
 - Instead of indicating the property as optional, indicate it as
   mandatory for the inside-secure,safexcel-eip76 compatible string, as
   suggested by Rob Herring.
---
 Documentation/devicetree/bindings/rng/omap_rng.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/rng/omap_rng.txt 
b/Documentation/devicetree/bindings/rng/omap_rng.txt
index 4714772..9cf7876 100644
--- a/Documentation/devicetree/bindings/rng/omap_rng.txt
+++ b/Documentation/devicetree/bindings/rng/omap_rng.txt
@@ -12,7 +12,8 @@ Required properties:
 - reg : Offset and length of the register set for the module
 - interrupts : the interrupt number for the RNG module.
Used for "ti,omap4-rng" and "inside-secure,safexcel-eip76"
-- clocks: the trng clock source
+- clocks: the trng clock source. Only mandatory for the
+  "inside-secure,safexcel-eip76" compatible.
 
 Example:
 /* AM335x */
-- 
2.7.4



[PATCH 7/7] crypto: caam/qi - add ablkcipher and authenc algorithms

2017-03-17 Thread Horia Geantă
Add support to submit ablkcipher and authenc algorithms
via the QI backend:
-ablkcipher:
cbc({aes,des,des3_ede})
ctr(aes), rfc3686(ctr(aes))
xts(aes)
-authenc:
authenc(hmac(md5),cbc({aes,des,des3_ede}))
authenc(hmac(sha*),cbc({aes,des,des3_ede}))

caam/qi being a new driver, let's wait some time to settle down without
interfering with existing caam/jr driver.
Accordingly, for now all caam/qi algorithms (caamalg_qi module) are
marked to be of lower priority than caam/jr ones (caamalg module).

Signed-off-by: Vakul Garg 
Signed-off-by: Alex Porosanu 
Signed-off-by: Horia Geantă 
---
 drivers/crypto/caam/Kconfig|   20 +-
 drivers/crypto/caam/Makefile   |1 +
 drivers/crypto/caam/caamalg.c  |9 +-
 drivers/crypto/caam/caamalg_desc.c |   77 +-
 drivers/crypto/caam/caamalg_desc.h |   15 +-
 drivers/crypto/caam/caamalg_qi.c   | 2387 
 drivers/crypto/caam/sg_sw_qm.h |  108 ++
 7 files changed, 2601 insertions(+), 16 deletions(-)
 create mode 100644 drivers/crypto/caam/caamalg_qi.c
 create mode 100644 drivers/crypto/caam/sg_sw_qm.h

diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig
index bc0d3569f8d9..e36aeacd7635 100644
--- a/drivers/crypto/caam/Kconfig
+++ b/drivers/crypto/caam/Kconfig
@@ -87,6 +87,23 @@ config CRYPTO_DEV_FSL_CAAM_CRYPTO_API
  To compile this as a module, choose M here: the module
  will be called caamalg.
 
+config CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI
+   tristate "Queue Interface as Crypto API backend"
+   depends on CRYPTO_DEV_FSL_CAAM_JR && FSL_DPAA && NET
+   default y
+   select CRYPTO_AUTHENC
+   select CRYPTO_BLKCIPHER
+   help
+ Selecting this will use CAAM Queue Interface (QI) for sending
+ & receiving crypto jobs to/from CAAM. This gives better performance
+ than job ring interface when the number of cores are more than the
+ number of job rings assigned to the kernel. The number of portals
+ assigned to the kernel should also be more than the number of
+ job rings.
+
+ To compile this as a module, choose M here: the module
+ will be called caamalg_qi.
+
 config CRYPTO_DEV_FSL_CAAM_AHASH_API
tristate "Register hash algorithm implementations with Crypto API"
depends on CRYPTO_DEV_FSL_CAAM_JR
@@ -136,4 +153,5 @@ config CRYPTO_DEV_FSL_CAAM_DEBUG
  information in the CAAM driver.
 
 config CRYPTO_DEV_FSL_CAAM_CRYPTO_API_DESC
-   def_tristate CRYPTO_DEV_FSL_CAAM_CRYPTO_API
+   def_tristate (CRYPTO_DEV_FSL_CAAM_CRYPTO_API || \
+ CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI)
diff --git a/drivers/crypto/caam/Makefile b/drivers/crypto/caam/Makefile
index 2e60e45c2bf1..9e2e98856b9b 100644
--- a/drivers/crypto/caam/Makefile
+++ b/drivers/crypto/caam/Makefile
@@ -8,6 +8,7 @@ endif
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_JR) += caam_jr.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_CRYPTO_API) += caamalg.o
+obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI) += caamalg_qi.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_CRYPTO_API_DESC) += caamalg_desc.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_AHASH_API) += caamhash.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API) += caamrng.o
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 9bc80eb06934..398807d1b77e 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -266,8 +266,9 @@ static int aead_set_sh_desc(struct crypto_aead *aead)
 
/* aead_encrypt shared descriptor */
desc = ctx->sh_desc_enc;
-   cnstr_shdsc_aead_encap(desc, &ctx->cdata, &ctx->adata, ctx->authsize,
-  is_rfc3686, nonce, ctx1_iv_off);
+   cnstr_shdsc_aead_encap(desc, &ctx->cdata, &ctx->adata, ivsize,
+  ctx->authsize, is_rfc3686, nonce, ctx1_iv_off,
+  false);
dma_sync_single_for_device(jrdev, ctx->sh_desc_enc_dma,
   desc_bytes(desc), DMA_TO_DEVICE);
 
@@ -299,7 +300,7 @@ static int aead_set_sh_desc(struct crypto_aead *aead)
desc = ctx->sh_desc_dec;
cnstr_shdsc_aead_decap(desc, &ctx->cdata, &ctx->adata, ivsize,
   ctx->authsize, alg->caam.geniv, is_rfc3686,
-  nonce, ctx1_iv_off);
+  nonce, ctx1_iv_off, false);
dma_sync_single_for_device(jrdev, ctx->sh_desc_dec_dma,
   desc_bytes(desc), DMA_TO_DEVICE);
 
@@ -333,7 +334,7 @@ static int aead_set_sh_desc(struct crypto_aead *aead)
desc = ctx->sh_desc_enc;
cnstr_shdsc_aead_givencap(desc, &ctx->cdata, &ctx->adata, ivsize,
  ctx->authsize, is_rfc3686, nonce,
- ctx1_iv_off);
+ ctx1_iv_off, false);
dma_sync_single_for_device(jrdev, 

[PATCH 6/7] crypto: caam - add Queue Interface (QI) backend support

2017-03-17 Thread Horia Geantă
CAAM engine supports two interfaces for crypto job submission:
-job ring interface - already existing caam/jr driver
-Queue Interface (QI) - caam/qi driver added in current patch

QI is present in CAAM engines found on DPAA platforms.
QI gets its I/O (frame descriptors) from QMan (Queue Manager) queues.

This patch adds a platform device for accessing CAAM's queue interface.
The requests are submitted to CAAM using one frame queue per
cryptographic context. Each crypto context has one shared descriptor.
This shared descriptor is attached to frame queue associated with
corresponding driver context using context_a.

The driver hides the mechanics of FQ creation, initialisation from its
applications. Each cryptographic context needs to be associated with
driver context which houses the FQ to be used to transport the job to
CAAM. The driver provides API for:
(a) Context creation
(b) Job submission
(c) Context deletion
(d) Congestion indication - whether path to/from CAAM is congested

The driver supports affining its context to a particular CPU.
This means that any responses from CAAM for the context in question
would arrive at the given CPU. This helps in implementing one CPU
per packet round trip in IPsec application.

The driver processes CAAM responses under NAPI contexts.
NAPI contexts are instantiated only on cores with affined portals since
only cores having their own portal can receive responses from DQRR.

The responses from CAAM for all cryptographic contexts ride on a fixed
set of FQs. We use one response FQ per portal owning core. The response
FQ is configured in each core's and thus portal's dedicated channel.
This gives the flexibility to direct CAAM's responses for a crypto
context on a given core.

Signed-off-by: Vakul Garg 
Signed-off-by: Alex Porosanu 
Signed-off-by: Horia Geantă 
---
 drivers/crypto/caam/Makefile |   4 +
 drivers/crypto/caam/ctrl.c   |  58 ++--
 drivers/crypto/caam/intern.h |  24 ++
 drivers/crypto/caam/qi.c | 805 +++
 drivers/crypto/caam/qi.h | 201 +++
 5 files changed, 1064 insertions(+), 28 deletions(-)
 create mode 100644 drivers/crypto/caam/qi.c
 create mode 100644 drivers/crypto/caam/qi.h

diff --git a/drivers/crypto/caam/Makefile b/drivers/crypto/caam/Makefile
index 6554742f357e..2e60e45c2bf1 100644
--- a/drivers/crypto/caam/Makefile
+++ b/drivers/crypto/caam/Makefile
@@ -16,3 +16,7 @@ obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM_PKC_API) += caam_pkc.o
 caam-objs := ctrl.o
 caam_jr-objs := jr.o key_gen.o error.o
 caam_pkc-y := caampkc.o pkc_desc.o
+ifneq ($(CONFIG_CRYPTO_DEV_FSL_CAAM_CRYPTO_API_QI),)
+   ccflags-y += -DCONFIG_CAAM_QI
+   caam-objs += qi.o
+endif
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index fef39f9f41ee..b3a94d5eff26 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -18,6 +18,10 @@
 bool caam_little_end;
 EXPORT_SYMBOL(caam_little_end);
 
+#ifdef CONFIG_CAAM_QI
+#include "qi.h"
+#endif
+
 /*
  * i.MX targets tend to have clock control subsystems that can
  * enable/disable clocking to our device.
@@ -311,6 +315,11 @@ static int caam_remove(struct platform_device *pdev)
for (ring = 0; ring < ctrlpriv->total_jobrs; ring++)
of_device_unregister(ctrlpriv->jrpdev[ring]);
 
+#ifdef CONFIG_CAAM_QI
+   if (ctrlpriv->qidev)
+   caam_qi_shutdown(ctrlpriv->qidev);
+#endif
+
/* De-initialize RNG state handles initialized by this driver. */
if (ctrlpriv->rng4_sh_init)
deinstantiate_rng(ctrldev, ctrlpriv->rng4_sh_init);
@@ -401,23 +410,6 @@ int caam_get_era(void)
 }
 EXPORT_SYMBOL(caam_get_era);
 
-#ifdef CONFIG_DEBUG_FS
-static int caam_debugfs_u64_get(void *data, u64 *val)
-{
-   *val = caam64_to_cpu(*(u64 *)data);
-   return 0;
-}
-
-static int caam_debugfs_u32_get(void *data, u64 *val)
-{
-   *val = caam32_to_cpu(*(u32 *)data);
-   return 0;
-}
-
-DEFINE_SIMPLE_ATTRIBUTE(caam_fops_u32_ro, caam_debugfs_u32_get, NULL, 
"%llu\n");
-DEFINE_SIMPLE_ATTRIBUTE(caam_fops_u64_ro, caam_debugfs_u64_get, NULL, 
"%llu\n");
-#endif
-
 /* Probe routine for CAAM top (controller) level */
 static int caam_probe(struct platform_device *pdev)
 {
@@ -615,6 +607,17 @@ static int caam_probe(struct platform_device *pdev)
goto iounmap_ctrl;
}
 
+#ifdef CONFIG_DEBUG_FS
+   /*
+* FIXME: needs better naming distinction, as some amalgamation of
+* "caam" and nprop->full_name. The OF name isn't distinctive,
+* but does separate instances
+*/
+   perfmon = (struct caam_perfmon __force *)&ctrl->perfmon;
+
+   ctrlpriv->dfs_root = debugfs_create_dir(dev_name(dev), NULL);
+   ctrlpriv->ctl = debugfs_create_dir("ctl", ctrlpriv->dfs_root);
+#endif
ring = 0;
ridx = 0;
ctrlpriv->total_jobrs = 0;
@@ -650,6 +653,13 @@ static int caam_probe(struct 

[PATCH 1/7] soc/qman: export volatile dequeue related structs

2017-03-17 Thread Horia Geantă
Since qman_volatile_dequeue() is already exported, move the related
structures into the public header too.

Signed-off-by: Horia Geantă 
---
 drivers/soc/fsl/qbman/qman_priv.h | 36 
 include/soc/fsl/qman.h| 36 
 2 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/drivers/soc/fsl/qbman/qman_priv.h 
b/drivers/soc/fsl/qbman/qman_priv.h
index 53685b59718e..64781eff6974 100644
--- a/drivers/soc/fsl/qbman/qman_priv.h
+++ b/drivers/soc/fsl/qbman/qman_priv.h
@@ -271,42 +271,6 @@ const struct qm_portal_config 
*qman_destroy_affine_portal(void);
  */
 int qman_query_fq(struct qman_fq *fq, struct qm_fqd *fqd);
 
-/*
- * For qman_volatile_dequeue(); Choose one PRECEDENCE. EXACT is optional. Use
- * NUMFRAMES(n) (6-bit) or NUMFRAMES_TILLEMPTY to fill in the frame-count. Use
- * FQID(n) to fill in the frame queue ID.
- */
-#define QM_VDQCR_PRECEDENCE_VDQCR  0x0
-#define QM_VDQCR_PRECEDENCE_SDQCR  0x80000000
-#define QM_VDQCR_EXACT             0x40000000
-#define QM_VDQCR_NUMFRAMES_MASK    0x3f000000
-#define QM_VDQCR_NUMFRAMES_SET(n)  (((n) & 0x3f) << 24)
-#define QM_VDQCR_NUMFRAMES_GET(n)  (((n) >> 24) & 0x3f)
-#define QM_VDQCR_NUMFRAMES_TILLEMPTY   QM_VDQCR_NUMFRAMES_SET(0)
-
-#define QMAN_VOLATILE_FLAG_WAIT 0x0001 /* wait if VDQCR is in 
use */
-#define QMAN_VOLATILE_FLAG_WAIT_INT  0x0002 /* if wait, interruptible? */
-#define QMAN_VOLATILE_FLAG_FINISH0x0004 /* wait till VDQCR completes */
-
-/*
- * qman_volatile_dequeue - Issue a volatile dequeue command
- * @fq: the frame queue object to dequeue from
- * @flags: a bit-mask of QMAN_VOLATILE_FLAG_*** options
- * @vdqcr: bit mask of QM_VDQCR_*** options, as per qm_dqrr_vdqcr_set()
- *
- * Attempts to lock access to the portal's VDQCR volatile dequeue 
functionality.
- * The function will block and sleep if QMAN_VOLATILE_FLAG_WAIT is specified 
and
- * the VDQCR is already in use, otherwise returns non-zero for failure. If
- * QMAN_VOLATILE_FLAG_FINISH is specified, the function will only return once
- * the VDQCR command has finished executing (ie. once the callback for the last
- * DQRR entry resulting from the VDQCR command has been called). If not using
- * the FINISH flag, completion can be determined either by detecting the
- * presence of the QM_DQRR_STAT_UNSCHEDULED and QM_DQRR_STAT_DQCR_EXPIRED bits
- * in the "stat" parameter passed to the FQ's dequeue callback, or by waiting
- * for the QMAN_FQ_STATE_VDQCR bit to disappear.
- */
-int qman_volatile_dequeue(struct qman_fq *fq, u32 flags, u32 vdqcr);
-
 int qman_alloc_fq_table(u32 num_fqids);
 
 /*   QMan s/w corenet portal, low-level i/face  */
diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h
index 3d4df74a96de..4de1ffcc8982 100644
--- a/include/soc/fsl/qman.h
+++ b/include/soc/fsl/qman.h
@@ -791,6 +791,23 @@ struct qman_cgr {
 #define QMAN_INITFQ_FLAG_SCHED  0x0001 /* schedule rather than park */
 #define QMAN_INITFQ_FLAG_LOCAL  0x0004 /* set dest portal */
 
+/*
+ * For qman_volatile_dequeue(); Choose one PRECEDENCE. EXACT is optional. Use
+ * NUMFRAMES(n) (6-bit) or NUMFRAMES_TILLEMPTY to fill in the frame-count. Use
+ * FQID(n) to fill in the frame queue ID.
+ */
+#define QM_VDQCR_PRECEDENCE_VDQCR  0x0
+#define QM_VDQCR_PRECEDENCE_SDQCR  0x80000000
+#define QM_VDQCR_EXACT             0x40000000
+#define QM_VDQCR_NUMFRAMES_MASK    0x3f000000
+#define QM_VDQCR_NUMFRAMES_SET(n)  (((n) & 0x3f) << 24)
+#define QM_VDQCR_NUMFRAMES_GET(n)  (((n) >> 24) & 0x3f)
+#define QM_VDQCR_NUMFRAMES_TILLEMPTY   QM_VDQCR_NUMFRAMES_SET(0)
+
+#define QMAN_VOLATILE_FLAG_WAIT 0x0001 /* wait if VDQCR is in 
use */
+#define QMAN_VOLATILE_FLAG_WAIT_INT  0x0002 /* if wait, interruptible? */
+#define QMAN_VOLATILE_FLAG_FINISH0x0004 /* wait till VDQCR completes */
+
/* Portal Management */
 /**
  * qman_p_irqsource_add - add processing sources to be interrupt-driven
@@ -963,6 +980,25 @@ int qman_retire_fq(struct qman_fq *fq, u32 *flags);
  */
 int qman_oos_fq(struct qman_fq *fq);
 
+/*
+ * qman_volatile_dequeue - Issue a volatile dequeue command
+ * @fq: the frame queue object to dequeue from
+ * @flags: a bit-mask of QMAN_VOLATILE_FLAG_*** options
+ * @vdqcr: bit mask of QM_VDQCR_*** options, as per qm_dqrr_vdqcr_set()
+ *
+ * Attempts to lock access to the portal's VDQCR volatile dequeue functionality.
+ * The function will block and sleep if QMAN_VOLATILE_FLAG_WAIT is specified and
+ * the VDQCR is already in use, otherwise returns non-zero for failure. If
+ * QMAN_VOLATILE_FLAG_FINISH is specified, the function will only return once
+ * the VDQCR command has finished executing (ie. once the callback for the last
+ * DQRR entry resulting from the VDQCR command has been called). If not using
+ * the FINISH flag, completion can be determined either by detecting 

[PATCH 2/7] soc/qman: add dedicated channel ID for CAAM

2017-03-17 Thread Horia Geantă
Add and export the ID of the channel serviced by the
CAAM (Cryptographic Acceleration and Assurance Module) DCP.
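For illustration only, the revision-dependent selection this patch performs in fsl_qman_probe() can be sketched as a standalone function. The two channel IDs are taken from the patch; the value of QMAN_REV30 is not shown here and is assumed to be 0x0300, and every SKETCH_ name and caam_channel_for_rev() are hypothetical.

```c
#include <stdint.h>

#define SKETCH_QMAN_REV30		0x0300	/* assumption: not shown in this patch */
#define SKETCH_CHANNEL_CAAM		0x80	/* QMAN_CHANNEL_CAAM in the patch */
#define SKETCH_CHANNEL_CAAM_REV3	0x840	/* QMAN_CHANNEL_CAAM_REV3 in the patch */

/*
 * Pick the CAAM channel ID from the QMan IP revision. Only the major
 * revision byte takes part in the comparison, as in fsl_qman_probe().
 */
static uint16_t caam_channel_for_rev(uint16_t ip_rev)
{
	if ((ip_rev & 0xff00) >= SKETCH_QMAN_REV30)
		return SKETCH_CHANNEL_CAAM_REV3;
	return SKETCH_CHANNEL_CAAM;
}
```

A consumer would read the exported qm_channel_caam instead of repeating this check.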

Signed-off-by: Horia Geantă 
---
 drivers/soc/fsl/qbman/qman_ccsr.c | 6 +-
 include/soc/fsl/qman.h            | 3 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/qman_ccsr.c b/drivers/soc/fsl/qbman/qman_ccsr.c
index f4e6e70de259..90bc40c48675 100644
--- a/drivers/soc/fsl/qbman/qman_ccsr.c
+++ b/drivers/soc/fsl/qbman/qman_ccsr.c
@@ -34,6 +34,8 @@ u16 qman_ip_rev;
 EXPORT_SYMBOL(qman_ip_rev);
 u16 qm_channel_pool1 = QMAN_CHANNEL_POOL1;
 EXPORT_SYMBOL(qm_channel_pool1);
+u16 qm_channel_caam = QMAN_CHANNEL_CAAM;
+EXPORT_SYMBOL(qm_channel_caam);
 
 /* Register offsets */
 #define REG_QCSP_LIO_CFG(n)    (0x0000 + ((n) * 0x10))
@@ -720,8 +722,10 @@ static int fsl_qman_probe(struct platform_device *pdev)
return -ENODEV;
}
 
-   if ((qman_ip_rev & 0xff00) >= QMAN_REV30)
+   if ((qman_ip_rev & 0xff00) >= QMAN_REV30) {
qm_channel_pool1 = QMAN_CHANNEL_POOL1_REV3;
+   qm_channel_caam = QMAN_CHANNEL_CAAM_REV3;
+   }
 
ret = zero_priv_mem(dev, node, fqd_a, fqd_sz);
WARN_ON(ret);
diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h
index 4de1ffcc8982..10b549783ec5 100644
--- a/include/soc/fsl/qman.h
+++ b/include/soc/fsl/qman.h
@@ -36,8 +36,11 @@
 /* Hardware constants */
 #define QM_CHANNEL_SWPORTAL0 0
 #define QMAN_CHANNEL_POOL1 0x21
+#define QMAN_CHANNEL_CAAM 0x80
 #define QMAN_CHANNEL_POOL1_REV3 0x401
+#define QMAN_CHANNEL_CAAM_REV3 0x840
 extern u16 qm_channel_pool1;
+extern u16 qm_channel_caam;
 
 /* Portal processing (interrupt) sources */
 #define QM_PIRQ_CSCI   0x0010  /* Congestion State Change */
-- 
2.12.0.264.gd6db3f216544



[PATCH 3/7] soc/qman: export non-programmable FQD fields query

2017-03-17 Thread Horia Geantă
Export qman_query_fq_np() function and related structures.
This will be needed in the caam/qi driver, where "queue empty"
condition will be decided based on the frm_cnt.
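As a rough sketch of the "queue empty" test mentioned above (not the caam/qi code itself): the frame count is a 24-bit field inside a 32-bit word, so it is masked before being compared with zero. struct np_sketch and fq_is_empty() are hypothetical stand-ins; the real structure is struct qm_mcr_queryfq_np, filled in by qman_query_fq_np() and read through qm_mcr_np_get(np, frm_cnt). Byte-order handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, cut-down stand-in for struct qm_mcr_queryfq_np: only the
 * field needed for this sketch is kept.
 */
struct np_sketch {
	uint32_t frm_cnt;	/* 24-bit frame count, upper 8 bits reserved */
};

/* Same value as qm_mcr_frm_cnt_mask in the patch: BIT(24) - 1 */
#define SKETCH_FRM_CNT_MASK	((UINT32_C(1) << 24) - 1)

/* The queue is treated as empty when the masked frame count is zero. */
static bool fq_is_empty(const struct np_sketch *np)
{
	return (np->frm_cnt & SKETCH_FRM_CNT_MASK) == 0;
}
```

Masking first means that stray bits in the reserved byte cannot make an empty queue look non-empty.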

Signed-off-by: Horia Geantă 
---
 drivers/soc/fsl/qbman/qman.c  |  4 +--
 drivers/soc/fsl/qbman/qman_priv.h | 61 ---
 include/soc/fsl/qman.h            | 68 +++
 3 files changed, 70 insertions(+), 63 deletions(-)

diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 6f509f68085e..3d891db57ee6 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -2019,8 +2019,7 @@ int qman_query_fq(struct qman_fq *fq, struct qm_fqd *fqd)
return ret;
 }
 
-static int qman_query_fq_np(struct qman_fq *fq,
-   struct qm_mcr_queryfq_np *np)
+int qman_query_fq_np(struct qman_fq *fq, struct qm_mcr_queryfq_np *np)
 {
union qm_mc_command *mcc;
union qm_mc_result *mcr;
@@ -2046,6 +2045,7 @@ static int qman_query_fq_np(struct qman_fq *fq,
put_affine_portal();
return ret;
 }
+EXPORT_SYMBOL(qman_query_fq_np);
 
 static int qman_query_cgr(struct qman_cgr *cgr,
  struct qm_mcr_querycgr *cgrd)
diff --git a/drivers/soc/fsl/qbman/qman_priv.h b/drivers/soc/fsl/qbman/qman_priv.h
index 64781eff6974..22725bdc6f15 100644
--- a/drivers/soc/fsl/qbman/qman_priv.h
+++ b/drivers/soc/fsl/qbman/qman_priv.h
@@ -89,67 +89,6 @@ static inline u64 qm_mcr_querycgr_a_get64(const struct qm_mcr_querycgr *q)
return ((u64)q->a_bcnt_hi << 32) | be32_to_cpu(q->a_bcnt_lo);
 }
 
-/* "Query FQ Non-Programmable Fields" */
-
-struct qm_mcr_queryfq_np {
-   u8 verb;
-   u8 result;
-   u8 __reserved1;
-   u8 state;   /* QM_MCR_NP_STATE_*** */
-   u32 fqd_link;   /* 24-bit, _res2[24-31] */
-   u16 odp_seq;/* 14-bit, _res3[14-15] */
-   u16 orp_nesn;   /* 14-bit, _res4[14-15] */
-   u16 orp_ea_hseq;/* 15-bit, _res5[15] */
-   u16 orp_ea_tseq;/* 15-bit, _res6[15] */
-   u32 orp_ea_hptr;/* 24-bit, _res7[24-31] */
-   u32 orp_ea_tptr;/* 24-bit, _res8[24-31] */
-   u32 pfdr_hptr;  /* 24-bit, _res9[24-31] */
-   u32 pfdr_tptr;  /* 24-bit, _res10[24-31] */
-   u8 __reserved2[5];
-   u8 is;  /* 1-bit, _res12[1-7] */
-   u16 ics_surp;
-   u32 byte_cnt;
-   u32 frm_cnt;/* 24-bit, _res13[24-31] */
-   u32 __reserved3;
-   u16 ra1_sfdr;   /* QM_MCR_NP_RA1_*** */
-   u16 ra2_sfdr;   /* QM_MCR_NP_RA2_*** */
-   u16 __reserved4;
-   u16 od1_sfdr;   /* QM_MCR_NP_OD1_*** */
-   u16 od2_sfdr;   /* QM_MCR_NP_OD2_*** */
-   u16 od3_sfdr;   /* QM_MCR_NP_OD3_*** */
-} __packed;
-
-#define QM_MCR_NP_STATE_FE 0x10
-#define QM_MCR_NP_STATE_R  0x08
-#define QM_MCR_NP_STATE_MASK   0x07/* Reads FQD::STATE; */
-#define QM_MCR_NP_STATE_OOS0x00
-#define QM_MCR_NP_STATE_RETIRED0x01
-#define QM_MCR_NP_STATE_TEN_SCHED  0x02
-#define QM_MCR_NP_STATE_TRU_SCHED  0x03
-#define QM_MCR_NP_STATE_PARKED 0x04
-#define QM_MCR_NP_STATE_ACTIVE 0x05
-#define QM_MCR_NP_PTR_MASK 0x07ff  /* for RA[12] & OD[123] */
-#define QM_MCR_NP_RA1_NRA(v)   (((v) >> 14) & 0x3) /* FQD::NRA */
-#define QM_MCR_NP_RA2_IT(v)(((v) >> 14) & 0x1) /* FQD::IT */
-#define QM_MCR_NP_OD1_NOD(v)   (((v) >> 14) & 0x3) /* FQD::NOD */
-#define QM_MCR_NP_OD3_NPC(v)   (((v) >> 14) & 0x3) /* FQD::NPC */
-
-enum qm_mcr_queryfq_np_masks {
-   qm_mcr_fqd_link_mask = BIT(24)-1,
-   qm_mcr_odp_seq_mask = BIT(14)-1,
-   qm_mcr_orp_nesn_mask = BIT(14)-1,
-   qm_mcr_orp_ea_hseq_mask = BIT(15)-1,
-   qm_mcr_orp_ea_tseq_mask = BIT(15)-1,
-   qm_mcr_orp_ea_hptr_mask = BIT(24)-1,
-   qm_mcr_orp_ea_tptr_mask = BIT(24)-1,
-   qm_mcr_pfdr_hptr_mask = BIT(24)-1,
-   qm_mcr_pfdr_tptr_mask = BIT(24)-1,
-   qm_mcr_is_mask = BIT(1)-1,
-   qm_mcr_frm_cnt_mask = BIT(24)-1,
-};
-#define qm_mcr_np_get(np, field) \
-   ((np)->field & (qm_mcr_##field##_mask))
-
 /* Congestion Groups */
 
 /*
diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h
index 10b549783ec5..0252c32f7421 100644
--- a/include/soc/fsl/qman.h
+++ b/include/soc/fsl/qman.h
@@ -811,6 +811,67 @@ struct qman_cgr {
 #define QMAN_VOLATILE_FLAG_WAIT_INT  0x0002 /* if wait, interruptible? */
 #define QMAN_VOLATILE_FLAG_FINISH0x0004 /* wait till VDQCR completes */
 
+/* "Query FQ Non-Programmable Fields" */
+struct qm_mcr_queryfq_np {
+   u8 verb;
+   u8 result;
+   u8 __reserved1;
+   u8 state;   /* QM_MCR_NP_STATE_*** */
+   u32 fqd_link;   /* 24-bit, _res2[24-31] */
+   u16 odp_seq;/* 

[PATCH 5/7] crypto: caam - avoid double inclusion in desc_constr.h

2017-03-17 Thread Horia Geantă
Signed-off-by: Horia Geantă 
---
 drivers/crypto/caam/desc_constr.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/crypto/caam/desc_constr.h b/drivers/crypto/caam/desc_constr.h
index b9c8d98ef826..d8e83ca104e0 100644
--- a/drivers/crypto/caam/desc_constr.h
+++ b/drivers/crypto/caam/desc_constr.h
@@ -4,6 +4,9 @@
  * Copyright 2008-2012 Freescale Semiconductor, Inc.
  */
 
+#ifndef DESC_CONSTR_H
+#define DESC_CONSTR_H
+
 #include "desc.h"
 #include "regs.h"
 
@@ -491,3 +494,5 @@ static inline int desc_inline_query(unsigned int sd_base_len,
 
return (rem_bytes >= 0) ? 0 : -1;
 }
+
+#endif /* DESC_CONSTR_H */
-- 
2.12.0.264.gd6db3f216544



[PATCH 4/7] soc/qman: add macros needed by caam/qi driver

2017-03-17 Thread Horia Geantă
A few other things need to be added in soc/qman, such that
caam/qi won't open-code them.

Signed-off-by: Horia Geantă 
---
 include/soc/fsl/qman.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h
index 0252c32f7421..d4dfefdee6c1 100644
--- a/include/soc/fsl/qman.h
+++ b/include/soc/fsl/qman.h
@@ -168,6 +168,7 @@ static inline void qm_fd_set_param(struct qm_fd *fd, enum qm_fd_format fmt,
 #define qm_fd_set_contig_big(fd, len) \
qm_fd_set_param(fd, qm_fd_contig_big, 0, len)
 #define qm_fd_set_sg_big(fd, len) qm_fd_set_param(fd, qm_fd_sg_big, 0, len)
+#define qm_fd_set_compound(fd, len) qm_fd_set_param(fd, qm_fd_compound, 0, len)
 
 static inline void qm_fd_clear_fd(struct qm_fd *fd)
 {
@@ -642,6 +643,7 @@ struct qm_mcc_initcgr {
 #define QM_CGR_WE_MODE 0x0001
 
 #define QMAN_CGR_FLAG_USE_INIT  0x0001
+#define QMAN_CGR_MODE_FRAME  0x0001
 
 /* Portal and Frame Queues */
 /* Represents a managed portal */
-- 
2.12.0.264.gd6db3f216544



Re: [RFC PATCH v2 29/32] kvm: svm: Add support for SEV DEBUG_DECRYPT command

2017-03-17 Thread Paolo Bonzini


On 16/03/2017 19:41, Brijesh Singh wrote:
>>
>> Please do add it, it doesn't seem very different from what you're doing
>> in LAUNCH_UPDATE_DATA.  There's no need for a separate
>> __sev_dbg_decrypt_page function, you can just pin/unpin here and do a
>> per-page loop as in LAUNCH_UPDATE_DATA.
> 
> I can certainly add support to handle crossing the page boundary cases.
> Should we limit the size to prevent the user from passing an arbitrarily
> long length, which would leave us looping inside the kernel? I was
> thinking of limiting it to PAGE_SIZE.

I guess it depends on how it's used.  PAGE_SIZE makes sense since you
only know if a physical address is encrypted when you reach it from a
visit of the page tables.
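To make the per-page loop concrete, here is a minimal userspace sketch, assuming 4 KiB pages. It is not the SEV code: for_each_page_chunk(), chunk_fn and count_chunk() are hypothetical names, and the pin/decrypt/unpin work is reduced to a callback.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical callback: handles one chunk that lies within a single page. */
typedef int (*chunk_fn)(uint64_t addr, size_t len, void *ctx);

/*
 * Walk [addr, addr + len) one page at a time. Requests longer than one
 * page are rejected, so the loop body runs at most twice (the second
 * time only when the range straddles a page boundary).
 */
static int for_each_page_chunk(uint64_t addr, size_t len, chunk_fn fn,
			       void *ctx)
{
	if (len > SKETCH_PAGE_SIZE)
		return -1;

	while (len > 0) {
		size_t offset = (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
		size_t chunk = SKETCH_PAGE_SIZE - offset;
		int ret;

		if (chunk > len)
			chunk = len;

		/* pin, decrypt and unpin of this page would go here */
		ret = fn(addr, chunk, ctx);
		if (ret)
			return ret;

		addr += chunk;
		len -= chunk;
	}
	return 0;
}

/* Example callback: counts how many chunks were visited. */
static int count_chunk(uint64_t addr, size_t len, void *ctx)
{
	(void)addr;
	(void)len;
	++*(int *)ctx;
	return 0;
}
```

With the PAGE_SIZE cap in place, a request starting at offset 0xff0 with length 0x20 is split into two chunks of 0x10 bytes each.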

Paolo


Re: [PATCH] dt-bindings: rng: clocks property on omap_rng is optional

2017-03-17 Thread Thomas Petazzoni
Hello,

On Wed, 15 Mar 2017 12:57:37 -0500, Rob Herring wrote:

> > diff --git a/Documentation/devicetree/bindings/rng/omap_rng.txt b/Documentation/devicetree/bindings/rng/omap_rng.txt
> > index 4714772..20d435da 100644
> > --- a/Documentation/devicetree/bindings/rng/omap_rng.txt
> > +++ b/Documentation/devicetree/bindings/rng/omap_rng.txt
> > @@ -12,6 +12,9 @@ Required properties:
> >  - reg : Offset and length of the register set for the module
> >  - interrupts : the interrupt number for the RNG module.
> > Used for "ti,omap4-rng" and "inside-secure,safexcel-eip76"
> > +
> > +Optional properties:
> > +  
> 
> Wouldn't just "for ? compatible only" be more correct?

I don't know whether, from the HW point of view, the EIP76 will *always*
need a clock. Maybe it depends on the integration in the SoC.

But anyway, let's mark the clocks property as mandatory for
"inside-secure,safexcel-eip76" for the moment, we can always relax this
requirement later on if we realize that some EIP76 have been integrated
in a way that doesn't require a clock.

It is worth mentioning that the actual driver implementation simply
makes the clock optional in all cases, without looking at the
compatible to figure out if the clock must be there or not. But that's
just the current driver implementation. The Device Tree binding
specification can be more specific than what the current driver does.
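To spell out the difference between the two levels, here is a small standalone sketch. It is a model of the rule only, not the omap-rng driver code, and all three function names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Binding-level rule proposed above: clocks mandatory for this compatible. */
static bool binding_requires_clock(const char *compatible)
{
	return strcmp(compatible, "inside-secure,safexcel-eip76") == 0;
}

/* A node satisfies the binding if it has a clock or does not need one. */
static bool binding_accepts(const char *compatible, bool has_clock)
{
	return has_clock || !binding_requires_clock(compatible);
}

/*
 * Driver-level behaviour as described above: a missing clock is tolerated
 * regardless of the compatible string.
 */
static bool driver_accepts(const char *compatible, bool has_clock)
{
	(void)compatible;
	(void)has_clock;
	return true;
}
```

An EIP76 node without a clock would thus be rejected by the binding model while still being accepted by the driver model.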

Therefore: v2 coming.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com