Re: ARM FDPIC_FUNCPTRS personality flag handling looks broken

2021-03-25 Thread Nicolas Pitre
On Thu, 25 Mar 2021, Jann Horn wrote:

> Hi!
> 
> Tavis noticed that on ARM kernels with CONFIG_BINFMT_ELF_FDPIC, it
> looks like the FDPIC_FUNCPTRS personality flag is not reset on
> execve(). This would mean that if a process first executes an ELF
> FDPIC binary (which forces the personality to PER_LINUX_FDPIC), and
> then executes a non-FDPIC binary, the signal handling code
> (setup_return()) will have bogus behavior (interpreting a normal
> function pointer as an FDPIC function handle).
> 
> I think FDPIC_FUNCPTRS should probably either be reset on every
> execve() or not be a personality flag at all (since AFAIU pretty much
> the whole point of personality flags is that they control behavior
> even across execve()).

I think you're right. This is probably true for SH as well.
I'd recommend the former solution as being the least intrusive one.
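
For illustration only, that first option could amount to something like the
sketch below (not a tested patch; FDPIC_FUNCPTRS comes from
include/uapi/linux/personality.h, and where exactly this belongs in the ELF
loading path is the real question):

#include <linux/personality.h>
#include <linux/sched.h>

/*
 * Sketch: when exec'ing a regular (non-FDPIC) ELF binary, drop the
 * FDPIC_FUNCPTRS bit possibly inherited from a previous FDPIC execve(),
 * so setup_return() stops treating sa_handler as a function descriptor.
 */
static inline void reset_fdpic_funcptrs(void)
{
        current->personality &= ~FDPIC_FUNCPTRS;
}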


Nicolas


Re: [PATCH] kbuild: collect minimum tool versions into scripts/min-tool-version.sh

2021-03-11 Thread Nicolas Pitre
On Thu, 11 Mar 2021, Miguel Ojeda wrote:

> On Thu, Mar 11, 2021 at 10:47 AM Masahiro Yamada  wrote:
> >
> > +# When you raise the minimum version, please update
> > +# Documentation/process/changes.rst as well.
> > +min_gcc_version=4.9.0
> > +min_llvm_version=10.0.1
> > +min_icc_version=16.0.3 # temporary
> > +min_binutils_version=2.23.0
> 
> +1 to creating a central place for all minimum versions.
> 
> Acked-by: Miguel Ojeda 
> 
> I wonder if you considered creating a folder with files like
> `scripts/min_versions/gcc` containing the version string. That would
> make it easier for reading from other languages or even importing them
> dynamically into the documentation, thus removing even more
> duplication.

Alternatively, the documentation could be the actual reference and the 
script would parse the documentation to get those values out.


Nicolas


Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

2021-03-10 Thread Nicolas Pitre
On Wed, 10 Mar 2021, Nick Desaulniers wrote:

> On Wed, Mar 10, 2021 at 1:08 PM Arnd Bergmann  wrote:
> >
> > On Wed, Mar 10, 2021 at 9:50 PM Masahiro Yamada  
> > wrote:
> > > On Mon, Mar 1, 2021 at 10:11 AM Nicholas Piggin  wrote:
> > > > Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm:
> >
> > >
> > > masahiro@oscar:~/ref/linux$ echo  'void this_func_is_unused(void) {}'
> > > >>  kernel/cpu.c
> > > masahiro@oscar:~/ref/linux$ export
> > > CROSS_COMPILE=/home/masahiro/tools/powerpc-10.1.0/bin/powerpc-linux-
> > > masahiro@oscar:~/ref/linux$ make ARCH=powerpc  defconfig
> > > masahiro@oscar:~/ref/linux$ ./scripts/config  -e EXPERT
> > > masahiro@oscar:~/ref/linux$ ./scripts/config  -e 
> > > LD_DEAD_CODE_DATA_ELIMINATION
> > > masahiro@oscar:~/ref/linux$
> > > ~/tools/powerpc-10.1.0/bin/powerpc-linux-nm -n  vmlinux | grep
> > > this_func
> > > c0170560 T .this_func_is_unused
> > > c1d8d560 D this_func_is_unused
> > > masahiro@oscar:~/ref/linux$ grep DEAD_CODE_ .config
> > > CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y
> > > CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
> > >
> > >
> > > If I remember correctly,
> > > LD_DEAD_CODE_DATA_ELIMINATION dropped unused functions
> > > when I tried it last time.
> > >
> > >
> > > I also tried arm64 with a HAVE_LD_DEAD_CODE_DATA_ELIMINATION hack.
> > > The result was the same.
> > >
> > >
> > >
> > > Am I missing something?
> >
> > It's possible that it only works in combination with CLANG_LTO now
> > because something broke. I definitely saw a reduction in kernel
> > size when both options are enabled, but did not try a simple test
> > case like you did.
> >
> > Maybe some other reference gets created that prevents the function
> > from being garbage-collected unless that other option is removed
> > as well?
> 
> I wish the linker had a debug flag that could let developers discover
> the decisions it made during --gc-sections as to why certain symbols
> were retained/kept or not.

The GNU LD has --print-gc-sections to list those sections that were 
dropped. And normally you should be able to find why a section wasn't 
dropped by looking for dependencies in the linker map file.


Nicolas


Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

2021-03-10 Thread Nicolas Pitre
On Wed, 10 Mar 2021, Sedat Dilek wrote:

> The best results on size-reduction of vmlinux I got with Clang-CFI on x86-64.
> 
> Clang-LTO and Clang-CFI:
> I was able to build with CONFIG_TRIM_UNUSED_KSYMS=y which needs to add
> a whitelist file or add a whitelist to scripts/gen_autoksyms.sh.
> And boot on bare metal.
> Furthermore, I was able to compile
> CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y with and without
> CONFIG_TRIM_UNUSED_KSYMS=y.
> Every kernel I had CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y does not boot.
> Yes, there is a size reduction with both enabled but not that good as
> with Clang-CFI.
> All testings with several iterations of LLVM/Clang v13-git.
> With CONFIG_TRIM_UNUSED_KSYMS=y I see 3 loops of building .version
> and following steps - got no answer if this is intended.

Yes it is intended. I explained it here:

https://lkml.org/lkml/2021/3/9/1099

With CONFIG_TRIM_UNUSED_KSYMS some EXPORT_SYMBOL() instances are removed,
which allows for optimizing away the corresponding code, which in turn opens
the possibility for more EXPORT_SYMBOL() instances to be removed, etc. The
process eventually converges to a stable build. Normally only 2 passes are
needed to converge, but LTO opens the possibility of extra passes.
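
If it helps to visualize what gets flipped between passes, here is a
stripped-down, standalone rendition of the preprocessor trick (the real
macros live in include/linux/kconfig.h, include/linux/export.h and the
generated autoksyms.h; names and details are simplified here):

#include <stdio.h>

/* pretend this came from include/generated/autoksyms.h:
 * some module needs "foo", nothing needs "bar" */
#define __KSYM_foo 1

/* IS_ENABLED()-style detection: 1 iff __KSYM_<sym> is defined as 1 */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define __is_defined(x) ___is_defined(x)
#define __ksym_needed(sym) __is_defined(__KSYM_##sym)

int main(void)
{
        /* in the kernel, EXPORT_SYMBOL(sym) expands to a real export only
         * when the corresponding __KSYM_<sym> is defined; here we print it */
        printf("export foo? %d\n", __ksym_needed(foo));  /* 1 -> kept    */
        printf("export bar? %d\n", __ksym_needed(bar));  /* 0 -> trimmed */
        return 0;
}

Each pass regenerates autoksyms.h from what the modules actually reference,
which is why the build iterates until that set stops changing.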

> Means longer build-time.

Oh, absolutely.  LTO (at least when I played with it) is slow. Add the 
multi-pass from CONFIG_TRIM_UNUSED_KSYMS on top of that and your kernel 
build becomes agonizingly slow. This is not something you want when 
doing kernel development.

> I did not follow this anymore as both Kconfigs with Clang-LTO consume
> more build-time and the resulting vmlinux is some MiB bigger than with
> Clang-CFI.

That's rather strange. At least with gcc LTO I always obtained smaller 
kernels.


Nicolas


Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

2021-03-10 Thread Nicolas Pitre
On Mon, 1 Mar 2021, Nicholas Piggin wrote:

> Excerpts from Arnd Bergmann's message of February 27, 2021 7:49 pm:
> > Unlike what Nick expected in his submission, I now think the annotations
> > will be needed for LTO just like they are for --gc-sections.
> 
> Yeah I wasn't sure exactly what LTO looks like or how it would work.
> I thought perhaps LTO might be able to find dead code with circular / 
> back references, we could put references from the code back to these 
> tables or something so they would be kept without KEEP. I don't know, I 
> was handwaving!
> 
> I managed to get powerpc (and IIRC x86?) working with gc sections with
> those KEEP annotations, but effectiveness of course is far worse than 
> what Nicolas was able to achieve with all his techniques and tricks.
> 
> But yes unless there is some other mechanism to handle these tables, 
> then KEEP probably has to stay. I suggest this wants a very explicit and 
> systematic way to handle it (maybe with some toolchain support) rather 
> than trying to just remove things case by case and see what breaks.
> 
> I don't know if Nicolas has still been working on his shrinking patches
> recently but he probably knows more than anyone about this stuff.

Looks like not much has changed since last time I played with this stuff.

There is a way to omit the KEEP() on tables, but something must create a 
dependency from the code being pointed to by those tables to the table 
entries themselves. I did write my findings in the following article 
(just skip over the introductory blurb): 

https://lwn.net/Articles/741494/

Once that dependency is there, then the KEEP() may go and 
garbage-collecting a function will also garbage-collect the table entry 
automatically.
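
To make the idea concrete, here is one crude, purely illustrative way such a
dependency can be expressed (section and macro names are made up, this is not
necessarily how the article does it, and -ffunction-sections/-fdata-sections
plus --gc-sections are assumed):

struct my_table_entry {
        int (*fn)(void);
};

/* each entry lands in its own input section, named after the function */
#define REGISTER_FN(f)                                          \
        static const struct my_table_entry __entry_##f         \
        __attribute__((section(".my_table." #f), used)) = { .fn = f }

int my_func(void);
REGISTER_FN(my_func);

int my_func(void)
{
        /*
         * Artificial dependency from my_func's code to its own table entry:
         * forcing the entry's address into a register makes the compiler
         * emit a relocation from .text.my_func to .my_table.my_func. The
         * linker script can then collect *(.my_table.*) without KEEP(), and
         * --gc-sections keeps or drops the function and its entry together.
         */
        const void *tie = &__entry_my_func;
        asm volatile("" : "+r" (tie));
        return 42;
}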

OTOH this trickery is not needed with LTO as garbage collection happens 
at the source code optimization level. The KEEP() may remain in that 
case as unneeded table entries will simply not be created in the first 
place.


Nicolas


Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

2021-03-09 Thread Nicolas Pitre
On Tue, 9 Mar 2021, Rasmus Villemoes wrote:

> On 09/03/2021 20.54, Nicolas Pitre wrote:
> > On Wed, 10 Mar 2021, Masahiro Yamada wrote:
> > 
> 
> >>> I'm not sure I do understand every detail here, especially since it is
> >>> so far away from the version that I originally contributed. But the
> >>> concept looks good.
> >>>
> >>> I still think that there is no way around a recursive approach to get
> >>> the maximum effect with LTO, but given that true LTO still isn't applied
> >>> to mainline after all those years, the recursive approach brings
> >>> nothing. Maybe that could be revisited if true LTO ever makes it into
> >>> mainline, and the desire to reduce the binary size is still relevant
> >>> enough to justify it.
> >>
> >> Hmm, I am confused.
> >>
> >> Does this patch change the behavior in the
> >> combination with the "true LTO"?
> >>
> >> Please let me borrow this sentence from your article:
> >> "But what LTO does is more like getting rid of branches that simply
> >> float in the air without being connected to anything or which have
> >> become loose due to optimization."
> >> (https://lwn.net/Articles/746780/)
> >>
> >> This patch throws unneeded EXPORT_SYMBOL metadata
> >> into the /DISCARD/ section of the linker script.
> >>
> >> The approach is different (preprocessor vs linker), but
> >> we will still get the same result; the unneeded
> >> EXPORT_SYMBOLs are disconnected from the main trunk.
> >>
> >> Then, the true LTO will remove branches floating in the air,
> >> right?
> >>
> >> So, what will be lost by this patch?
> > 
> > Let's say you have this in module_foo:
> > 
> > int foo(int x)
> > {
> > return 2 + bar(x);
> > }
> > EXPORT_SYMBOL(foo);
> > 
> > And module_bar:
> > 
> > int bar(int y)
> > {
> > return 3 * baz(y);
> > }
> > EXPORT_SYMBOL(bar);
> > 
> > And this in the main kernel image:
> > 
> > int baz(int z)
> > {
> > return plonk(z);
> > }
> > EXPORT_SYMBOL(baz);
> > 
> > Now we build the kernel and modules. Then we realize that nothing 
> > references symbol "foo". We can trim the "foo" export. But it would be 
> > necessary to recompile module_foo with LTO (especially if there is 
> > some other code in that module) to realize that nothing 
> > references foo() any longer and optimize away the reference to bar(). 
> 
> But, does LTO even do that to modules? Sure, the export metadata for foo
> vanishes, so there's no function pointer reference to foo, but given
> that modules are just -r links, the compiler/linker can't really assume
> that the generated object won't later be linked with something that does
> require foo? At least for the simpler case of --gc-sections, ld docs say:
> 
> '--gc-sections'
> ...
> 
> This option can be set when doing a partial link (enabled with
>  option '-r').  In this case the root of symbols kept must be
>  explicitly specified either by one of the options '--entry',
>  '--undefined', or '--gc-keep-exported' or by a 'ENTRY' command in
>  the linker script.
> 
> and I would assume that for LTO, --gc-keep-exported would be the only
> sane semantics (keep any external symbol with default visibility).
> 
> Can you point me at a tree/set of LTO patches and a toolchain where the
> previous implementation would actually eventually eliminate bar() from
> module_bar?

All that I readily have is a link to the article I wrote with the 
results I obtained at the time: https://lwn.net/Articles/746780/.
The toolchain and kernel tree are rather old at this point and some 
effort would be required to modernize everything.

I don't remember if there was something special to do LTO on modules. 
Maybe Andi Kleen had something in his patchset for that: 
https://github.com/andikleen/linux-misc/blob/lto-415-2/Documentation/lto-build
He mentions that LTO isn't very effective with modules enabled, but what 
I demonstrated in my article is that LTO becomes very effective with or 
without modules as long as CONFIG_TRIM_UNUSED_KSYMS is enabled.

Having CONFIG_TRIM_UNUSED_KSYMS in one pass is likely to still be pretty 
effective even if possibly not optimal. And maybe people don't really 
care for the missing 10% anyway (I'm just throwing a number in the air 
here).


Nicolas


Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

2021-03-09 Thread Nicolas Pitre
On Wed, 10 Mar 2021, Masahiro Yamada wrote:

> On Wed, Mar 10, 2021 at 2:36 AM Nicolas Pitre  wrote:
> >
> > On Wed, 10 Mar 2021, Masahiro Yamada wrote:
> >
> > > Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
> > > guarding") re-enabled this feature, but Linus is still unhappy about
> > > the build time.
> > >
> > > The reason of the slowness is the recursion - this basically works in
> > > two loops.
> > >
> > > In the first loop, Kbuild builds the entire tree based on the temporary
> > > autoksyms.h, which contains macro defines to control whether their
> > > corresponding EXPORT_SYMBOL() is enabled or not, and also gathers all
> > > symbols required by modules. After the tree traverse, Kbuild updates
> > > autoksyms.h and triggers the second loop to rebuild source files whose
> > > EXPORT_SYMBOL() needs flipping.
> > >
> > > This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
> > > one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
> > > trimmed by the linker instead of the preprocessor.
> > >
> > > After the tree traverse, a linker script snippet 
> > > is generated. It feeds the list of necessary sections to vmlinux.lds.S
> > > and modules.lds.S. The other sections fall into /DISCARD/.
> > >
> > > Signed-off-by: Masahiro Yamada 
> >
> > I'm not sure I do understand every detail here, especially since it is
> > so far away from the version that I originally contributed. But the
> > concept looks good.
> >
> > I still think that there is no way around a recursive approach to get
> > the maximum effect with LTO, but given that true LTO still isn't applied
> > to mainline after all those years, the recursive approach brings
> > nothing. Maybe that could be revisited if true LTO ever makes it into
> > mainline, and the desire to reduce the binary size is still relevant
> > enough to justify it.
> 
> Hmm, I am confused.
> 
> Does this patch change the behavior in the
> combination with the "true LTO"?
> 
> Please let me borrow this sentence from your article:
> "But what LTO does is more like getting rid of branches that simply
> float in the air without being connected to anything or which have
> become loose due to optimization."
> (https://lwn.net/Articles/746780/)
> 
> This patch throws unneeded EXPORT_SYMBOL metadata
> into the /DISCARD/ section of the linker script.
> 
> The approach is different (preprocessor vs linker), but
> we will still get the same result; the unneeded
> EXPORT_SYMBOLs are disconnected from the main trunk.
> 
> Then, the true LTO will remove branches floating in the air,
> right?
> 
> So, what will be lost by this patch?

Let's say you have this in module_foo:

int foo(int x)
{
return 2 + bar(x);
}
EXPORT_SYMBOL(foo);

And module_bar:

int bar(int y)
{
return 3 * baz(y);
}
EXPORT_SYMBOL(bar);

And this in the main kernel image:

int baz(int z)
{
return plonk(z);
}
EXPORT_SYMBOL(baz);

Now we build the kernel and modules. Then we realize that nothing 
references symbol "foo". We can trim the "foo" export. But it would be 
necessary to recompile module_foo with LTO (especially if there is 
some other code in that module) to realize that nothing 
references foo() any longer and optimize away the reference to bar(). 
With another round, we now realize that the "bar" export is no longer 
necessary. But that will require another compile round to optimize away 
the reference to baz(). And then a final compilation round with 
LTO to possibly optimize plonk() out of the kernel.

I don't see how you can propagate all this chain reaction with only one 
pass.


Nicolas


Re: [PATCH v2 3/4] kbuild: re-implement CONFIG_TRIM_UNUSED_KSYMS to make it work in one-pass

2021-03-09 Thread Nicolas Pitre
On Wed, 10 Mar 2021, Masahiro Yamada wrote:

> Commit a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some
> guarding") re-enabled this feature, but Linus is still unhappy about
> the build time.
> 
> The reason of the slowness is the recursion - this basically works in
> two loops.
> 
> In the first loop, Kbuild builds the entire tree based on the temporary
> autoksyms.h, which contains macro defines to control whether their
> corresponding EXPORT_SYMBOL() is enabled or not, and also gathers all
> symbols required by modules. After the tree traverse, Kbuild updates
> autoksyms.h and triggers the second loop to rebuild source files whose
> EXPORT_SYMBOL() needs flipping.
> 
> This commit re-implements CONFIG_TRIM_UNUSED_KSYMS to make it work in
> one pass. In the new design, unneeded EXPORT_SYMBOL() instances are
> trimmed by the linker instead of the preprocessor.
> 
> After the tree traverse, a linker script snippet 
> is generated. It feeds the list of necessary sections to vmlinux.lds.S
> and modules.lds.S. The other sections fall into /DISCARD/.
> 
> Signed-off-by: Masahiro Yamada 

I'm not sure I do understand every detail here, especially since it is 
so far away from the version that I originally contributed. But the 
concept looks good.
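
For readers following along: the reason the linker can trim per symbol at all
is that every export already lives in its own named section. A grossly
simplified sketch of what EXPORT_SYMBOL(foo) boils down to (section and field
names loosely follow include/linux/export.h; the real definitions differ in
detail):

struct kernel_symbol {
        const void *value;
        const char *name;
};

int foo(int x) { return x + 1; }

static const char __kstrtab_foo[] = "foo";
static const struct kernel_symbol __ksymtab_foo
        __attribute__((section("___ksymtab+foo"), used))
        = { .value = (const void *)foo, .name = __kstrtab_foo };

A generated linker script snippet can then keep the per-symbol ksymtab input
sections for symbols some module actually needs and let everything else fall
into /DISCARD/, so the unneeded entries never reach the final image.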

I still think that there is no way around a recursive approach to get 
the maximum effect with LTO, but given that true LTO still isn't applied 
to mainline after all those years, the recursive approach brings 
nothing. Maybe that could be revisited if true LTO ever makes it into 
mainline, and the desire to reduce the binary size is still relevant 
enough to justify it.

Acked-by: Nicolas Pitre 


Nicolas


Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

2021-03-09 Thread Nicolas Pitre
On Tue, 9 Mar 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 4:24 AM Nicolas Pitre  wrote:
> >
> > If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase.
> > That comes with the feature.
> 
> This patch set intends to change this.
> TRIM_UNUSED_KSYMS will build without additional cost,
> like LD_DEAD_CODE_DATA_ELIMINATION.

OK... I do see how you're going about it.

> > > Modules are relocatable ELF.
> > > Clang LTO cannot eliminate any code.
> > > GCC LTO does not work with relocatable ELF
> > > in the first place.
> >
> > I don't think I follow you here. What relocatable ELF has to do with LTO?
> 
> What is important is,
> GCC LTO is the feature of gcc, not binutils.
> That is, LD_FINAL is $(CC).

Exactly.

> GCC LTO can be implemented for the final link stage
> by using $(CC) as the linker driver.
> Then, it can determine which code is unreachable.
> In other words, GCC LTO works only when building
> the final executable.

Yes. And it does so by filling .o files with its intermediate code 
representation and not ELF code.

> On the other hand, a relocatable ELF is created
> by $(LD) -r by combining some objects together.
> The relocatable ELF can be fed to another $(LD) -r,
> or the final link stage.

You still can create relocatable ELF using LTO. But LTO stops there. 
From that point on, .o files will no longer contain data that LTO can 
use if you further combine those object files together. But until that 
point, LTO is still usable.

> As I said above, modules are created by $(LD) -r.
> It is not possible to implement GCC LTO for modules.

If I remember correctly (that was a while ago) the problem with LTO and 
the kernel had to do with the fact that every subdirectory was gathering 
object files in built-in.o using ld -r. At some point we switched to 
gathering object files into built-in.a files where no linking is taking 
place. The real linking happens in vmlinux.o where LTO may now do its 
magic.

The same is true for modules. Compiling foo_module.c into foo_module.o 
will create a .o file with LTO data rather than executable code. But 
when you create the final .o for the module then LTO takes place and 
produces the relocatable ELF object.

> > I've successfully used gcc LTO on the kernel quite a while ago.
> >
> > For a reference about binary size reduction with LTO and
> > CONFIG_TRIM_UNUSED_KSYMS please read this article:
> >
> > https://lwn.net/Articles/746780/
> 
> Thanks for the great articles.
> 
> Just for curiosity, I think you used GCC LTO from
> Andy's GitHub.

Right. I provided the reference in the preceding article:
https://lwn.net/Articles/744507/ 

> In the article, you took stm32_defconfig as an example,
> but ARM does not select ARCH_SUPPORTS_LTO.
> 
> Did you add some local hacks to make LTO work
> for ARM?

Of course. This article was written in 2017 and no LTO support at all 
was in mainline back then. But, besides adding CONFIG_LTO, very little 
was needed to make it compile, and I did upstream most changes such as 
commit 75fea300d7, commit a85b2257a5, commit 5d48417592, commit 
19c233b79d, etc.

> I tried the lto-5.8.1 branch, but
> I did not even succeed in building x86 + LTO.

My latest working LTO branch (i.e. last time I worked on it) is much 
older than that.

Maybe people aren't very excited about LTO because it makes the time to 
recompile the kernel many times longer: gcc redoes its optimization 
passes on the whole kernel even if you modify a single file.


Nicolas


Re: [PATCH 2/4] kbuild: collect minimum tool versions into scripts/tool-version.sh

2021-03-04 Thread Nicolas Pitre
On Thu, 4 Mar 2021, Masahiro Yamada wrote:

> The kernel build uses various tools, many of which are provided by the
> same software suite, for example, LLVM and Binutils.
> 
> When we raise the minimal version of Clang/LLVM, we need to update
> clang_min_version in scripts/cc-version.sh and also lld_min_version in
> scripts/ld-version.sh.
> 
> In fact, Kbuild can handle CC=clang and LD=ld.lld independently, and we
> could manage their minimal version separately, but it does not make
> much sense.
> 
> Make scripts/tool-version.sh a central place of minimum tool versions
> so that we do not need to touch multiple files.

It would be better and self-explanatory if a script that provides the 
minimum version of a tool were actually called ... min_tool-version.sh or 
the like. Otherwise one might think it would give, e.g., the current 
version of the installed tools.


Nicolas


Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

2021-02-25 Thread Nicolas Pitre
On Fri, 26 Feb 2021, Masahiro Yamada wrote:

> On Fri, Feb 26, 2021 at 2:20 AM Nicolas Pitre  wrote:
> >
> > On Fri, 26 Feb 2021, Masahiro Yamada wrote:
> >
> > >
> > > Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> > > about the build speed.
> > >
> > > I re-implemented this feature, and the build time cost is now at an
> > > almost unnoticeable level.
> > >
> > > I hope this makes Linus happy.
> >
> > :-)
> >
> > I'm surprised to see that Linus is using this feature. When disabled
> > (the default) this should have had no impact on the build time.
> 
> Linus is not using this feature, but does build tests.
> After pulling the module subsystem pull request in this merge window,
> CONFIG_TRIM_UNUSED_KSYMS was enabled by allmodconfig.

If CONFIG_TRIM_UNUSED_KSYMS is enabled then build time will increase. 
That comes with the feature.

> > This feature provides a nice security advantage by significantly
> > reducing the kernel input surface. And people also use it to better
> > control what third-party vendors can and cannot do with a distro kernel,
> > etc. But that's not the reason why I implemented this feature in the
> > first place.
> >
> > My primary goal was to efficiently reduce the kernel binary size using
> > LTO even with kernel modules enabled.
> 
> 
> Clang LTO landed in this MW.
> 
> Do you think it will reduce the kernel binary size?
> No, opposite.

LTO ought to reduce binary size. It is rather broken otherwise.
Having a global view before optimizing allows the compiler to do 
project-wide constant propagation and dead code elimination.
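
As a toy illustration of that global view (two translation units shown in one
listing, file names made up):

/* --- mode.c --- */
int get_mode(void) { return 0; }

/* --- user.c --- */
extern int get_mode(void);

static int slow_path(int x) { return x * 1000; }

int compute(int x)
{
        /* built with -flto, get_mode() is folded to 0 at link time, the
         * branch is dead and slow_path() disappears from the binary;
         * without LTO each .o is optimized in isolation and it stays */
        if (get_mode())
                return slow_path(x);
        return x;
}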

> CONFIG_LTO_CLANG cannot trim any code even if it
> is obviously unused.
> Hence, it never reduces the kernel binary size.
> Rather, it produces a bigger kernel.

Then what's the point?

> The reason is Clang LTO was implemented against
> relocatable ELF (vmlinux.o) .

That's not true LTO then.

> I pointed out this flaw in the review process, but
> it was dismissed.
> 
> This is the main reason why I did not give any Ack
> (but it was merged via Kees Cook's tree).

> So, the help text of this option should be revised:
> 
>   This option allows for unused exported symbols to be dropped from
>   the build. In turn, this provides the compiler more opportunities
>   (especially when using LTO) for optimizing the code and reducing
>   binary size.  This might have some security advantages as well.
> 
> Clang LTO is opposite to your expectation.

Then Clang LTO is a misnomer. That is the option to revise, not this one.

> > Each EXPORT_SYMBOL() created a
> > symbol dependency that prevented LTO from optimizing out the related
> > code even though only a tiny fraction of those exported symbols were needed.
> >
> > The idea behind the recursion was to catch those cases where disabling
> > an exported symbol within a module would optimize out references to more
> > exported symbols that, in turn, could be disabled and possibly trigger
> > yet more code elimination. There is no way that can be achieved without
> > extra compiler passes in a recursive manner.
> 
> I do not understand.
> 
> Modules are relocatable ELF.
> Clang LTO cannot eliminate any code.
> GCC LTO does not work with relocatable ELF
> in the first place.

I don't think I follow you here. What relocatable ELF has to do with LTO?

I've successfully used gcc LTO on the kernel quite a while ago.

For a reference about binary size reduction with LTO and 
CONFIG_TRIM_UNUSED_KSYMS please read this article:

https://lwn.net/Articles/746780/


Nicolas


Re: [PATCH 0/4] kbuild: build speed improvment of CONFIG_TRIM_UNUSED_KSYMS

2021-02-25 Thread Nicolas Pitre
On Fri, 26 Feb 2021, Masahiro Yamada wrote:

> 
> Now CONFIG_TRIM_UNUSED_KSYMS is revived, but Linus is still unhappy
> about the build speed.
> 
> I re-implemented this feature, and the build time cost is now at an
> almost unnoticeable level.
> 
> I hope this makes Linus happy.

:-)

I'm surprised to see that Linus is using this feature. When disabled 
(the default) this should have had no impact on the build time.

This feature provides a nice security advantage by significantly 
reducing the kernel input surface. And people also use it to better 
control what third-party vendors can and cannot do with a distro kernel, 
etc. But that's not the reason why I implemented this feature in the 
first place.

My primary goal was to efficiently reduce the kernel binary size using 
LTO even with kernel modules enabled. Each EXPORT_SYMBOL() created a 
symbol dependency that prevented LTO from optimizing out the related 
code even though only a tiny fraction of those exported symbols were needed.
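
In picture form (grossly simplified, names made up): an export is basically
an always-kept table entry holding a pointer to the function, and that single
reference is enough for the optimizer to consider the function reachable.

typedef int (*export_ref_t)(void);

int rarely_used(void) { return 0; }

/* the export table entry references rarely_used(), so neither LTO nor
 * --gc-sections may drop the function, even if no caller exists */
static const export_ref_t __export_rarely_used
        __attribute__((used, section("__export_table"))) = rarely_used;

Trimming the export removes that reference, and only then can the related
code become a candidate for elimination.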

The idea behind the recursion was to catch those cases where disabling 
an exported symbol within a module would optimize out references to more 
exported symbols that, in turn, could be disabled and possibly trigger 
yet more code elimination. There is no way that can be achieved without 
extra compiler passes in a recursive manner.


Nicolas


Re: [PATCH v3] arm: OABI compat: fix build when EPOLL is not enabled

2021-02-20 Thread Nicolas Pitre
On Sat, 20 Feb 2021, Randy Dunlap wrote:

> When CONFIG_EPOLL is not set/enabled, sys_oabi-compat.c has build
> errors. Fix these by surrounding them with ifdef CONFIG_EPOLL/endif
> and providing stubs for the "EPOLL is not set" case.
> 
> ../arch/arm/kernel/sys_oabi-compat.c: In function 'sys_oabi_epoll_ctl':
> ../arch/arm/kernel/sys_oabi-compat.c:257:6: error: implicit declaration of 
> function 'ep_op_has_event' [-Werror=implicit-function-declaration]
>   257 |  if (ep_op_has_event(op) &&
>   |  ^~~
> ../arch/arm/kernel/sys_oabi-compat.c:264:9: error: implicit declaration of 
> function 'do_epoll_ctl'; did you mean 'sys_epoll_ctl'? 
> [-Werror=implicit-function-declaration]
>   264 |  return do_epoll_ctl(epfd, op, fd, &kernel, false);
>   | ^~~~
> 
> Fixes: c281634c8652 ("ARM: compat: remove KERNEL_DS usage in 
> sys_oabi_epoll_ctl()")
> Signed-off-by: Randy Dunlap 
> Reported-by: kernel test robot  # from an lkp .config file
> Cc: Russell King 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: Nicolas Pitre 
> Cc: Alexander Viro 
> Cc: patc...@armlinux.org.uk

Acked-by: Nicolas Pitre 

> ---
> v2: use correct Fixes: tag (thanks, rmk)
> v3: add patches@ to Cc: list
> 
>  arch/arm/kernel/sys_oabi-compat.c |   15 +++
>  1 file changed, 15 insertions(+)
> 
> --- linux-next-20201214.orig/arch/arm/kernel/sys_oabi-compat.c
> +++ linux-next-20201214/arch/arm/kernel/sys_oabi-compat.c
> @@ -248,6 +248,7 @@ struct oabi_epoll_event {
>   __u64 data;
>  } __attribute__ ((packed,aligned(4)));
>  
> +#ifdef CONFIG_EPOLL
>  asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
>  struct oabi_epoll_event __user *event)
>  {
> @@ -298,6 +299,20 @@ asmlinkage long sys_oabi_epoll_wait(int
>   kfree(kbuf);
>   return err ? -EFAULT : ret;
>  }
> +#else
> +asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
> +struct oabi_epoll_event __user *event)
> +{
> + return -EINVAL;
> +}
> +
> +asmlinkage long sys_oabi_epoll_wait(int epfd,
> + struct oabi_epoll_event __user *events,
> + int maxevents, int timeout)
> +{
> + return -EINVAL;
> +}
> +#endif
>  
>  struct oabi_sembuf {
>   unsigned short  sem_num;
> 


Re: [PATCH] i3c/master/mipi-i3c-hci: Specify HAS_IOMEM dependency

2021-02-01 Thread Nicolas Pitre
On Tue, 26 Jan 2021, David Gow wrote:

> The MIPI i3c HCI driver makes use of IOMEM functions like
> devm_platform_ioremap_resource(), which are only available if
> CONFIG_HAS_IOMEM is defined.
> 
> This causes the driver to be enabled under make ARCH=um allyesconfig,
> even though it won't build.
> 
> By adding a dependency on HAS_IOMEM, the driver will not be enabled on
> architectures which don't support it.
> 
> Fixes: 9ad9a52cce28 ("i3c/master: introduce the mipi-i3c-hci driver")
> Signed-off-by: David Gow 

Acked-by: Nicolas Pitre 


> ---
>  drivers/i3c/master/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/i3c/master/Kconfig b/drivers/i3c/master/Kconfig
> index e68f15f4b4d0..afff0e2320f7 100644
> --- a/drivers/i3c/master/Kconfig
> +++ b/drivers/i3c/master/Kconfig
> @@ -25,6 +25,7 @@ config DW_I3C_MASTER
>  config MIPI_I3C_HCI
>   tristate "MIPI I3C Host Controller Interface driver (EXPERIMENTAL)"
>   depends on I3C
> + depends on HAS_IOMEM
>   help
> Support for hardware following the MIPI Aliance's I3C Host Controller
> Interface specification.
> -- 
> 2.30.0.280.ga3ce27912f-goog
> 
> 
> -- 
> linux-i3c mailing list
> linux-...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-i3c
> 


Re: [PATCH] i3c/master/mipi-i3c-hci: Specify HAS_IOMEM dependency

2021-01-27 Thread Nicolas Pitre
On Tue, 26 Jan 2021, David Gow wrote:

> The MIPI i3c HCI driver makes use of IOMEM functions like
> devm_platform_ioremap_resource(), which are only available if
> CONFIG_HAS_IOMEM is defined.
> 
> This causes the driver to be enabled under make ARCH=um allyesconfig,
> even though it won't build.
> 
> By adding a dependency on HAS_IOMEM, the driver will not be enabled on
> architectures which don't support it.
> 
> Fixes: 9ad9a52cce28 ("i3c/master: introduce the mipi-i3c-hci driver")
> Signed-off-by: David Gow 

Acked-by: Nicolas Pitre 


> ---
>  drivers/i3c/master/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/i3c/master/Kconfig b/drivers/i3c/master/Kconfig
> index e68f15f4b4d0..afff0e2320f7 100644
> --- a/drivers/i3c/master/Kconfig
> +++ b/drivers/i3c/master/Kconfig
> @@ -25,6 +25,7 @@ config DW_I3C_MASTER
>  config MIPI_I3C_HCI
>   tristate "MIPI I3C Host Controller Interface driver (EXPERIMENTAL)"
>   depends on I3C
> + depends on HAS_IOMEM
>   help
> Support for hardware following the MIPI Aliance's I3C Host Controller
> Interface specification.
> -- 
> 2.30.0.280.ga3ce27912f-goog
> 
> 


[PATCH v4] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-26 Thread Nicolas Pitre
The clock API splits its interface into sleepable and atomic contexts:

- clk_prepare/clk_unprepare for stuff that might sleep

- clk_enable/clk_disable for anything that may be done in atomic context
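
(As a reminder of the consumer-side contract, a sketch not lifted from any
particular driver:)

#include <linux/clk.h>
#include <linux/spinlock.h>

/* the sleepable half happens outside the critical section, the atomic
 * half inside it */
static int example_clk_use(struct clk *clk, spinlock_t *lock)
{
        unsigned long flags;
        int ret;

        ret = clk_prepare(clk);         /* may sleep (I2C, SCMI, ...) */
        if (ret)
                return ret;

        spin_lock_irqsave(lock, flags);
        ret = clk_enable(clk);          /* must not sleep */
        spin_unlock_irqrestore(lock, flags);
        if (ret) {
                clk_unprepare(clk);
                return ret;
        }

        /* ... clocked work ... */

        spin_lock_irqsave(lock, flags);
        clk_disable(clk);               /* must not sleep */
        spin_unlock_irqrestore(lock, flags);

        clk_unprepare(clk);             /* may sleep again */
        return 0;
}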

The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.

Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.

Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.

For clocks that do implement the enable and disable methods then
everything just works as before.

A note on sparse:
According to https://lwn.net/Articles/109066/ there are things
that sparse can't cope with. In particular, pm_clk_op_lock() and
pm_clk_op_unlock() may or may not lock/unlock psd->lock depending on
some runtime condition. To work around that we tell sparse the lock is
always untaken for the purpose of static analysis.

Thanks to Naresh Kamboju for reporting issues with the initial patch.

Signed-off-by: Nicolas Pitre 
Tested-by: Naresh Kamboju 

---

On Mon, 25 Jan 2021, Nicolas Pitre wrote:

> On Mon, 25 Jan 2021, Rafael J. Wysocki wrote:
> 
> > It looks like sparse is still complaining:
> > 
> > https://lore.kernel.org/linux-acpi/600dc681.3mal9wqxnragfnzk%25...@intel.com/
> 
> Would you happen to still have one of those randconfig configurations?
> I'd like to know why sparse complains about 3 out of 93 configs.

Well... never mind. Although I'm not able to reproduce, the only 
explanation I can guess is that, for some configs, the inline attribute 
was inhibited somehow.

Let's hope this one will do. If not please keep the problematic config.

Changes from v3:

- more sparse annotation as inlining isn't always enough.

Changes from v2:

- added workarounds to cope with sparse limitations (see above).

Changes from v1:

- made dummy clk_is_enabled_when_prepared() dependent on
  CONFIG_HAVE_CLK_PREPARE instead of CONFIG_HAVE_CLK.

diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index ced6863a16..84d5acb630 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -23,6 +23,7 @@
 enum pce_status {
PCE_STATUS_NONE = 0,
PCE_STATUS_ACQUIRED,
+   PCE_STATUS_PREPARED,
PCE_STATUS_ENABLED,
PCE_STATUS_ERROR,
 };
@@ -32,8 +33,112 @@ struct pm_clock_entry {
char *con_id;
struct clk *clk;
enum pce_status status;
+   bool enabled_when_prepared;
 };
 
+/**
+ * pm_clk_list_lock - ensure exclusive access for modifying the PM clock
+ *   entry list.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count to be modified.
+ *
+ * Get exclusive access before modifying the PM clock entry list and the
+ * clock_op_might_sleep count to guard against concurrent modifications.
+ * This also protects against a concurrent clock_op_might_sleep and PM clock
+ * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or may not
+ * happen in atomic context, hence both the mutex and the spinlock must be
+ * taken here.
+ */
+static void pm_clk_list_lock(struct pm_subsys_data *psd)
+   __acquires(&psd->lock)
+{
+   mutex_lock(&psd->clock_mutex);
+   spin_lock_irq(&psd->lock);
+}
+
+/**
+ * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
+ * @psd: the same pm_subsys_data instance previously passed to
+ *  pm_clk_list_lock().
+ */
+static void pm_clk_list_unlock(struct pm_subsys_data *psd)
+   __releases(&psd->lock)
+{
+   spin_unlock_irq(&psd->lock);
+   mutex_unlock(&psd->clock_mutex);
+}
+
+/**
+ * pm_clk_op_lock - ensure exclusive access for performing clock operations.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count being used.
+ * @flags: stored irq flags.
+ * @fn: string for the caller function's name.
+ *
+ * This is used by pm_clk_suspend() and pm_clk_resume() to guard
+ * against concurrent modifications to the clock entry list and the
+ * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
+ * only the mutex can be locked and those functions can only be used in
+ * non atomic context. If clock_op_might_sleep == 0 then these functions
+ * may be used in any context and only the spinlock can be locked.
+ * Returns -EINVAL if called in atomic context when clock ops might sleep.
+ */
+static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned

Re: [PATCH v3] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-25 Thread Nicolas Pitre
On Mon, 25 Jan 2021, Rafael J. Wysocki wrote:

> On Sun, Jan 24, 2021 at 12:07 AM Nicolas Pitre  wrote:
> > A note on sparse:
> > According to https://lwn.net/Articles/109066/ there are things
> > that sparse can't cope with. In particular, pm_clk_op_lock() and
> > pm_clk_op_unlock() may or may not lock/unlock psd->lock depending on
> > some runtime condition. To work around that we tell sparse the lock
> > is always untaken for the purpose of static analysis.
> 
> It looks like sparse is still complaining:
> 
> https://lore.kernel.org/linux-acpi/600dc681.3mal9wqxnragfnzk%25...@intel.com/

Would you happen to still have one of those randconfig configurations?

I'd like to know why sparse complains about 3 out of 93 configs.


Nicolas


[PATCH v3] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-23 Thread Nicolas Pitre
The clock API splits its interface into sleepable and atomic contexts:

- clk_prepare/clk_unprepare for stuff that might sleep

- clk_enable/clk_disable for anything that may be done in atomic context

The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.

Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.

Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.

For clocks that do implement the enable and disable methods then
everything just works as before.

A note on sparse:
According to https://lwn.net/Articles/109066/ there are things
that sparse can't cope with. In particular, pm_clk_op_lock() and
pm_clk_op_unlock() may or may not lock/unlock psd->lock depending on
some runtime condition. To work around that we tell sparse the lock
is always untaken for the purpose of static analysis.

Thanks to Naresh Kamboju for reporting issues with the initial patch.

Signed-off-by: Nicolas Pitre 
Tested-by: Naresh Kamboju 

---

Changes from v2:

- added workarounds to cope with sparse limitations (see above).

Changes from v1:

- made dummy clk_is_enabled_when_prepared() dependent on 
  CONFIG_HAVE_CLK_PREPARE instead of CONFIG_HAVE_CLK.

diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index ced6863a16..e6956ce301 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -23,6 +23,7 @@
 enum pce_status {
PCE_STATUS_NONE = 0,
PCE_STATUS_ACQUIRED,
+   PCE_STATUS_PREPARED,
PCE_STATUS_ENABLED,
PCE_STATUS_ERROR,
 };
@@ -32,8 +33,108 @@ struct pm_clock_entry {
char *con_id;
struct clk *clk;
enum pce_status status;
+   bool enabled_when_prepared;
 };
 
+/**
+ * pm_clk_list_lock - ensure exclusive access for modifying the PM clock
+ *   entry list.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count to be modified.
+ *
+ * Get exclusive access before modifying the PM clock entry list and the
+ * clock_op_might_sleep count to guard against concurrent modifications.
+ * This also protects against a concurrent clock_op_might_sleep and PM clock
+ * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or may not
+ * happen in atomic context, hence both the mutex and the spinlock must be
+ * taken here.
+ */
+static inline void pm_clk_list_lock(struct pm_subsys_data *psd)
+{
+   mutex_lock(&psd->clock_mutex);
+   spin_lock_irq(&psd->lock);
+}
+
+/**
+ * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
+ * @psd: the same pm_subsys_data instance previously passed to
+ *  pm_clk_list_lock().
+ */
+static inline void pm_clk_list_unlock(struct pm_subsys_data *psd)
+{
+   spin_unlock_irq(&psd->lock);
+   mutex_unlock(&psd->clock_mutex);
+}
+
+/**
+ * pm_clk_op_lock - ensure exclusive access for performing clock operations.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count being used.
+ * @flags: stored irq flags.
+ * @fn: string for the caller function's name.
+ *
+ * This is used by pm_clk_suspend() and pm_clk_resume() to guard
+ * against concurrent modifications to the clock entry list and the
+ * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
+ * only the mutex can be locked and those functions can only be used in
+ * non atomic context. If clock_op_might_sleep == 0 then these functions
+ * may be used in any context and only the spinlock can be locked.
+ * Returns -EINVAL if called in atomic context when clock ops might sleep.
+ */
+static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned long *flags,
+ const char *fn)
+{
+   bool atomic_context = in_atomic() || irqs_disabled();
+
+try_again:
+   spin_lock_irqsave(&psd->lock, *flags);
+   if (!psd->clock_op_might_sleep) {
+   /* the __release is there to work around sparse limitations */
+   __release(&psd->lock);
+   return 0;
+   }
+
+   /* bail out if in atomic context */
+   if (atomic_context) {
+   pr_err("%s: atomic context with clock_ops_might_sleep = %d",
+  fn, psd->clock_op_might_sleep);
+   spin_unlock_irqrestore(&psd->lock, *flags);
+   might_sleep();
+   return -EPERM;
+   }
+
+   /* we must switch to the mutex

Re: [PATCH v2] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-22 Thread Nicolas Pitre
On Fri, 22 Jan 2021, Nicolas Pitre wrote:

> On Fri, 22 Jan 2021, Rafael J. Wysocki wrote:
> 
> > On Fri, Jan 22, 2021 at 4:48 PM Naresh Kamboju
> >  wrote:
> > >
> > > On Fri, 22 Jan 2021 at 20:39, Rafael J. Wysocki  wrote:
> > > >
> > > > On Thu, Jan 21, 2021 at 8:01 PM Rafael J. Wysocki  
> > > > wrote:
> > > > >
> > > > > On Thu, Jan 21, 2021 at 6:23 PM Nicolas Pitre  
> > > > > wrote:
> > > > > >
> > > > > > The clock API splits its interface into sleepable and atomic 
> > > > > > contexts:
> > > > > >
> > > > > > - clk_prepare/clk_unprepare for stuff that might sleep
> > > > > >
> > > > > > - clk_enable/clk_disable for anything that may be done in atomic 
> > > > > > context
> > > > > >
> > > > > > The code handling runtime PM for clocks only calls clk_disable() on
> > > > > > suspend requests, and clk_enable on resume requests. This means that
> > > > > > runtime PM with clock providers that only have the prepare/unprepare
> > > > > > methods implemented is basically useless.
> > > > > >
> > > > > > Many clock implementations can't accommodate atomic contexts. This 
> > > > > > is
> > > > > > often the case when communication with the clock happens through 
> > > > > > another
> > > > > > subsystem like I2C or SCMI.
> > > > > >
> > > > > > Let's make the clock PM code useful with such clocks by safely 
> > > > > > invoking
> > > > > > clk_prepare/clk_unprepare upon resume/suspend requests. Of course, 
> > > > > > when
> > > > > > such clocks are registered with the PM layer then 
> > > > > > pm_runtime_irq_safe()
> > > > > > can't be used, and neither pm_runtime_suspend() nor 
> > > > > > pm_runtime_resume()
> > > > > > may be invoked in atomic context.
> > > > > >
> > > > > > For clocks that do implement the enable and disable methods then
> > > > > > everything just works as before.
> > > > > >
> > > > > > Signed-off-by: Nicolas Pitre 
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > On Thu, 21 Jan 2021, Rafael J. Wysocki wrote:
> > > > > >
> > > > > > > So I'm going to drop this patch from linux-next until the issue is
> > > > > > > resolved, thanks!
> > > > > >
> > > > > > Here's the fixed version.
> > > > >
> > > > > Applied instead of the v1, thanks!
> > > > >
> > > > > > Changes from v1:
> > > > > >
> > > > > > - Moved clk_is_enabled_when_prepared() declaration under
> > > > > >   CONFIG_HAVE_CLK_PREPARE and provided a dummy definition when that
> > > > > >   config option is unset.
> > > > > >
> > > > > > diff --git a/drivers/base/power/clock_ops.c 
> > > > > > b/drivers/base/power/clock_ops.c
> > > > > > index ced6863a16..a62fb0f9b1 100644
> > > > > > --- a/drivers/base/power/clock_ops.c
> > > > > > +++ b/drivers/base/power/clock_ops.c
> > > > > > @@ -23,6 +23,7 @@
> > > > > >  enum pce_status {
> > > > > > PCE_STATUS_NONE = 0,
> > > > > > PCE_STATUS_ACQUIRED,
> > > > > > +   PCE_STATUS_PREPARED,
> > > > > > PCE_STATUS_ENABLED,
> > > > > > PCE_STATUS_ERROR,
> > > > > >  };
> > > > > > @@ -32,8 +33,102 @@ struct pm_clock_entry {
> > > > > > char *con_id;
> > > > > > struct clk *clk;
> > > > > > enum pce_status status;
> > > > > > +   bool enabled_when_prepared;
> > > > > >  };
> > > > > >
> > > > > > +/**
> > > > > > + * pm_clk_list_lock - ensure exclusive access for modifying the PM 
> > > > > > clock
> > > > > > + *   entry list.
> > > > > > + * @psd: pm_subsys_data instance corresponding to the PM clock 
> > > > > &g

Re: [PATCH v2] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-22 Thread Nicolas Pitre
On Fri, 22 Jan 2021, Rafael J. Wysocki wrote:

> On Fri, Jan 22, 2021 at 4:48 PM Naresh Kamboju
>  wrote:
> >
> > On Fri, 22 Jan 2021 at 20:39, Rafael J. Wysocki  wrote:
> > >
> > > On Thu, Jan 21, 2021 at 8:01 PM Rafael J. Wysocki  
> > > wrote:
> > > >
> > > > On Thu, Jan 21, 2021 at 6:23 PM Nicolas Pitre  
> > > > wrote:
> > > > >
> > > > > The clock API splits its interface into sleepable and atomic contexts:
> > > > >
> > > > > - clk_prepare/clk_unprepare for stuff that might sleep
> > > > >
> > > > > - clk_enable/clk_disable for anything that may be done in atomic 
> > > > > context
> > > > >
> > > > > The code handling runtime PM for clocks only calls clk_disable() on
> > > > > suspend requests, and clk_enable on resume requests. This means that
> > > > > runtime PM with clock providers that only have the prepare/unprepare
> > > > > methods implemented is basically useless.
> > > > >
> > > > > Many clock implementations can't accommodate atomic contexts. This is
> > > > > often the case when communication with the clock happens through 
> > > > > another
> > > > > subsystem like I2C or SCMI.
> > > > >
> > > > > Let's make the clock PM code useful with such clocks by safely 
> > > > > invoking
> > > > > clk_prepare/clk_unprepare upon resume/suspend requests. Of course, 
> > > > > when
> > > > > such clocks are registered with the PM layer then 
> > > > > pm_runtime_irq_safe()
> > > > > can't be used, and neither pm_runtime_suspend() nor 
> > > > > pm_runtime_resume()
> > > > > may be invoked in atomic context.
> > > > >
> > > > > For clocks that do implement the enable and disable methods then
> > > > > everything just works as before.
> > > > >
> > > > > Signed-off-by: Nicolas Pitre 
> > > > >
> > > > > ---
> > > > >
> > > > > On Thu, 21 Jan 2021, Rafael J. Wysocki wrote:
> > > > >
> > > > > > So I'm going to drop this patch from linux-next until the issue is
> > > > > > resolved, thanks!
> > > > >
> > > > > Here's the fixed version.
> > > >
> > > > Applied instead of the v1, thanks!
> > > >
> > > > > Changes from v1:
> > > > >
> > > > > - Moved clk_is_enabled_when_prepared() declaration under
> > > > >   CONFIG_HAVE_CLK_PREPARE and provided a dummy definition when that
> > > > >   config option is unset.
> > > > >
> > > > > diff --git a/drivers/base/power/clock_ops.c 
> > > > > b/drivers/base/power/clock_ops.c
> > > > > index ced6863a16..a62fb0f9b1 100644
> > > > > --- a/drivers/base/power/clock_ops.c
> > > > > +++ b/drivers/base/power/clock_ops.c
> > > > > @@ -23,6 +23,7 @@
> > > > >  enum pce_status {
> > > > > PCE_STATUS_NONE = 0,
> > > > > PCE_STATUS_ACQUIRED,
> > > > > +   PCE_STATUS_PREPARED,
> > > > > PCE_STATUS_ENABLED,
> > > > > PCE_STATUS_ERROR,
> > > > >  };
> > > > > @@ -32,8 +33,102 @@ struct pm_clock_entry {
> > > > > char *con_id;
> > > > > struct clk *clk;
> > > > > enum pce_status status;
> > > > > +   bool enabled_when_prepared;
> > > > >  };
> > > > >
> > > > > +/**
> > > > > + * pm_clk_list_lock - ensure exclusive access for modifying the PM 
> > > > > clock
> > > > > + *   entry list.
> > > > > + * @psd: pm_subsys_data instance corresponding to the PM clock entry 
> > > > > list
> > > > > + *  and clk_op_might_sleep count to be modified.
> > > > > + *
> > > > > + * Get exclusive access before modifying the PM clock entry list and 
> > > > > the
> > > > > + * clock_op_might_sleep count to guard against concurrent 
> > > > > modifications.
> > > > > + * This also protects against a concurrent clock_op_might_sleep and 
> > > > > PM clock
> > > > > + * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or 
> > > > > may not
> > > > > + * happen in atomic context, hence both the mutex and the spinlock 
> > > > > must be
> > > > > + * taken here.
> > > > > + */
> > > > > +static void pm_clk_list_lock(struct pm_subsys_data *psd)
> > > > > +{
> > > > > +   mutex_lock(&psd->clock_mutex);
> > > > > +   spin_lock_irq(&psd->lock);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
> > > > > + * @psd: the same pm_subsys_data instance previously passed to
> > > > > + *  pm_clk_list_lock().
> > > > > + */
> > > > > +static void pm_clk_list_unlock(struct pm_subsys_data *psd)
> > >
> > > Locking annotations for sparse were missing here and above, so I've
> > > added them by hand.
> > >
> > > Please double check the result in my linux-next branch (just pushed).
> >
> > May i request to add Reported-by: Naresh Kamboju 
> 
> If this had been a patch fixing a problem reported by you, there would
> have been a reason to add a Reported-by.
> 
> In this case, it is just a new version of a patch taking your testing
> feedback into account.
> 
> I can add a Tested-by for you to it if desired, though.

It is probably fair to mention that Naresh reported the issue too.
My bad, I should have added the tag myself in v2.


Nicolas


Re: [PATCH v2] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-22 Thread Nicolas Pitre
On Fri, 22 Jan 2021, Rafael J. Wysocki wrote:

> > > +/**
> > > + * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
> > > + * @psd: the same pm_subsys_data instance previously passed to
> > > + *  pm_clk_list_lock().
> > > + */
> > > +static void pm_clk_list_unlock(struct pm_subsys_data *psd)
> 
> Locking annotations for sparse were missing here and above, so I've
> added them by hand.

Thanks.

> Please double check the result in my linux-next branch (just pushed).

There are still the following warnings:

drivers/base/power/clock_ops.c:52:13: warning: context imbalance in 
'pm_clk_list_lock' - wrong count at exit
drivers/base/power/clock_ops.c:64:13: warning: context imbalance in 
'pm_clk_list_unlock' - wrong count at exit

I guess this can be silenced (still need to investigate how those 
annotations work).

But I'm more worried about these:

drivers/base/power/clock_ops.c:86:12: warning: context imbalance in 
'pm_clk_op_lock' - different lock contexts for basic block
drivers/base/power/clock_ops.c:131:39: warning: context imbalance in 
'pm_clk_op_unlock' - unexpected unlock

Those are special locking helpers indeed and I don't know if that can be 
dealt with.
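
For reference, the vocabulary sparse offers (simplified from the kernel's
definitions under #ifdef __CHECKER__; a normal compile replaces them with
empty stubs) is roughly this, and whether it can describe a helper that takes
either the spinlock or the mutex is exactly the open question:

/* these definitions are only seen by sparse (__CHECKER__) */
#define __acquires(x)   __attribute__((context(x, 0, 1)))  /* fn exits with x held  */
#define __releases(x)   __attribute__((context(x, 1, 0)))  /* fn enters with x held */
#define __acquire(x)    __context__(x, 1)                   /* statement-level +1    */
#define __release(x)    __context__(x, -1)                  /* statement-level -1    */
#define __cond_lock(x, c) ((c) ? ({ __acquire(x); 1; }) : 0) /* conditional acquire  */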


Nicolas


[PATCH v2] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-21 Thread Nicolas Pitre
The clock API splits its interface into sleepable and atomic contexts:

- clk_prepare/clk_unprepare for stuff that might sleep

- clk_enable/clk_disable for anything that may be done in atomic context

The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.

Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.

Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.

For clocks that do implement the enable and disable methods then
everything just works as before.

Signed-off-by: Nicolas Pitre 

---

On Thu, 21 Jan 2021, Rafael J. Wysocki wrote:

> So I'm going to drop this patch from linux-next until the issue is
> resolved, thanks!

Here's the fixed version.

Changes from v1:

- Moved clk_is_enabled_when_prepared() declaration under 
  CONFIG_HAVE_CLK_PREPARE and provided a dummy definition when that 
  config option is unset.

diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index ced6863a16..a62fb0f9b1 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -23,6 +23,7 @@
 enum pce_status {
PCE_STATUS_NONE = 0,
PCE_STATUS_ACQUIRED,
+   PCE_STATUS_PREPARED,
PCE_STATUS_ENABLED,
PCE_STATUS_ERROR,
 };
@@ -32,8 +33,102 @@ struct pm_clock_entry {
char *con_id;
struct clk *clk;
enum pce_status status;
+   bool enabled_when_prepared;
 };
 
+/**
+ * pm_clk_list_lock - ensure exclusive access for modifying the PM clock
+ *   entry list.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count to be modified.
+ *
+ * Get exclusive access before modifying the PM clock entry list and the
+ * clock_op_might_sleep count to guard against concurrent modifications.
+ * This also protects against a concurrent clock_op_might_sleep and PM clock
+ * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or may not
+ * happen in atomic context, hence both the mutex and the spinlock must be
+ * taken here.
+ */
+static void pm_clk_list_lock(struct pm_subsys_data *psd)
+{
+   mutex_lock(&psd->clock_mutex);
+   spin_lock_irq(&psd->lock);
+}
+
+/**
+ * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
+ * @psd: the same pm_subsys_data instance previously passed to
+ *  pm_clk_list_lock().
+ */
+static void pm_clk_list_unlock(struct pm_subsys_data *psd)
+{
+   spin_unlock_irq(&psd->lock);
+   mutex_unlock(&psd->clock_mutex);
+}
+
+/**
+ * pm_clk_op_lock - ensure exclusive access for performing clock operations.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count being used.
+ * @flags: stored irq flags.
+ * @fn: string for the caller function's name.
+ *
+ * This is used by pm_clk_suspend() and pm_clk_resume() to guard
+ * against concurrent modifications to the clock entry list and the
+ * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
+ * only the mutex can be locked and those functions can only be used in
+ * non atomic context. If clock_op_might_sleep == 0 then these functions
+ * may be used in any context and only the spinlock can be locked.
+ * Returns -EINVAL if called in atomic context when clock ops might sleep.
+ */
+static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned long *flags,
+ const char *fn)
+{
+   bool atomic_context = in_atomic() || irqs_disabled();
+
+try_again:
+   spin_lock_irqsave(&psd->lock, *flags);
+   if (!psd->clock_op_might_sleep)
+   return 0;
+
+   /* bail out if in atomic context */
+   if (atomic_context) {
+   pr_err("%s: atomic context with clock_ops_might_sleep = %d",
+  fn, psd->clock_op_might_sleep);
+   spin_unlock_irqrestore(&psd->lock, *flags);
+   might_sleep();
+   return -EPERM;
+   }
+
+   /* we must switch to the mutex */
+   spin_unlock_irqrestore(&psd->lock, *flags);
+   mutex_lock(&psd->clock_mutex);
+
+   /*
+* There was a possibility for psd->clock_op_might_sleep
+* to become 0 above. Keep the mutex only if not the case.
+*/
+   if (likely(psd->clock_op_might_sleep))
+   return 0;
+
+   mutex_unlock(&psd->clock_mutex);
+   goto try_again;
+}
+
+/**
+ * pm_clk_op_unlock - counterpart to pm_clk_op_lock()

Re: [PATCH] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-21 Thread Nicolas Pitre
On Thu, 21 Jan 2021, Rafael J. Wysocki wrote:

> On Thu, Jan 21, 2021 at 1:11 PM Naresh Kamboju
>  wrote:
> >
> > ref:
> > https://builds.tuxbuild.com/1nN0vkpNP4qhvIuIJN12j7tTpQs/
> 
> So I'm going to drop this patch from linux-next until the issue is
> resolved, thanks!

No problem - I'm on it.

Thanks Naresh for reporting the issue.


Nicolas


Re: [PATCH] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-19 Thread Nicolas Pitre
On Tue, 19 Jan 2021, Geert Uytterhoeven wrote:

> Hi Kevin, Nicolas,
> 
> On Tue, Jan 19, 2021 at 7:45 PM Kevin Hilman  wrote:
> > [ + Geert.. renesas SoCs are the primary user of PM clk ]
> 
> Thanks!
> 
> > Nicolas Pitre  writes:
> > > The clock API splits its interface into sleepable and atomic contexts:
> > >
> > > - clk_prepare/clk_unprepare for stuff that might sleep
> > >
> > > - clk_enable/clk_disable for anything that may be done in atomic context
> > >
> > > The code handling runtime PM for clocks only calls clk_disable() on
> > > suspend requests, and clk_enable on resume requests. This means that
> > > runtime PM with clock providers that only have the prepare/unprepare
> > > methods implemented is basically useless.
> > >
> > > Many clock implementations can't accommodate atomic contexts. This is
> > > often the case when communication with the clock happens through another
> > > subsystem like I2C or SCMI.
> > >
> > > Let's make the clock PM code useful with such clocks by safely invoking
> > > clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
> > > such clocks are registered with the PM layer then pm_runtime_irq_safe()
> > > can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
> > > may be invoked in atomic context.
> > >
> > > For clocks that do implement the enable and disable methods then
> > > everything just works as before.
> > >
> > > Signed-off-by: Nicolas Pitre 
> 
> Thanks for your patch!
> 
> > > --- a/drivers/base/power/clock_ops.c
> > > +++ b/drivers/base/power/clock_ops.c
> 
> > > +/**
> > > + * pm_clk_op_lock - ensure exclusive access for performing clock 
> > > operations.
> > > + * @psd: pm_subsys_data instance corresponding to the PM clock entry list
> > > + *and clk_op_might_sleep count being used.
> > > + * @flags: stored irq flags.
> > > + * @fn: string for the caller function's name.
> > > + *
> > > + * This is used by pm_clk_suspend() and pm_clk_resume() to guard
> > > + * against concurrent modifications to the clock entry list and the
> > > + * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
> > > + * only the mutex can be locked and those functions can only be used in
> > > + * non atomic context. If clock_op_might_sleep == 0 then these functions
> > > + * may be used in any context and only the spinlock can be locked.
> > > + * Returns -EINVAL if called in atomic context when clock ops might 
> > > sleep.
> > > + */
> > > +static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned long 
> > > *flags,
> > > +   const char *fn)
> > > +{
> > > + bool atomic_context = in_atomic() || irqs_disabled();
> 
> Is this safe? Cfr.
> https://lore.kernel.org/dri-devel/20200914204209.256266...@linutronix.de/

I noticed this topic is a mess. This is why I'm not relying on 
in_atomic() alone as it turned out not to be sufficient in all cases 
during testing.

What's there now is safe at least in the context from which it is called 
i.e. the runtime pm core code. If not then hopefully the might_sleep() 
that follows will catch misuses.

It should be noted that we assume an atomic context by default. However, 
if you rely on clocks that must sleep then you must not invoke runtime 
pm facilities in atomic context from your driver in the first place. The 
atomic_context variable above is there only used further down as a 
validation check to catch programming mistakes and not an operational 
parameter.


Nicolas


Re: [PATCH] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-17 Thread Nicolas Pitre
Ping.

On Mon, 4 Jan 2021, Nicolas Pitre wrote:

> The clock API splits its interface into sleepable and atomic contexts:
> 
> - clk_prepare/clk_unprepare for stuff that might sleep
> 
> - clk_enable/clk_disable for anything that may be done in atomic context
> 
> The code handling runtime PM for clocks only calls clk_disable() on
> suspend requests, and clk_enable on resume requests. This means that
> runtime PM with clock providers that only have the prepare/unprepare
> methods implemented is basically useless.
> 
> Many clock implementations can't accommodate atomic contexts. This is
> often the case when communication with the clock happens through another
> subsystem like I2C or SCMI.
> 
> Let's make the clock PM code useful with such clocks by safely invoking
> clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
> such clocks are registered with the PM layer then pm_runtime_irq_safe()
> can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
> may be invoked in atomic context.
> 
> For clocks that do implement the enable and disable methods then 
> everything just works as before.
> 
> Signed-off-by: Nicolas Pitre 
> 
> diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
> index ced6863a16..a62fb0f9b1 100644
> --- a/drivers/base/power/clock_ops.c
> +++ b/drivers/base/power/clock_ops.c
> @@ -23,6 +23,7 @@
>  enum pce_status {
>   PCE_STATUS_NONE = 0,
>   PCE_STATUS_ACQUIRED,
> + PCE_STATUS_PREPARED,
>   PCE_STATUS_ENABLED,
>   PCE_STATUS_ERROR,
>  };
> @@ -32,8 +33,102 @@ struct pm_clock_entry {
>   char *con_id;
>   struct clk *clk;
>   enum pce_status status;
> + bool enabled_when_prepared;
>  };
>  
> +/**
> + * pm_clk_list_lock - ensure exclusive access for modifying the PM clock
> + * entry list.
> + * @psd: pm_subsys_data instance corresponding to the PM clock entry list
> + *and clk_op_might_sleep count to be modified.
> + *
> + * Get exclusive access before modifying the PM clock entry list and the
> + * clock_op_might_sleep count to guard against concurrent modifications.
> + * This also protects against a concurrent clock_op_might_sleep and PM clock
> + * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or may not
> + * happen in atomic context, hence both the mutex and the spinlock must be
> + * taken here.
> + */
> +static void pm_clk_list_lock(struct pm_subsys_data *psd)
> +{
> + mutex_lock(&psd->clock_mutex);
> + spin_lock_irq(&psd->lock);
> +}
> +
> +/**
> + * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
> + * @psd: the same pm_subsys_data instance previously passed to
> + *pm_clk_list_lock().
> + */
> +static void pm_clk_list_unlock(struct pm_subsys_data *psd)
> +{
> + spin_unlock_irq(&psd->lock);
> + mutex_unlock(&psd->clock_mutex);
> +}
> +
> +/**
> + * pm_clk_op_lock - ensure exclusive access for performing clock operations.
> + * @psd: pm_subsys_data instance corresponding to the PM clock entry list
> + *and clk_op_might_sleep count being used.
> + * @flags: stored irq flags.
> + * @fn: string for the caller function's name.
> + *
> + * This is used by pm_clk_suspend() and pm_clk_resume() to guard
> + * against concurrent modifications to the clock entry list and the
> + * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
> + * only the mutex can be locked and those functions can only be used in
> + * non atomic context. If clock_op_might_sleep == 0 then these functions
> + * may be used in any context and only the spinlock can be locked.
> + * Returns -EINVAL if called in atomic context when clock ops might sleep.
> + */
> +static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned long *flags,
> +   const char *fn)
> +{
> + bool atomic_context = in_atomic() || irqs_disabled();
> +
> +try_again:
> + spin_lock_irqsave(&psd->lock, *flags);
> + if (!psd->clock_op_might_sleep)
> + return 0;
> +
> + /* bail out if in atomic context */
> + if (atomic_context) {
> + pr_err("%s: atomic context with clock_ops_might_sleep = %d",
> +fn, psd->clock_op_might_sleep);
> + spin_unlock_irqrestore(&psd->lock, *flags);
> + might_sleep();
> + return -EPERM;
> + }
> +
> + /* we must switch to the mutex */
> + spin_unlock_irqrestore(&psd->lock, *flags);
> + mutex_lock(&psd->clock_mutex);
> +
> + /*
> +  * There was a possibility for psd->clock_op_might_sleep
> +  * to become 0 above. Keep the mutex only if not the case.

[PATCH] PM / clk: make PM clock layer compatible with clocks that must sleep

2021-01-04 Thread Nicolas Pitre
The clock API splits its interface into sleepable and atomic contexts:

- clk_prepare/clk_unprepare for stuff that might sleep

- clk_enable/clk_disable for anything that may be done in atomic context

The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.

Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.

Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.

For clocks that do implement the enable and disable methods then 
everything just works as before.

Signed-off-by: Nicolas Pitre 

diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index ced6863a16..a62fb0f9b1 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -23,6 +23,7 @@
 enum pce_status {
PCE_STATUS_NONE = 0,
PCE_STATUS_ACQUIRED,
+   PCE_STATUS_PREPARED,
PCE_STATUS_ENABLED,
PCE_STATUS_ERROR,
 };
@@ -32,8 +33,102 @@ struct pm_clock_entry {
char *con_id;
struct clk *clk;
enum pce_status status;
+   bool enabled_when_prepared;
 };
 
+/**
+ * pm_clk_list_lock - ensure exclusive access for modifying the PM clock
+ *   entry list.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count to be modified.
+ *
+ * Get exclusive access before modifying the PM clock entry list and the
+ * clock_op_might_sleep count to guard against concurrent modifications.
+ * This also protects against a concurrent clock_op_might_sleep and PM clock
+ * entry list usage in pm_clk_suspend()/pm_clk_resume() that may or may not
+ * happen in atomic context, hence both the mutex and the spinlock must be
+ * taken here.
+ */
+static void pm_clk_list_lock(struct pm_subsys_data *psd)
+{
+   mutex_lock(&psd->clock_mutex);
+   spin_lock_irq(&psd->lock);
+}
+
+/**
+ * pm_clk_list_unlock - counterpart to pm_clk_list_lock().
+ * @psd: the same pm_subsys_data instance previously passed to
+ *  pm_clk_list_lock().
+ */
+static void pm_clk_list_unlock(struct pm_subsys_data *psd)
+{
+   spin_unlock_irq(&psd->lock);
+   mutex_unlock(&psd->clock_mutex);
+}
+
+/**
+ * pm_clk_op_lock - ensure exclusive access for performing clock operations.
+ * @psd: pm_subsys_data instance corresponding to the PM clock entry list
+ *  and clk_op_might_sleep count being used.
+ * @flags: stored irq flags.
+ * @fn: string for the caller function's name.
+ *
+ * This is used by pm_clk_suspend() and pm_clk_resume() to guard
+ * against concurrent modifications to the clock entry list and the
+ * clock_op_might_sleep count. If clock_op_might_sleep is != 0 then
+ * only the mutex can be locked and those functions can only be used in
+ * non atomic context. If clock_op_might_sleep == 0 then these functions
+ * may be used in any context and only the spinlock can be locked.
+ * Returns -EINVAL if called in atomic context when clock ops might sleep.
+ */
+static int pm_clk_op_lock(struct pm_subsys_data *psd, unsigned long *flags,
+ const char *fn)
+{
+   bool atomic_context = in_atomic() || irqs_disabled();
+
+try_again:
+   spin_lock_irqsave(&psd->lock, *flags);
+   if (!psd->clock_op_might_sleep)
+   return 0;
+
+   /* bail out if in atomic context */
+   if (atomic_context) {
+   pr_err("%s: atomic context with clock_ops_might_sleep = %d",
+  fn, psd->clock_op_might_sleep);
+   spin_unlock_irqrestore(&psd->lock, *flags);
+   might_sleep();
+   return -EPERM;
+   }
+
+   /* we must switch to the mutex */
+   spin_unlock_irqrestore(&psd->lock, *flags);
+   mutex_lock(&psd->clock_mutex);
+
+   /*
+* There was a possibility for psd->clock_op_might_sleep
+* to become 0 above. Keep the mutex only if not the case.
+*/
+   if (likely(psd->clock_op_might_sleep))
+   return 0;
+
+   mutex_unlock(&psd->clock_mutex);
+   goto try_again;
+}
+
+/**
+ * pm_clk_op_unlock - counterpart to pm_clk_op_lock().
+ * @psd: the same pm_subsys_data instance previously passed to
+ *  pm_clk_op_lock().
+ * @flags: irq flags provided by pm_clk_op_lock().
+ */
+static void pm_clk_op_unlock(struct pm_subsys_data *psd, unsigned long *flags)
+{
+   if (psd->clock_op_might_sleep)
+   mutex_unlock(&psd->clock_mutex);
+   el

Re: [PATCH] i3c/master/mipi-i3c-hci: re-fix __maybe_unused attribute

2020-12-30 Thread Nicolas Pitre
On Wed, 30 Dec 2020, Arnd Bergmann wrote:

> From: Arnd Bergmann 
> 
> clang warns because the added __maybe_unused attribute is in
> the wrong place:
> 
> drivers/i3c/master/mipi-i3c-hci/core.c:780:21: error: attribute declaration 
> must precede definition [-Werror,-Wignored-attributes]
> static const struct __maybe_unused of_device_id i3c_hci_of_match[] = {
> ^
> include/linux/compiler_attributes.h:267:56: note: expanded
> 
> Fixes: 95393f3e07ab ("i3c/master/mipi-i3c-hci: quiet maybe-unused variable 
> warning")
> Signed-off-by: Arnd Bergmann 

Acked-by: Nicolas Pitre 

This might be the 3rd patch from 3 different people fixing the same 
thing. Looks like the I3C maintainer is on vacation. Please feel free to 
send this trivial fix upstream some other way.

> ---
>  drivers/i3c/master/mipi-i3c-hci/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/i3c/master/mipi-i3c-hci/core.c 
> b/drivers/i3c/master/mipi-i3c-hci/core.c
> index 500abd27fb22..1b73647cc3b1 100644
> --- a/drivers/i3c/master/mipi-i3c-hci/core.c
> +++ b/drivers/i3c/master/mipi-i3c-hci/core.c
> @@ -777,7 +777,7 @@ static int i3c_hci_remove(struct platform_device *pdev)
>   return 0;
>  }
>  
> -static const struct __maybe_unused of_device_id i3c_hci_of_match[] = {
> +static const __maybe_unused struct of_device_id i3c_hci_of_match[] = {
>   { .compatible = "mipi-i3c-hci", },
>   {},
>  };
> -- 
> 2.29.2
> 
> 


Re: [PATCH] i3c/master/mipi-i3c-hci: Fix position of __maybe_unused in i3c_hci_of_match

2020-12-21 Thread Nicolas Pitre
On Mon, 21 Dec 2020, Nathan Chancellor wrote:

> Clang warns:
> 
>  ../drivers/i3c/master/mipi-i3c-hci/core.c:780:21: warning: attribute
>  declaration must precede definition [-Wignored-attributes]
>  static const struct __maybe_unused of_device_id i3c_hci_of_match[] = {
>  ^
>  ../include/linux/compiler_attributes.h:267:56: note: expanded from macro
>  '__maybe_unused'
>  #define __maybe_unused  __attribute__((__unused__))
> ^
>  ../include/linux/mod_devicetable.h:262:8: note: previous definition is
>  here
>  struct of_device_id {
> ^
> 1 warning generated.
> 
> 'struct of_device_id' should not be split, as it is a type. Move the
> __maybe_unused attribute after the static and const qualifiers so that
> there are no warnings about this variable, period.
> 
> Fixes: 95393f3e07ab ("i3c/master/mipi-i3c-hci: quiet maybe-unused variable 
> warning")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1221
> Signed-off-by: Nathan Chancellor 

Acked-by: Nicolas Pitre 

> ---
>  drivers/i3c/master/mipi-i3c-hci/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/i3c/master/mipi-i3c-hci/core.c 
> b/drivers/i3c/master/mipi-i3c-hci/core.c
> index 500abd27fb22..1b73647cc3b1 100644
> --- a/drivers/i3c/master/mipi-i3c-hci/core.c
> +++ b/drivers/i3c/master/mipi-i3c-hci/core.c
> @@ -777,7 +777,7 @@ static int i3c_hci_remove(struct platform_device *pdev)
>   return 0;
>  }
>  
> -static const struct __maybe_unused of_device_id i3c_hci_of_match[] = {
> +static const __maybe_unused struct of_device_id i3c_hci_of_match[] = {
>   { .compatible = "mipi-i3c-hci", },
>   {},
>  };
> 
> base-commit: 95393f3e07ab53855b91881692a4a5b52dcdc03c
> -- 
> 2.30.0.rc1
> 
> 


Re: [PATCH v10 3/7] i3c: master: add i3c_secondary_master_register

2020-12-03 Thread Nicolas Pitre
On Mon, 30 Nov 2020, Parshuram Thombare wrote:

> Add i3c_secondary_master_register(), which is used
> to register secondary masters.

I'm not sure about the logic here. Why would the secondary master 
initialize the bus? If you make a distinction between primary and 
secondary, then the primary should be the owner of the bus and it should 
have enumerated it already. Shouldn't you populate the bus structure with 
info provided by the primary master rather than from DT?


> Signed-off-by: Parshuram Thombare 
> ---
>  drivers/i3c/master.c   |  154 
> +++-
>  include/linux/i3c/master.h |3 +
>  2 files changed, 156 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
> index 56e8fe4..af0630a 100644
> --- a/drivers/i3c/master.c
> +++ b/drivers/i3c/master.c
> @@ -1797,6 +1797,90 @@ static int i3c_primary_master_bus_init(struct 
> i3c_master_controller *master)
>   return ret;
>  }
>  
> +/**
> + * i3c_secondary_master_bus_init() - initialize an I3C bus for secondary
> + * master
> + * @master: secondary master initializing the bus
> + *
> + * This function does
> + *
> + * 1. Attach I2C devs to the master
> + *
> + * 2. Call _master_controller_ops->bus_init() method to initialize
> + *the master controller. That's usually where the bus mode is selected
> + *(pure bus or mixed fast/slow bus)
> + *
> + * Once this is done, I2C devices should be usable.
> + *
> + * Return: 0 in case of success, a negative error code otherwise.
> + */
> +static int i3c_secondary_master_bus_init(struct i3c_master_controller 
> *master)
> +{
> + enum i3c_addr_slot_status status;
> + struct i2c_dev_boardinfo *i2cboardinfo;
> + struct i2c_dev_desc *i2cdev;
> + int ret;
> +
> + /*
> +  * First attach all devices with static definitions provided by the
> +  * FW.
> +  */
> + list_for_each_entry(i2cboardinfo, >boardinfo.i2c, node) {
> + status = i3c_bus_get_addr_slot_status(&master->bus,
> +   i2cboardinfo->base.addr);
> + if (status != I3C_ADDR_SLOT_FREE) {
> + ret = -EBUSY;
> + goto err_detach_devs;
> + }
> +
> + i3c_bus_set_addr_slot_status(&master->bus,
> +  i2cboardinfo->base.addr,
> +  I3C_ADDR_SLOT_I2C_DEV);
> +
> + i2cdev = i3c_master_alloc_i2c_dev(master, i2cboardinfo);
> + if (IS_ERR(i2cdev)) {
> + ret = PTR_ERR(i2cdev);
> + goto err_detach_devs;
> + }
> +
> + ret = i3c_master_attach_i2c_dev(master, i2cdev);
> + if (ret) {
> + i3c_master_free_i2c_dev(i2cdev);
> + goto err_detach_devs;
> + }
> + }
> +
> + /*
> +  * Now execute the controller specific ->bus_init() routine, which
> +  * might configure its internal logic to match the bus limitations.
> +  */
> + ret = master->ops->bus_init(master);
> + if (ret)
> + goto err_detach_devs;
> +
> + /*
> +  * The master device should have been instantiated in ->bus_init(),
> +  * complain if this was not the case.
> +  */
> + if (!master->this) {
> + dev_err(&master->dev,
> + "master_set_info() was not called in ->bus_init()\n");
> + ret = -EINVAL;
> + goto err_bus_cleanup;
> + }
> +
> + return 0;
> +
> +err_bus_cleanup:
> + if (master->ops->bus_cleanup)
> + master->ops->bus_cleanup(master);
> +
> +err_detach_devs:
> + i3c_master_detach_free_devs(master);
> +
> + return ret;
> +}
> +
>  static void i3c_master_bus_cleanup(struct i3c_master_controller *master)
>  {
>   if (master->ops->bus_cleanup)
> @@ -2514,7 +2598,10 @@ static int i3c_master_init(struct 
> i3c_master_controller *master,
>   goto err_put_dev;
>   }
>  
> - ret = i3c_primary_master_bus_init(master);
> + if (secondary)
> + ret = i3c_secondary_master_bus_init(master);
> + else
> + ret = i3c_primary_master_bus_init(master);
>   if (ret)
>   goto err_destroy_wq;
>  
> @@ -2595,6 +2682,71 @@ int i3c_primary_master_register(struct 
> i3c_master_controller *master,
>  EXPORT_SYMBOL_GPL(i3c_primary_master_register);
>  
>  /**
> + * i3c_secondary_master_register() - register an I3C secondary master
> + * @master: master used to send frames on the bus
> + * @parent: the parent device (the one that provides this I3C master
> + *   controller)
> + * @ops: the master controller operations
> + *
> + * This function does minimal required initialization for secondary
> + * master, rest functionality like creating and registering I2C
> + * and I3C devices is done in defslvs processing.
> + *
> + *  i3c_secondary_master_register() does following things -
> + 

Re: [PATCH v10 2/7] i3c: master: use i3c_master_register only for main master

2020-12-03 Thread Nicolas Pitre
On Mon, 30 Nov 2020, Parshuram Thombare wrote:

> Removed last argument 'secondary' and restructured i3c_master_register
> to move code that can be common to i3c_secondary_master_register
> to separate function i3c_master_init.
> 
> Signed-off-by: Parshuram Thombare 

[...]

> +static int i3c_master_init(struct i3c_master_controller *master,
> +struct device *parent,
> +const struct i3c_master_controller_ops *ops,
> +bool secondary)
>  {
>   unsigned long i2c_scl_rate = I3C_BUS_I2C_FM_PLUS_SCL_RATE;
>   struct i3c_bus *i3cbus = i3c_master_get_bus(master);
> @@ -2535,10 +2514,49 @@ int i3c_master_register(struct i3c_master_controller 
> *master,
>   goto err_put_dev;
>   }
>  
> - ret = i3c_master_bus_init(master);
> + ret = i3c_primary_master_bus_init(master);
>   if (ret)
>   goto err_destroy_wq;
>  
> + return 0;
> +
> +err_destroy_wq:
> + destroy_workqueue(master->wq);
> +
> +err_put_dev:
> + put_device(&master->dev);
> +
> + return ret;
> +}

[...]

> +int i3c_primary_master_register(struct i3c_master_controller *master,
> + struct device *parent,
> + const struct i3c_master_controller_ops *ops)
> +{
> + int ret;
> +
> + ret = i3c_master_init(master, parent, ops, false);
> + if (ret)
> + return ret;
> +
>   ret = device_add(&master->dev);
>   if (ret)
>   goto err_cleanup_bus;
> @@ -2568,15 +2586,13 @@ int i3c_master_register(struct i3c_master_controller 
> *master,
>  err_cleanup_bus:
>   i3c_master_bus_cleanup(master);
>  
> -err_destroy_wq:
>   destroy_workqueue(master->wq);
>  
> -err_put_dev:
> put_device(&master->dev);
>  
>   return ret;
>  }

This looks a bit confusing. Here you're rolling back details in 
i3c_primary_master_register() that were factored out into 
i3c_master_init(). If i3c_master_init() is successful, then you 
shouldn't be undoing its work piecemeal in i3c_primary_master_register(). 
Instead, there should be another function that does the reverse of 
i3c_master_init() here.
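
For illustration, the symmetric helper meant here could look roughly like
this (hypothetical name, pieced together from functions already present
in the series, not actual code from it):

static void i3c_master_cleanup(struct i3c_master_controller *master)
{
	/* exact inverse of i3c_master_init() */
	i3c_master_bus_cleanup(master);
	destroy_workqueue(master->wq);
	put_device(&master->dev);
}

i3c_primary_master_register() would then just do

	ret = device_add(&master->dev);
	if (ret) {
		i3c_master_cleanup(master);
		return ret;
	}

on its error path instead of unwinding the init steps one by one.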


Nicolas


Re: [PATCH v10 3/3] ARM: uncompress: Validate start of physical memory against passed DTB

2020-12-03 Thread Nicolas Pitre
On Thu, 3 Dec 2020, Geert Uytterhoeven wrote:

> Currently, the start address of physical memory is obtained by masking
> the program counter with a fixed mask of 0xf8000000.  This mask value
> was chosen as a balance between the requirements of different platforms.
> However, this does require that the start address of physical memory is
> a multiple of 128 MiB, precluding booting Linux on platforms where this
> requirement is not fulfilled.
> 
> Fix this limitation by validating the masked address against the memory
> information in the passed DTB.  Only use the start address
> from DTB when masking would yield an out-of-range address, prefer the
> traditional method in all other cases.  Note that this applies only to the
> explicitly passed DTB on modern systems, and not to a DTB appended to
> the kernel, or to ATAGS.  The appended DTB may need to be augmented by
> information from ATAGS, which may need to rely on knowledge of the start
> address of physical memory itself.
> 
> This allows to boot Linux on r7s9210/rza2mevb using the 64 MiB of SDRAM
> on the RZA2MEVB sub board, which is located at 0x0C000000 (CS3 space),
> i.e. not at a multiple of 128 MiB.
> 
> Suggested-by: Nicolas Pitre 
> Suggested-by: Ard Biesheuvel 
> Signed-off-by: Geert Uytterhoeven 

I'm not that intimate with dt contents so:

Acked-by: Nicolas Pitre 

And it would be a good idea to repeat the GOT fixup caviat comment 
before calling fdt_check_mem_start.

> ---
> v10:
>   - Update for commit 9443076e4330a14a ("ARM: p2v: reduce p2v alignment
> requirement to 2 MiB"),
>   - Use OF_DT_MAGIC macro,
>   - Rename fdt_get_mem_start() to fdt_check_mem_start(),
>   - Skip validation if there is an appended DTB,
>   - Pass mem_start to fdt_check_mem_start() to streamline code,
>   - Optimize register allocation,
>   - Update CONFIG_AUTO_ZRELADDR help text,
>   - Check all memory nodes and ranges (not just the first one), and
> "linux,usable-memory", similar to early_init_dt_scan_memory(),
>   - Drop Reviewed-by, Tested-by tags,
> 
> v9:
>   - Add minlen parameter to getprop(), for better validation of
> memory properties,
>   - Only use start of memory from the DTB if masking would yield an
> out-of-range address, to fix kdump, as suggested by Ard.
> 
> v8:
>   - Rebase on top of commit 893ab00439a45513 ("kbuild: remove cc-option
> test of -fno-stack-protector"),
> 
> v7:
>   - Rebase on top of commit 161e04a5bae58a65 ("ARM: decompressor: split
> off _edata and stack base into separate object"),
> 
> v6:
>   - Rebase on top of commit 7ae4a78daacf240a ("ARM: 8969/1:
> decompressor: simplify libfdt builds"),
>   - Include  instead of ,
> 
> v5:
>   - Add Tested-by, Reviewed-by,
>   - Round up start of memory to satisfy 16 MiB alignment rule,
> 
> v4:
>   - Fix stack location after commit 184bf653a7a452c1 ("ARM:
> decompressor: factor out routine to obtain the inflated image
> size"),
> 
> v3:
>   - Add Reviewed-by,
>   - Fix ATAGs with appended DTB,
>   - Add Tested-by,
> 
> v2:
>   - Use "cmp r0, #-1", instead of "cmn r0, #1",
>   - Add missing stack setup,
>   - Support appended DTB.
> ---
>  arch/arm/Kconfig  |   7 +-
>  arch/arm/boot/compressed/Makefile |   5 +-
>  .../arm/boot/compressed/fdt_check_mem_start.c | 131 ++
>  arch/arm/boot/compressed/head.S   |  32 -
>  4 files changed, 168 insertions(+), 7 deletions(-)
>  create mode 100644 arch/arm/boot/compressed/fdt_check_mem_start.c
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index b2bf019dcefa6379..c341aa6fa862455c 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1908,9 +1908,10 @@ config AUTO_ZRELADDR
>   help
> ZRELADDR is the physical address where the decompressed kernel
> image will be placed. If AUTO_ZRELADDR is selected, the address
> -   will be determined at run-time by masking the current IP with
> -   0xf8000000. This assumes the zImage being placed in the first 128MB
> -   from start of memory.
> +   will be determined at run-time, either by masking the current IP
> +   with 0xf8000000, or, if invalid, from the DTB passed in r2.
> +   This assumes the zImage being placed in the first 128MB from
> +   start of memory.
>  
>  config EFI_STUB
>   bool
> diff --git a/arch/arm/boot/compressed/Makefile 
> b/arch/arm/boot/compressed/Makefile
> index a815b1ae990d2d48..7361d45dc2ad603e 100644
> --- a/arch/arm/boot/compressed/Makefile
> +++ b/arch/arm/boot/compressed/Makefile
> @@ -87,10 +87,13 @@ libfdt_o

Re: [PATCH v10 2/3] ARM: uncompress: Add OF_DT_MAGIC macro

2020-12-03 Thread Nicolas Pitre
On Thu, 3 Dec 2020, Geert Uytterhoeven wrote:

> The DTB magic marker is stored as a 32-bit big-endian value, and thus
> depends on the CPU's endianness.  Add a macro to define this value in
> native endianness, to reduce #ifdef clutter and (future) duplication.
> 
> Signed-off-by: Geert Uytterhoeven 

Reviewed-by: Nicolas Pitre 


> ---
> v10:
>   - New.
> ---
>  arch/arm/boot/compressed/head.S | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index aabdc544c03aafdc..d9cce7238a365081 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -11,6 +11,12 @@
>  
>  #include "efi-header.S"
>  
> +#ifdef __ARMEB__
> +#define OF_DT_MAGIC 0xd00dfeed
> +#else
> +#define OF_DT_MAGIC 0xedfe0dd0
> +#endif
> +
>   AR_CLASS(   .arch   armv7-a )
>   M_CLASS(.arch   armv7-m )
>  
> @@ -335,11 +341,7 @@ restart: adr r0, LC1
>   */
>  
>   ldr lr, [r6, #0]
> -#ifndef __ARMEB__
> - ldr r1, =0xedfe0dd0 @ sig is 0xd00dfeed big endian
> -#else
> - ldr r1, =0xd00dfeed
> -#endif
> + ldr r1, =OF_DT_MAGIC
>   cmp lr, r1
>   bne dtb_check_done  @ not found
>  
> -- 
> 2.25.1
> 
> 


Re: [PATCH v10 1/3] ARM: uncompress: Add be32tocpu macro

2020-12-03 Thread Nicolas Pitre
On Thu, 3 Dec 2020, Geert Uytterhoeven wrote:

> DTB stores all values as 32-bit big-endian integers.
> Add a macro to convert such values to native CPU endianness, to reduce
> duplication.
> 
> Signed-off-by: Geert Uytterhoeven 

I agree with Ard's suggestions. In any case:

Reviewed-by: Nicolas Pitre 

> ---
> v10:
>   - New.
> ---
>  arch/arm/boot/compressed/head.S | 26 --
>  1 file changed, 12 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index 835ce64f1674c9a2..aabdc544c03aafdc 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -165,6 +165,16 @@
>   orr \res, \res, \tmp1, lsl #24
>   .endm
>  
> + .macro  be32tocpu, val, tmp
> +#ifndef __ARMEB__
> + /* convert to little endian */
> + eor \tmp, \val, \val, ror #16
> + bic \tmp, \tmp, #0x00ff0000
> + mov \val, \val, ror #8
> + eor \val, \val, \tmp, lsr #8
> +#endif
> + .endm
> +
>   .section ".start", "ax"
>  /*
>   * sort out different calling conventions
> @@ -345,13 +355,7 @@ restart: adr r0, LC1
>  
>   /* Get the initial DTB size */
>   ldr r5, [r6, #4]
> -#ifndef __ARMEB__
> - /* convert to little endian */
> - eor r1, r5, r5, ror #16
> - bic r1, r1, #0x00ff0000
> - mov r5, r5, ror #8
> - eor r5, r5, r1, lsr #8
> -#endif
> + be32tocpu r5, r1
>   dbgadtb r6, r5
>   /* 50% DTB growth should be good enough */
>   add r5, r5, r5, lsr #1
> @@ -403,13 +407,7 @@ restart: adr r0, LC1
>  
>   /* Get the current DTB size */
>   ldr r5, [r6, #4]
> -#ifndef __ARMEB__
> - /* convert r5 (dtb size) to little endian */
> - eor r1, r5, r5, ror #16
> - bic r1, r1, #0x00ff0000
> - mov r5, r5, ror #8
> - eor r5, r5, r1, lsr #8
> -#endif
> + be32tocpu r5, r1
>  
>   /* preserve 64-bit alignment */
>   add r5, r5, #7
> -- 
> 2.25.1
> 
> 


Re: [PATCH v2] RISC-V: enable XIP

2020-12-03 Thread Nicolas Pitre
On Thu, 3 Dec 2020, Vitaly Wool wrote:

> Introduce XIP (eXecute In Place) support for RISC-V platforms.
> It allows code to be executed directly from non-volatile storage
> directly addressable by the CPU, such as QSPI NOR flash which can
> be found on many RISC-V platforms. This makes way for significant
> optimization of RAM footprint. The XIP kernel is not compressed
> since it has to run directly from flash, so it will occupy more
> space on the non-volatile storage. The physical flash address
> used to link the kernel object files and for storing it has to
> be known at compile time and is represented by a Kconfig option.
> 
> XIP on RISC-V will currently only work on MMU-enabled kernels.
> 
> Changed in v2:
> - dedicated macro for XIP address fixup when MMU is not enabled yet
>   o both for 32-bit and 64-bit RISC-V
> - SP is explicitly set to a safe place in RAM before __copy_data call
> - removed redundant alignment requirements in vmlinux-xip.lds.S
> - changed long -> uintptr_t typecast in __XIP_FIXUP macro.
> 
> Signed-off-by: Vitaly Wool 

Looks fine to me, at least for those parts I'm familiar with (I can't 
really comment on rv asm styling and page setup).

Acked-by: Nicolas Pitre 

If you have a chance to test cramfs' user space XIP support that would 
be nice. So far it was tested only on ARM. See 
Documentation/filesystems/cramfs.rst for instructions.



> ---
>  arch/riscv/Kconfig  |  40 -
>  arch/riscv/Makefile |   8 +-
>  arch/riscv/boot/Makefile|  14 ++-
>  arch/riscv/include/asm/pgtable.h|  54 ++--
>  arch/riscv/kernel/head.S|  54 +++-
>  arch/riscv/kernel/head.h|   3 +
>  arch/riscv/kernel/setup.c   |   2 +-
>  arch/riscv/kernel/vmlinux-xip.lds.S | 132 
>  arch/riscv/kernel/vmlinux.lds.S |   6 ++
>  arch/riscv/mm/init.c| 132 ++--
>  10 files changed, 423 insertions(+), 22 deletions(-)
>  create mode 100644 arch/riscv/kernel/vmlinux-xip.lds.S
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 44377fd7860e..c9bef841c884 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -395,7 +395,7 @@ config EFI_STUB
>  
>  config EFI
>   bool "UEFI runtime support"
> - depends on OF
> + depends on OF && !XIP_KERNEL
>   select LIBFDT
>   select UCS2_STRING
>   select EFI_PARAMS_FROM_FDT
> @@ -412,6 +412,44 @@ config EFI
> allow the kernel to be booted as an EFI application. This
> is only useful on systems that have UEFI firmware.
>  
> +config XIP_KERNEL
> + bool "Kernel Execute-In-Place from ROM"
> + depends on MMU
> + help
> +   Execute-In-Place allows the kernel to run from non-volatile storage
> +   directly addressable by the CPU, such as NOR flash. This saves RAM
> +   space since the text section of the kernel is not loaded from flash
> +   to RAM.  Read-write sections, such as the data section and stack,
> +   are still copied to RAM.  The XIP kernel is not compressed since
> +   it has to run directly from flash, so it will take more space to
> +   store it.  The flash address used to link the kernel object files,
> +   and for storing it, is configuration dependent. Therefore, if you
> +   say Y here, you must know the proper physical address where to
> +   store the kernel image depending on your own flash memory usage.
> +
> +   Also note that the make target becomes "make xipImage" rather than
> +   "make zImage" or "make Image".  The final kernel binary to put in
> +   ROM memory will be arch/riscv/boot/xipImage.
> +
> +   If unsure, say N.
> +
> +config XIP_PHYS_ADDR
> + hex "XIP Kernel Physical Location"
> + depends on XIP_KERNEL
> + default "0x2100"
> + help
> +   This is the physical address in your flash memory the kernel will
> +   be linked for and stored to.  This address is dependent on your
> +   own flash usage.
> +
> +config XIP_PHYS_RAM_BASE
> + hex "Platform Physical RAM address"
> + depends on XIP_KERNEL
> + default "0x8000"
> + help
> +   This is the physical address of RAM in the system. It has to be
> +   explicitly specified to run early relocations of read-write data
> +   from flash to RAM.
>  endmenu
>  
>  config BUILTIN_DTB
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 0289a97325d1..387afe973530 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -70,7 +70,11 @@ CHECKFLA

Re: [PATCH] arch/riscv: enable XIP

2020-12-02 Thread Nicolas Pitre
On Wed, 2 Dec 2020, Vitaly Wool wrote:

> On Wed, Dec 2, 2020 at 7:06 PM Nicolas Pitre  wrote:
> >
> > On Wed, 2 Dec 2020, Vitaly Wool wrote:
> >
> > > Introduce XIP (eXecute In Place) support for RISC-V platforms.
> > > It allows code to be executed directly from non-volatile storage
> > > directly addressable by the CPU, such as QSPI NOR flash which can
> > > be found on many RISC-V platforms. This makes way for significant
> > > optimization of RAM footprint. The XIP kernel is not compressed
> > > since it has to run directly from flash, so it will occupy more
> > > space on the non-volatile storage. The physical flash address
> > > used to link the kernel object files and for storing it has to
> > > be known at compile time and is represented by a Kconfig option.
> > >
> > > XIP on RISC-V will currently only work on MMU-enabled kernels.
> > >
> > > Signed-off-by: Vitaly Wool 
> >
> > That's nice!
> >
> > Suggestion for a future enhancement:
> > To save on ROM storage, and given that the .data segment has to be
> > copied to RAM anyway, you could store .data compressed and decompress it
> > to RAM instead. See commit ca8b5d97d6bf for inspiration. In fact, many
> > parts there could be shared.
> 
> Thanks! That's in my TODO list.
> 
> > More comments below.
> >
> > > +#define __XIP_FIXUP(addr) \
> > > + (((long)(addr) >= CONFIG_XIP_PHYS_ADDR && \
> > > +   (long)(addr) <= CONFIG_XIP_PHYS_ADDR + SZ_16M) ? \
> > > + (long)(addr) - CONFIG_XIP_PHYS_ADDR + CONFIG_XIP_PHYS_RAM_BASE - 
> > > XIP_OFFSET : \
> > > + (long)(addr))
> >
> > Here you should cast to unsigned long instead.
> 
> Right, or (just thought of it) uintptr_t for that matter. Does that sound 
> right?

Sure.
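
For reference, one possible shape of the macro with that change, purely
illustrative and not the version that was eventually posted:

#define __XIP_FIXUP(addr) ({						\
	uintptr_t __a = (uintptr_t)(addr);				\
	(__a >= CONFIG_XIP_PHYS_ADDR &&					\
	 __a <= CONFIG_XIP_PHYS_ADDR + SZ_16M) ?			\
		__a - CONFIG_XIP_PHYS_ADDR +				\
			CONFIG_XIP_PHYS_RAM_BASE - XIP_OFFSET :		\
		__a; })

The statement expression also has the side benefit of evaluating addr
only once.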

> > > +#ifdef CONFIG_XIP_KERNEL
> > > + la a0, _trampoline_pg_dir
> > > + lw t0, _xip_fixup
> > > + add a0, a0, t0
> > [...]
> > > +_xip_fixup:
> > > + .dword CONFIG_XIP_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR - XIP_OFFSET
> > > +#endif
> >
> > Here _xip_fixup is a dword but you're loading it as a word.
> > This won't work for both rv32 and rv64.
> 
> Well, at this point I believe it does, as long as we use little
> endian.

... and the fixup is not larger than 32 bits.

> 64-bit version has been verified.

Won't work if ROM and RAM are far-enough apart.

> I do not argue though that it isn't nice and should be fixed.
> 
> > > +SECTIONS
> > > +{
> > > + /* Beginning of code and text segment */
> > > + . = XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR);
> > > + _xiprom = .;
> > > + _start = .;
> > > + HEAD_TEXT_SECTION
> > > + INIT_TEXT_SECTION(PAGE_SIZE)
> > > + /* we have to discard exit text and such at runtime, not link time 
> > > */
> > > + .exit.text :
> > > + {
> > > + EXIT_TEXT
> > > + }
> > > +
> > > + .text : {
> > > + _text = .;
> > > + _stext = .;
> > > + TEXT_TEXT
> > > + SCHED_TEXT
> > > + CPUIDLE_TEXT
> > > + LOCK_TEXT
> > > + KPROBES_TEXT
> > > + ENTRY_TEXT
> > > + IRQENTRY_TEXT
> > > + SOFTIRQENTRY_TEXT
> > > + *(.fixup)
> > > + _etext = .;
> > > + }
> > > + RO_DATA(L1_CACHE_BYTES)
> > > + .srodata : {
> > > + *(.srodata*)
> > > + }
> > > + .init.rodata : {
> > > + INIT_SETUP(16)
> > > + INIT_CALLS
> > > + CON_INITCALL
> > > + INIT_RAM_FS
> > > + }
> > > + _exiprom = ALIGN(PAGE_SIZE);/* End of XIP ROM area */
> >
> > Why do you align this to a page size?
> 
> TBH I just cut the corners here and below, did not have to worry about
> partial pages and such.

You're wasting ROM space though.

> > > +
> > > +
> > > +/*
> > > + * From this point, stuff is considered writable and will be copied to 
> > > RAM
> > > + */
> > > + __data_loc = ALIGN(PAGE_SIZE);  /* location in file */
> >
> > Same question here?
> >
> > > + . = PAGE_OFFSET;/* location in memory */
> > > +
> > > + _sdata = .; /* Start of data section */
> > > + _data = .;
> > > + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
> > > + _edata = .;
> > > + __start_ro_after_init = .;
> > > + .data.ro_after_init : AT(ADDR(.data.ro_after_init) - LOAD_OFFSET) {
> > > + *(.data..ro_after_init)
> > > + }
> > > + __end_ro_after_init = .;
> > > +
> > > + . = ALIGN(PAGE_SIZE);
> >
> > And again here?

Forget this particular one. It is fine.


Nicolas


Re: [PATCH] arch/riscv: enable XIP

2020-12-02 Thread Nicolas Pitre
On Wed, 2 Dec 2020, Vitaly Wool wrote:

> Introduce XIP (eXecute In Place) support for RISC-V platforms.
> It allows code to be executed directly from non-volatile storage
> directly addressable by the CPU, such as QSPI NOR flash which can
> be found on many RISC-V platforms. This makes way for significant
> optimization of RAM footprint. The XIP kernel is not compressed
> since it has to run directly from flash, so it will occupy more
> space on the non-volatile storage. The physical flash address
> used to link the kernel object files and for storing it has to
> be known at compile time and is represented by a Kconfig option.
> 
> XIP on RISC-V will currently only work on MMU-enabled kernels.
> 
> Signed-off-by: Vitaly Wool 

That's nice!

Suggestion for a future enhancement:
To save on ROM storage, and given that the .data segment has to be 
copied to RAM anyway, you could store .data compressed and decompress it 
to RAM instead. See commit ca8b5d97d6bf for inspiration. In fact, many 
parts there could be shared.

More comments below.

> +#define __XIP_FIXUP(addr) \
> + (((long)(addr) >= CONFIG_XIP_PHYS_ADDR && \
> +   (long)(addr) <= CONFIG_XIP_PHYS_ADDR + SZ_16M) ? \
> + (long)(addr) - CONFIG_XIP_PHYS_ADDR + CONFIG_XIP_PHYS_RAM_BASE - 
> XIP_OFFSET : \
> + (long)(addr))

Here you should cast to unsigned long instead.

> +#ifdef CONFIG_XIP_KERNEL
> + la a0, _trampoline_pg_dir
> + lw t0, _xip_fixup
> + add a0, a0, t0
[...]
> +_xip_fixup:
> + .dword CONFIG_XIP_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR - XIP_OFFSET
> +#endif

Here _xip_fixup is a dword but you're loading it as a word.
This won't work for both rv32 and rv64.

> +SECTIONS
> +{
> + /* Beginning of code and text segment */
> + . = XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR);
> + _xiprom = .;
> + _start = .;
> + HEAD_TEXT_SECTION
> + INIT_TEXT_SECTION(PAGE_SIZE)
> + /* we have to discard exit text and such at runtime, not link time */
> + .exit.text :
> + {
> + EXIT_TEXT
> + }
> +
> + .text : {
> + _text = .;
> + _stext = .;
> + TEXT_TEXT
> + SCHED_TEXT
> + CPUIDLE_TEXT
> + LOCK_TEXT
> + KPROBES_TEXT
> + ENTRY_TEXT
> + IRQENTRY_TEXT
> + SOFTIRQENTRY_TEXT
> + *(.fixup)
> + _etext = .;
> + }
> + RO_DATA(L1_CACHE_BYTES)
> + .srodata : {
> + *(.srodata*)
> + }
> + .init.rodata : {
> + INIT_SETUP(16)
> + INIT_CALLS
> + CON_INITCALL
> + INIT_RAM_FS
> + }
> + _exiprom = ALIGN(PAGE_SIZE);/* End of XIP ROM area */

Why do you align this to a page size?

> +
> +
> +/*
> + * From this point, stuff is considered writable and will be copied to RAM
> + */
> + __data_loc = ALIGN(PAGE_SIZE);  /* location in file */

Same question here?

> + . = PAGE_OFFSET;/* location in memory */
> +
> + _sdata = .; /* Start of data section */
> + _data = .;
> + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
> + _edata = .;
> + __start_ro_after_init = .;
> + .data.ro_after_init : AT(ADDR(.data.ro_after_init) - LOAD_OFFSET) {
> + *(.data..ro_after_init)
> + }
> + __end_ro_after_init = .;
> +
> + . = ALIGN(PAGE_SIZE);

And again here?

> +#ifdef CONFIG_XIP_KERNEL
> +/* called from head.S with MMU off */
> +asmlinkage void __init __copy_data(void)
> +{
> + void *from = (void *)(&_sdata);
> + void *end = (void *)(&_end);
> + void *to = (void *)CONFIG_XIP_PHYS_RAM_BASE;
> + size_t sz = (size_t)(end - from);
> +
> + memcpy(to, from, sz);
> +}
> +#endif

Where is the stack located when this executes? The stack for the init 
task is typically found within the .data area. At least on ARM it is. 
You don't want to overwrite your stack here.


Nicolas


Re: [PATCH] __div64_32(): straighten up inline asm constraints

2020-11-30 Thread Nicolas Pitre
On Mon, 30 Nov 2020, Nick Desaulniers wrote:

> On Mon, Nov 30, 2020 at 11:05 AM Nicolas Pitre  wrote:
> 
> > +   __rem = __n >> 32;
> > *n = __res;
> > return __rem;
> 
> The above 3 statement could be:
> 
> ```
> *n = __res;
> return __n >> 32;
> ```

They could. However the compiler doesn't care, and the extra line makes 
it more obvious that the remainder is the high part of __n. So, 
semantically the extra line has value.

Thanks for the review.


Nicolas


[PATCH] __div64_32(): straighten up inline asm constraints

2020-11-30 Thread Nicolas Pitre
The ARM version of __div64_32() encapsulates a call to __do_div64 with 
non-standard argument passing. In particular, __n is a 64-bit input 
argument assigned to r0-r1 and __rem is an output argument sharing half 
of that r0-r1 register pair.

With __n being an input argument, the compiler is within its rights to 
presume that r0-r1 would still hold the value of __n past the inline 
assembly statement. Normally, the compiler would have assigned 
non-overlapping registers to __n and __rem if the value of __n were 
needed again.

However, here we enforce our own register assignment and gcc fails to 
notice the conflict. In practice this doesn't cause any problem as __n 
is considered dead after the asm statement and *n is overwritten. 
However this is not always guaranteed and clang rightfully complains.

Let's fix it properly by making __n into an input-output variable. This 
makes it clear that those registers representing __n have been modified. 
Then we can extract __rem as the high part of __n with plain C code.

This asm constraint "abuse" was likely relied upon back when gcc didn't 
handle 64-bit values optimally. Turns out that gcc is now able to 
optimize things and produces the same code with this patch applied.

Signed-off-by: Nicolas Pitre 
Reviewed-by: Ard Biesheuvel 
Tested-by: Ard Biesheuvel 
---

This is related to the thread titled "[RESEND,PATCH] ARM: fix 
__div64_32() error when compiling with clang". My limited compile test 
with clang appears to make it happy. If no more comments I'll push this 
to RMK's patch system.

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7..595e538f5b 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -21,29 +21,20 @@
  * assembly implementation with completely non standard calling convention
  * for arguments and results (beware).
  */
-
-#ifdef __ARMEB__
-#define __xh "r0"
-#define __xl "r1"
-#else
-#define __xl "r0"
-#define __xh "r1"
-#endif
-
 static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
 {
register unsigned int __base  asm("r4") = base;
register unsigned long long __n   asm("r0") = *n;
register unsigned long long __res asm("r2");
-   register unsigned int __rem   asm(__xh);
-   asm(__asmeq("%0", __xh)
+   unsigned int __rem;
+   asm(__asmeq("%0", "r0")
__asmeq("%1", "r2")
-   __asmeq("%2", "r0")
-   __asmeq("%3", "r4")
+   __asmeq("%2", "r4")
"bl __do_div64"
-   : "=r" (__rem), "=r" (__res)
-   : "r" (__n), "r" (__base)
+   : "+r" (__n), "=r" (__res)
+   : "r" (__base)
: "ip", "lr", "cc");
+   __rem = __n >> 32;
*n = __res;
return __rem;
 }


Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang

2020-11-30 Thread Nicolas Pitre
On Mon, 30 Nov 2020, Ard Biesheuvel wrote:

> On Mon, 30 Nov 2020 at 16:51, Nicolas Pitre  wrote:
> 
> > Here's my version of the fix which should be correct. Warning: this
> > is completely untested, but should in theory produce the same code on
> > modern gcc.
> >
> > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> > index 898e9c78a7..595e538f5b 100644
> > --- a/arch/arm/include/asm/div64.h
> > +++ b/arch/arm/include/asm/div64.h
> > @@ -21,29 +21,20 @@
> >   * assembly implementation with completely non standard calling convention
> >   * for arguments and results (beware).
> >   */
> > -
> > -#ifdef __ARMEB__
> > -#define __xh "r0"
> > -#define __xl "r1"
> > -#else
> > -#define __xl "r0"
> > -#define __xh "r1"
> > -#endif
> > -
> >  static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
> >  {
> > register unsigned int __base  asm("r4") = base;
> > register unsigned long long __n   asm("r0") = *n;
> > register unsigned long long __res asm("r2");
> > -   register unsigned int __rem   asm(__xh);
> > -   asm(__asmeq("%0", __xh)
> > +   unsigned int __rem;
> > +   asm(__asmeq("%0", "r0")
> > __asmeq("%1", "r2")
> > -   __asmeq("%2", "r0")
> > -   __asmeq("%3", "r4")
> > +   __asmeq("%2", "r4")
> > "bl __do_div64"
> > -   : "=r" (__rem), "=r" (__res)
> > -   : "r" (__n), "r" (__base)
> > +   : "+r" (__n), "=r" (__res)
> > +   : "r" (__base)
> > : "ip", "lr", "cc");
> > +   __rem = __n >> 32;
> 
> This treats {r0, r1} as a {low, high} pair, regardless of endianness,
> and so it puts the value of r0 into r1. Doesn't that mean the shift
> should only be done on little endian?

Not quite. r0-r1 = low-high is for little endian. Then "__n >> 32" is 
actually translated into "mov r0, r1" to move it into __rem and returned 
through r0.

On big endian it is r0-r1 = high-low. Here "__n >> 32" picks r0 and 
moves it to __rem which is returned through r0 so no extra instruction 
needed.

Of course the function is inlined so r0 can be anything, or optimized 
away if __rem is not used.
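
Spelled out as a standalone C equivalent, for illustration only (outside
the inline asm, so the compiler is free to place things differently):

#include <stdint.h>

static inline uint32_t high_word(uint64_t n)
{
	/*
	 * With the 64-bit value living in r0-r1 as in __div64_32(): on
	 * little endian ARM r0 is the low word and r1 the high word, so
	 * this shift is a single "mov r0, r1"; on big endian r0 already
	 * holds the high word and no instruction is needed.
	 */
	return (uint32_t)(n >> 32);
}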


Nicolas


Re: [RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang

2020-11-30 Thread Nicolas Pitre
On Mon, 30 Nov 2020, Ard Biesheuvel wrote:

> (+ Nico)
> 
> On Mon, 30 Nov 2020 at 11:11, Ard Biesheuvel  wrote:
> >
> > On Mon, 23 Nov 2020 at 08:39, Antony Yu  wrote:
> > >
> > > __do_div64 clobbers the input register r0 in little endian system.
> > > According to the inline assembly document, if an input operand is
> > > modified, it should be tied to a output operand. This patch can
> > > prevent compilers from reusing r0 register after asm statements.
> > >
> > > Signed-off-by: Antony Yu 
> > > ---
> > >  arch/arm/include/asm/div64.h | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> > > index 898e9c78a7e7..809efc51e90f 100644
> > > --- a/arch/arm/include/asm/div64.h
> > > +++ b/arch/arm/include/asm/div64.h
> > > @@ -39,9 +39,10 @@ static inline uint32_t __div64_32(uint64_t *n, 
> > > uint32_t base)
> > > asm(__asmeq("%0", __xh)
> > > __asmeq("%1", "r2")
> > > __asmeq("%2", "r0")
> > > -   __asmeq("%3", "r4")
> > > +   __asmeq("%3", "r0")
> > > +   __asmeq("%4", "r4")
> > > "bl __do_div64"
> > > -   : "=r" (__rem), "=r" (__res)
> > > +   : "=r" (__rem), "=r" (__res), "=r" (__n)
> > > : "r" (__n), "r" (__base)
> > > : "ip", "lr", "cc");
> > > *n = __res;
> > > --
> > > 2.23.0
> > >
> >
> > Agree that using r0 as an input operand only is incorrect, and not
> > only on Clang. The compiler might assume that r0 will retain its value
> > across the asm() block, which is obviously not the case.

You're right.

This was done like that most likely to work around some stupid code 
generation with "__n >> 32" while using gcc from about 20 years ago. IOW 
I don't exactly remember why I did it like that, but it is certainly 
flawed.

Here's my version of the fix which should be correct. Warning: this 
is completely untested, but should in theory produce the same code on 
modern gcc.

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7..595e538f5b 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -21,29 +21,20 @@
  * assembly implementation with completely non standard calling convention
  * for arguments and results (beware).
  */
-
-#ifdef __ARMEB__
-#define __xh "r0"
-#define __xl "r1"
-#else
-#define __xl "r0"
-#define __xh "r1"
-#endif
-
 static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
 {
register unsigned int __base  asm("r4") = base;
register unsigned long long __n   asm("r0") = *n;
register unsigned long long __res asm("r2");
-   register unsigned int __rem   asm(__xh);
-   asm(__asmeq("%0", __xh)
+   unsigned int __rem;
+   asm(__asmeq("%0", "r0")
__asmeq("%1", "r2")
-   __asmeq("%2", "r0")
-   __asmeq("%3", "r4")
+   __asmeq("%2", "r4")
"bl __do_div64"
-   : "=r" (__rem), "=r" (__res)
-   : "r" (__n), "r" (__base)
+   : "+r" (__n), "=r" (__res)
+   : "r" (__base)
: "ip", "lr", "cc");
+   __rem = __n >> 32;
*n = __res;
return __rem;
 }

I'll submit it if someone confirms it works.


Nicolas


Re: [PATCH] cramfs: Kconfig: Fix spelling mistake "adresssed" -> "addressed"

2020-11-26 Thread Nicolas Pitre
On Thu, 26 Nov 2020, Colin King wrote:

> From: Colin Ian King 
> 
> There is a spelling mistake in the Kconfig help text. Fix it.
> 
> Signed-off-by: Colin Ian King 

Acked-by: Nicolas Pitre 


> ---
>  fs/cramfs/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/cramfs/Kconfig b/fs/cramfs/Kconfig
> index d98cef0dbb6b..a8af8c6ac15d 100644
> --- a/fs/cramfs/Kconfig
> +++ b/fs/cramfs/Kconfig
> @@ -38,7 +38,7 @@ config CRAMFS_MTD
>   default y if !CRAMFS_BLOCKDEV
>   help
> This option allows the CramFs driver to load data directly from
> -   a linear adressed memory range (usually non volatile memory
> +   a linear addressed memory range (usually non volatile memory
> like flash) instead of going through the block device layer.
> This saves some memory since no intermediate buffering is
> necessary.
> -- 
> 2.29.2
> 
> 


Re: [PATCH][next] i3c/master: Fix uninitialized variable next_addr

2020-11-24 Thread Nicolas Pitre
On Tue, 24 Nov 2020, Colin King wrote:

> From: Colin Ian King 
> 
> The variable next_addr is not initialized and is being used in a call
> to i3c_master_get_free_addr as a starting point to find the next address.
> Fix this by initializing next_addr to 0 to avoid an uninitialized garbage
> starting address from being used.
> 
> Addresses-Coverity: ("Uninitialized scalar variable")
> Fixes: 9ad9a52cce28 ("i3c/master: introduce the mipi-i3c-hci driver")
> Signed-off-by: Colin Ian King 

Acked-by: Nicolas Pitre 





> ---
>  drivers/i3c/master/mipi-i3c-hci/cmd_v1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/i3c/master/mipi-i3c-hci/cmd_v1.c 
> b/drivers/i3c/master/mipi-i3c-hci/cmd_v1.c
> index 6dd234a82892..d97c3175e0e2 100644
> --- a/drivers/i3c/master/mipi-i3c-hci/cmd_v1.c
> +++ b/drivers/i3c/master/mipi-i3c-hci/cmd_v1.c
> @@ -293,7 +293,7 @@ static int hci_cmd_v1_daa(struct i3c_hci *hci)
>  {
>   struct hci_xfer *xfer;
>   int ret, dat_idx = -1;
> - u8 next_addr;
> + u8 next_addr = 0;
>   u64 pid;
>   unsigned int dcr, bcr;
>   DECLARE_COMPLETION_ONSTACK(done);
> -- 
> 2.29.2
> 
> 


Re: [PATCH 12/17] fs/cramfs: Use memcpy_from_page()

2020-11-24 Thread Nicolas Pitre
On Mon, 23 Nov 2020, ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> Remove open coded kmap/memcpy/kunmap and use mempcy_from_page() instead.
> 
> Cc: Nicolas Pitre 
> Signed-off-by: Ira Weiny 

Acked-by: Nicolas Pitre 


> ---
>  fs/cramfs/inode.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 4b90cfd1ec36..996a3a32a01f 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -247,8 +247,7 @@ static void *cramfs_blkdev_read(struct super_block *sb, 
> unsigned int offset,
>   struct page *page = pages[i];
>  
>   if (page) {
> - memcpy(data, kmap(page), PAGE_SIZE);
> - kunmap(page);
> + memcpy_from_page(data, page, 0, PAGE_SIZE);
>   put_page(page);
>   } else
>   memset(data, 0, PAGE_SIZE);
> -- 
> 2.28.0.rc0.12.gb6a658bd00c9
> 
> 


Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()

2020-10-13 Thread Nicolas Pitre
On Fri, 9 Oct 2020, ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> The kmap() calls in this FS are localized to a single thread.  To avoid
> the over head of global PKRS updates use the new kmap_thread() call.
> 
> Cc: Nicolas Pitre 
> Signed-off-by: Ira Weiny 

Acked-by: Nicolas Pitre 

>  fs/cramfs/inode.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 912308600d39..003c014a42ed 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block *sb, 
> unsigned int offset,
>   struct page *page = pages[i];
>  
>   if (page) {
> - memcpy(data, kmap(page), PAGE_SIZE);
> - kunmap(page);
> + memcpy(data, kmap_thread(page), PAGE_SIZE);
> + kunmap_thread(page);
>   put_page(page);
>   } else
>   memset(data, 0, PAGE_SIZE);
> @@ -826,7 +826,7 @@ static int cramfs_readpage(struct file *file, struct page 
> *page)
>  
>   maxblock = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
>   bytes_filled = 0;
> - pgdata = kmap(page);
> + pgdata = kmap_thread(page);
>  
>   if (page->index < maxblock) {
>   struct super_block *sb = inode->i_sb;
> @@ -914,13 +914,13 @@ static int cramfs_readpage(struct file *file, struct 
> page *page)
>  
>   memset(pgdata + bytes_filled, 0, PAGE_SIZE - bytes_filled);
>   flush_dcache_page(page);
> - kunmap(page);
> + kunmap_thread(page);
>   SetPageUptodate(page);
>   unlock_page(page);
>   return 0;
>  
>  err:
> - kunmap(page);
> + kunmap_thread(page);
>   ClearPageUptodate(page);
>   SetPageError(page);
>   unlock_page(page);
> -- 
> 2.28.0.rc0.12.gb6a658bd00c9
> 
> 


Re: {standard input}:1174: Error: inappropriate arguments for opcode 'mpydu'

2020-09-28 Thread Nicolas Pitre
On Sun, 27 Sep 2020, Rong Chen wrote:

> Hi Nicolas,
> 
> Thanks for the feedback, the error still remains with gcc 10.2.0:

I've created the simplest test case that can be. You won't believe it.

Test case:

$ cat test.c
unsigned int test(unsigned int x, unsigned long long y)
{
y /= 0x2000;
if (x > 1)
y *= x;
return y;
}
$ export 
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/0day/gcc-9.3.0-nolibc/arc-elf/libexec/gcc/arc-elf/9.3.0
$ ~/0day/gcc-9.3.0-nolibc/arc-elf/bin/arc-elf-gcc -mcpu=hs38 -mbig-endian -O2 
-c test.c
/tmp/cc0GAomh.s: Assembler messages:
/tmp/cc0GAomh.s:21: Error: inappropriate arguments for opcode 'mpydu'

I know nothing about ARC. Please anyone take it over from here.


Nicolas


Re: {standard input}:1174: Error: inappropriate arguments for opcode 'mpydu'

2020-08-29 Thread Nicolas Pitre
This looks like a buggy compiler to me.


On Sun, 30 Aug 2020, kernel test robot wrote:

> Hi Nicolas,
> 
> FYI, the error/warning still remains.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   e77aee1326f7691763aa968eee2f57db37840b9d
> commit: 602828c1aade576ac5f3fbd59b4eb014c5fc2414 __div64_const32(): improve 
> the generic C version
> date:   12 months ago
> config: arc-allmodconfig (attached as .config)
> compiler: arceb-elf-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 602828c1aade576ac5f3fbd59b4eb014c5fc2414
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> ARCH=arc 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> All errors (new ones prefixed by >>):
> 
>{standard input}: Assembler messages:
> >> {standard input}:1174: Error: inappropriate arguments for opcode 'mpydu'
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
> 


Re: [PATCH 3/4] i3c: master: svc: Add Silvaco I3C master driver

2020-08-05 Thread Nicolas Pitre
On Tue, 4 Aug 2020, Conor Culhane wrote:

> Miquel is passing 0 as the delay_us argument.

Good point.

> Is this still a concern?

Not in that case.


Nicolas


Re: [PATCH 3/4] i3c: master: svc: Add Silvaco I3C master driver

2020-08-03 Thread Nicolas Pitre
On Thu, 9 Jul 2020, Miquel Raynal wrote:

> Add support for Silvaco I3C dual-role IP. The master role is supported
> in SDR mode only. I2C transfers have not been tested but are shared
> because they are so close to the I3C transfers in terms of registers
> configuration.
> 
> Signed-off-by: Miquel Raynal 

I'm worried by constructs like these:

> +static int svc_i3c_master_handle_ibi(struct svc_i3c_master *master,
> +  struct i3c_dev_desc *dev)
> +{
> + struct svc_i3c_i2c_dev_data *data = i3c_dev_get_master_data(dev);
> + struct i3c_ibi_slot *slot;
> + unsigned int count;
> + u32 mdatactrl;
> + u8 *buf;
> + int ret;
> + u32 reg;
> +
> + spin_lock(&master->ibi.lock);
> +
> + slot = i3c_generic_ibi_get_free_slot(data->ibi_pool);
> + if (!slot) {
> + ret = -ENOSPC;
> + goto unlock;
> + }
> +
> + slot->len = 0;
> + buf = slot->data;
> +
> + ret = readl_poll_timeout(master->regs + SVC_I3C_MSTATUS, reg,
> +  reg & SVC_I3C_MINT_RXPEND, 0, 1000);

Here you're in atomic context due to the lock, and 
readl_poll_timeout() is built using usleep() which may ... sleep.
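
A minimal sketch of the non-sleeping alternative from <linux/iopoll.h>,
reusing the driver's own register and flag names, in case a nonzero poll
delay is ever wanted here:

    /* readl_poll_timeout_atomic() busy-waits with udelay() rather than
     * usleep_range(), so it is safe to call with spinlocks held */
    ret = readl_poll_timeout_atomic(master->regs + SVC_I3C_MSTATUS, reg,
                                    reg & SVC_I3C_MINT_RXPEND, 0, 1000);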

Also, is it actually possible for execution to reach this point if 
SVC_I3C_MINT_RXPEND is not set?

The rest looks reasonable to me.


Nicolas


Re: [RFC][PATCHES] converting FDPIC coredumps to regsets

2020-06-30 Thread Nicolas Pitre
On Tue, 30 Jun 2020, Al Viro wrote:

>   The obvious solution is to introduce struct elf_prstatus_fdpic
> and use that in binfmt_elf_fdpic.c, taking these fields out of the
> normal struct elf_prstatus.  Unfortunately, the damn thing is defined in
> include/uapi/linux/elfcore.h, so nominally it's a part of userland ABI.
> However, not a single userland program actually includes linux/elfcore.h.
> The reason is that the definition in there uses elf_gregset_t as a member,
> and _that_ is not defined anywhere in the exported headers.  It is defined
> in (libc) sys/procfs.h, but the same file defines struct elf_prstatus
> as well.  So if you try to include linux/elfcore.h without having already
> pulled sys/procfs.h, it'll break on incomplete type of a member.  And if
> you have pulled sys/procfs.h, it'll break on redefining a structure.
> IOW, it's not usable and it never had been; as the matter of fact,
> that's the reason sys/procfs.h had been introduced back in 1996.

Huh! That's convenient alright.

Acked-by: Nicolas Pitre 


Nicolas


Re: drivers/tty/vt/vt.c:1210:22: warning: comparison is always false due to limited range of data type

2020-06-17 Thread Nicolas Pitre
I don't know what to do with this.  IMHO the warning is useless in this 
particular case. It happens only on a few targets and depends on the 
kernel config, etc.

If the condition is always false in some cases then so be it. The 
compiler can optimize the unneeded code away I'm sure.
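
For reference, a trimmed stand-alone example (hypothetical code, not the
vt.c function itself) that triggers the same diagnostic whenever unsigned
int is 32-bit:

    static int check_size(unsigned int size)
    {
            /* size can never exceed 1ULL << 32 here, so gcc emits
             * "comparison is always false due to limited range of
             * data type" even though the test is harmless */
            if (size > (1ULL << 32))
                    return -EINVAL;
            return 0;
    }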

On Mon, 15 Jun 2020, kernel test robot wrote:

> Hi Nicolas,
> 
> FYI, the error/warning still remains.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   f82e7b57b5fc48199e2f26ffafe2f96f7338ad3d
> commit: 2717769e204e83e65b8819c5e2ef3e5b6639b270 vt: don't hardcode the mem 
> allocation upper bound
> date:   7 weeks ago
> :: branch date: 2 hours ago
> :: commit date: 7 weeks ago
> config: ia64-randconfig-r024-20200614 (attached as .config)
> compiler: ia64-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 2717769e204e83e65b8819c5e2ef3e5b6639b270
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> ARCH=ia64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> All warnings (new ones prefixed by >>, old ones prefixed by <<):
> 
> drivers/tty/vt/vt.c: In function 'vc_do_resize':
> >> drivers/tty/vt/vt.c:1210:22: warning: comparison is always false due to 
> >> limited range of data type [-Wtype-limits]
> 1210 |  if (new_screen_size > KMALLOC_MAX_SIZE)
> |  ^
> 
> # 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2717769e204e83e65b8819c5e2ef3e5b6639b270
> git remote add linus 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git remote update linus
> git checkout 2717769e204e83e65b8819c5e2ef3e5b6639b270
> vim +1210 drivers/tty/vt/vt.c
> 
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1163  
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1164  /**
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1165   *   
> vc_do_resize-   resizing method for the tty
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1166   *   
> @tty: tty being resized
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1167   *   
> @real_tty: real tty (different to tty if a pty/tty pair)
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1168   *   
> @vc: virtual console private data
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1169   *   
> @cols: columns
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1170   *   
> @lines: lines
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1171   *
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1172   *   
> Resize a virtual console, clipping according to the actual constraints.
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1173   *   
> If the caller passes a tty structure then update the termios winsize
> 3ad2f3fbb96142 drivers/char/vt.c   Daniel Mack 2010-02-03  1174   *   
> information and perform any necessary signal handling.
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1175   *
> 6a1c0680cf3ba9 drivers/tty/vt/vt.c Peter Hurley2013-06-15  1176   *   
> Caller must hold the console semaphore. Takes the termios rwsem and
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1177   *   
> ctrl_lock of the tty IFF a tty is passed.
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1178   */
> 8c9a9dd0fa3a26 drivers/char/vt.c   Alan Cox2008-08-15  1179  
> fc6f6238226e6d drivers/char/vt.c   Alan Cox2009-01-02  1180  
> static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
> fc6f6238226e6d drivers/char/vt.c   Alan Cox2009-01-02  1181   
> unsigned int cols, unsigned int lines)
> ^1da177e4c3f41 drivers/char/vt.c   Linus Torvalds  2005-04-16  1182  {
> ^1da177e4c3f41 drivers/char/vt.c   Linus Torvalds  2005-04-16  1183   
> unsigned long old_origin, new_origin, new_scr_end, rlth, rrem, err = 0;
> 9e0ba741aabdf1 drivers/char/vt.c   qiaochong   2010-08-09  1184   
> unsigned long end;
> d8ae7242718738 drivers/tty/vt/vt.c Nicolas Pitre   2018-06-26  1185   
> unsigned int old_rows, old_row_size, first_copied_row;
> ^1da177e4c3f41 drivers/char/vt.c   Linus Torvalds  2005-04-16  1186   

Re: [RFC 0/3] ARM: copy_{from,to}_user() for vmsplit 4g/4g

2020-06-12 Thread Nicolas Pitre
On Fri, 12 Jun 2020, afzal mohammed wrote:

> Performance wise, results are not encouraging, 'dd' on tmpfs results,
> 
> ARM Cortex-A8, BeagleBone White (256MiB RAM):
> w/o series - ~29.5 MB/s
> w/ series - ~20.5 MB/s
> w/ series & highmem disabled - ~21.2 MB/s
> 
> On Cortex-A15(2GiB RAM) in QEMU:
> w/o series - ~4 MB/s
> w/ series - ~2.6 MB/s
> 
> Roughly a one-third drop in performance. Disabling highmem improves
> performance only slightly.

Could you compare with CONFIG_UACCESS_WITH_MEMCPY as well?


Nicolas


Re: clean up kernel_{read,write} & friends v2

2020-06-05 Thread Nicolas Pitre
On Fri, 5 Jun 2020, Philippe Mathieu-Daudé wrote:

> Unfortunately refreshable braille displays have those "hardware
> limitations". 80-cell displays are very expensive.
> Visual impairment is rarely a "choice".
> Relaxing the 80-char limit makes it harder for blind developers
> to contribute.

Well, not really.

It is true that 80-cells displays are awfully expensive. IMHO they are 
also unwieldy due to their size: they are hardly portable, and they 
require your hands to move twice as far which may sometimes impair 
reading efficiency. So I never liked them.

My braille display has 40 cells only. So even with a 80-cells limit I 
always had to pan the display to see the whole line anyway.

My text console is set to 160x128. The trick here is to have the number 
of columns be a multiple of the braille display's width to avoid dead 
areas when panning to the right.

So if you ask me, I'm not against relaxing the 80 columns limit for 
code. What really matters to me is that I can stay clear of any GUI.


Nicolas

[PATCH] vt: fix unicode console freeing with a common interface

2020-05-02 Thread Nicolas Pitre
By directly using kfree() in different places we risk missing one if
it is switched to using vfree(), especially if the corresponding
vmalloc() is hidden away within a common abstraction.

Oh wait, that's exactly what happened here.

So let's fix this by creating a common abstraction for the free case
as well.

Signed-off-by: Nicolas Pitre 
Reported-by: syzbot+0bfda3ade1ee9288a...@syzkaller.appspotmail.com
Fixes: 9a98e7a80f95 ("vt: don't use kmalloc() for the unicode screen buffer")
Cc: 

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index e5ffed795e..48a8199f78 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -365,9 +365,14 @@ static struct uni_screen *vc_uniscr_alloc(unsigned int 
cols, unsigned int rows)
return uniscr;
 }
 
+static void vc_uniscr_free(struct uni_screen *uniscr)
+{
+   vfree(uniscr);
+}
+
 static void vc_uniscr_set(struct vc_data *vc, struct uni_screen *new_uniscr)
 {
-   vfree(vc->vc_uni_screen);
+   vc_uniscr_free(vc->vc_uni_screen);
vc->vc_uni_screen = new_uniscr;
 }
 
@@ -1230,7 +1235,7 @@ static int vc_do_resize(struct tty_struct *tty, struct 
vc_data *vc,
err = resize_screen(vc, new_cols, new_rows, user);
if (err) {
kfree(newscreen);
-   kfree(new_uniscr);
+   vc_uniscr_free(new_uniscr);
return err;
}
 


Re: kernel BUG at arch/x86/mm/physaddr.c:LINE! (5)

2020-05-02 Thread Nicolas Pitre
On Sat, 2 May 2020, Jiri Slaby wrote:

> > Call Trace:
> >  virt_to_head_page include/linux/mm.h:833 [inline]
> >  virt_to_cache mm/slab.h:474 [inline]
> >  kfree+0x60/0x220 mm/slab.c:3749
> >  vc_do_resize+0x738/0x1ce0 drivers/tty/vt/vt.c:1233
> Of course, s/kfree/vfree/ there. Nicolas, could you fix this?

Damn...  Will do.


Nicolas


Re: [PATCH v2 0/5] Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there

2020-04-29 Thread Nicolas Pitre
On Wed, 29 Apr 2020, Russell King - ARM Linux admin wrote:

> On Wed, Apr 29, 2020 at 11:49:49PM +0200, Jann Horn wrote:
> > At the moment, we have that rather ugly mmget_still_valid() helper to
> > work around <https://crbug.com/project-zero/1790>: ELF core dumping
> > doesn't take the mmap_sem while traversing the task's VMAs, and if
> > anything (like userfaultfd) then remotely messes with the VMA tree,
> > fireworks ensue. So at the moment we use mmget_still_valid() to bail
> > out in any writers that might be operating on a remote mm's VMAs.
> > 
> > With this series, I'm trying to get rid of the need for that as
> > cleanly as possible.
> > In particular, I want to avoid holding the mmap_sem across unbounded
> > sleeps.
> > 
> > 
> > Patches 1, 2 and 3 are relatively unrelated cleanups in the core
> > dumping code.
> > 
> > Patches 4 and 5 implement the main change: Instead of repeatedly
> > accessing the VMA list with sleeps in between, we snapshot it at the
> > start with proper locking, and then later we just use our copy of
> > the VMA list. This ensures that the kernel won't crash, that VMA
> > metadata in the coredump is consistent even in the presence of
> > concurrent modifications, and that any virtual addresses that aren't
> > being concurrently modified have their contents show up in the core
> > dump properly.
> > 
> > The disadvantage of this approach is that we need a bit more memory
> > during core dumping for storing metadata about all VMAs.
> > 
> > After this series has landed, we should be able to rip out
> > mmget_still_valid().
> > 
> > 
> > Testing done so far:
> > 
> >  - Creating a simple core dump on X86-64 still works.
> >  - The created coredump on X86-64 opens in GDB, and both the stack and the
> >executable look vaguely plausible.
> >  - 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.
> > 
> > I'm CCing some folks from the architectures that use FDPIC in case
> > anyone wants to give this a spin.
> 
> I've never had any reason to use FDPIC, and I don't have any binaries
> that would use it.  Nicolas Pitre added ARM support, so I guess he
> would be the one to talk to about it.  (Added Nicolas.)

It's been a while since I worked with it. However Christophe Lyon (in 
CC) added support for ARM FDPIC to mainline gcc recently, so hopefully 
he might still be set up and able to help.


Nicolas


Re: [PATCH v2 0/5] Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there

2020-04-29 Thread Nicolas Pitre
On Wed, 29 Apr 2020, Linus Torvalds wrote:

> While we're at it, is there anybody who knows binfmt_flat?

I'd say Greg Ungerer.

> It might be Nicolas too.

I only contributed the necessary changes to make it work on targets with 
an MMU. Once fdpic started to work I used that instead.

FWIW I couldn't find a toolchain that would produce FLAT binaries with 
dynamic libs on ARM so I only used static binaries.


Nicolas


Re: [PATCH v2 5/5] cpufreq: vexpress-spc: fix some coding style issues

2019-10-18 Thread Nicolas Pitre
On Fri, 18 Oct 2019, Sudeep Holla wrote:

> On Fri, Oct 18, 2019 at 11:25:17AM +0530, Viresh Kumar wrote:
> > On 17-10-19, 13:35, Sudeep Holla wrote:
> > > Fix the following checkpatch checks/warnings:
> > > 
> > > CHECK: Unnecessary parentheses around the code
> > > CHECK: Alignment should match open parenthesis
> > > CHECK: Prefer kernel type 'u32' over 'uint32_t'
> > > WARNING: Missing a blank line after declarations
> > > 
> > > Signed-off-by: Sudeep Holla 
> > > ---
> > >  drivers/cpufreq/vexpress-spc-cpufreq.c | 43 --
> > >  1 file changed, 20 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c 
> > > b/drivers/cpufreq/vexpress-spc-cpufreq.c
> > > index 81064430317f..8ecb2961be86 100644
> > > --- a/drivers/cpufreq/vexpress-spc-cpufreq.c
> > > +++ b/drivers/cpufreq/vexpress-spc-cpufreq.c
> > > @@ -79,8 +79,8 @@ static unsigned int find_cluster_maxfreq(int cluster)
> > >   for_each_online_cpu(j) {
> > >   cpu_freq = per_cpu(cpu_last_req_freq, j);
> > >  
> > > - if ((cluster == per_cpu(physical_cluster, j)) &&
> > > - (max_freq < cpu_freq))
> > > + if (cluster == per_cpu(physical_cluster, j) &&
> > > + max_freq < cpu_freq)
> > >   max_freq = cpu_freq;
> > >   }
> > >  
> > > @@ -188,22 +188,19 @@ static int ve_spc_cpufreq_set_target(struct 
> > > cpufreq_policy *policy,
> > >   freqs_new = freq_table[cur_cluster][index].frequency;
> > >  
> > >   if (is_bL_switching_enabled()) {
> > > - if ((actual_cluster == A15_CLUSTER) &&
> > > - (freqs_new < clk_big_min)) {
> > > + if (actual_cluster == A15_CLUSTER && freqs_new < clk_big_min)
> > >   new_cluster = A7_CLUSTER;
> > > - } else if ((actual_cluster == A7_CLUSTER) &&
> > > - (freqs_new > clk_little_max)) {
> > > + else if (actual_cluster == A7_CLUSTER &&
> > > +  freqs_new > clk_little_max)
> > >   new_cluster = A15_CLUSTER;
> > > - }
> > >   }
> > >  
> > >   ret = ve_spc_cpufreq_set_rate(cpu, actual_cluster, new_cluster,
> > > freqs_new);
> > >  
> > > - if (!ret) {
> > > + if (!ret)
> > 
> > That's not the standard way in Linux I believe. We do use {} even when
> > the body is single line but broken into two, like below.
> >
> 
> OK, wasn't aware of that. I will update. Generally I ignore checkpatch
> warnings, but the list was big and fixed a bunch of them :)

In cases like this one, the best is to go with whatever makes checkpatch 
happy.


Nicolas


Re: [PATCH v2 0/5] cpufreq: merge arm big.LITTLE and vexpress-spc drivers

2019-10-17 Thread Nicolas Pitre
On Thu, 17 Oct 2019, Sudeep Holla wrote:

> Hi,
> 
> Since vexpress-spc is the sole user of arm_big_little cpufreq driver,
> there's no point in keeping it separate anymore. I wanted to post these
> patches for ages but kept postponing for no reason.

For these patches:

Acked-by: Nicolas Pitre 


Nicolas


Re: [PATCH] tty:vt: Add check the return value of kzalloc to avoid oops

2019-09-22 Thread Nicolas Pitre
On Sat, 21 Sep 2019, Xiaoming Ni wrote:

> @ Nicolas Pitre
> Can I make a v2 patch based on your advice ?
> Or you will submit a patch for "GFP_WONTFAIL" yourself ?

Here's a patch implementing what I had in mind. This is compile tested 
only.

- >8

Subject: [PATCH] mm: add __GFP_WONTFAIL and GFP_ONBOOT

Some memory allocations are very unlikely to fail during system boot.
Because of that, the code often doesn't bother to check for allocation
failure, but this gets reported anyway.

As an alternative to adding code to check for NULL that has almost no
chance of ever being exercised, let's use a GFP flag to identify those
cases and panic the kernel if allocation failure ever occurs.

Conversion of one such instance is also included.

Signed-off-by: Nicolas Pitre 

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 34aa39d1ae..bd0a0e4807 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -3356,7 +3356,7 @@ static int __init con_init(void)
}
 
for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++) {
-   vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
GFP_NOWAIT);
+   vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
GFP_ONBOOT);
INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
tty_port_init(&vc->port);
visual_init(vc, currcons, 1);
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f33881688f..6f33575cd6 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -39,8 +39,9 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL    0x100000u
 #define ___GFP_THISNODE    0x200000u
 #define ___GFP_ACCOUNT     0x400000u
+#define ___GFP_WONTFAIL    0x800000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP   0x800000u
+#define ___GFP_NOLOCKDEP   0x1000000u
 #else
 #define ___GFP_NOLOCKDEP   0
 #endif
@@ -187,6 +188,12 @@ struct vm_area_struct;
  * definitely preferable to use the flag rather than opencode endless
  * loop around allocator.
  * Using this flag for costly allocations is _highly_ discouraged.
+ *
+ * %__GFP_WONTFAIL: No allocation error is expected whatsoever. The
+ * caller presumes allocation will always succeed (e.g. the system is still
+ * booting, the allocation size is relatively small and memory should be
+ * plentiful) so testing for failure is skipped. If allocation ever fails
+ * then the kernel will simply panic.
  */
 #define __GFP_IO   ((__force gfp_t)___GFP_IO)
 #define __GFP_FS   ((__force gfp_t)___GFP_FS)
@@ -196,6 +203,7 @@ struct vm_area_struct;
 #define __GFP_RETRY_MAYFAIL((__force gfp_t)___GFP_RETRY_MAYFAIL)
 #define __GFP_NOFAIL   ((__force gfp_t)___GFP_NOFAIL)
 #define __GFP_NORETRY  ((__force gfp_t)___GFP_NORETRY)
+#define __GFP_WONTFAIL ((__force gfp_t)___GFP_WONTFAIL)
 
 /**
  * DOC: Action modifiers
@@ -217,7 +225,7 @@ struct vm_area_struct;
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
@@ -285,6 +293,9 @@ struct vm_area_struct;
  * available and will not wake kswapd/kcompactd on failure. The _LIGHT
  * version does not attempt reclaim/compaction at all and is by default used
  * in page fault path, while the non-light is used by khugepaged.
+ *
+ * %GFP_ONBOOT is for relatively small allocations that are not expected
+ * to fail while the system is booting.
  */
 #define GFP_ATOMIC (__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
 #define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
@@ -300,6 +311,7 @@ struct vm_area_struct;
 #define GFP_TRANSHUGE_LIGHT((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE  (GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
+#define GFP_ONBOOT (GFP_NOWAIT | __GFP_WONTFAIL)
 
 /* Convert GFP flags to their corresponding migrate type */
 #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ff5484fdbd..36dee09f7f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4625,6 +4625,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int 
order,
 fail:
warn_alloc(gfp_mask, ac->nodemask,
"page allocation failure: order:%u", order);
+   if (gfp_mask & __GFP_WONTFAIL) {
+   /*
+* The assumption was wrong. This is never supposed to happen.
+* Caller most likely won't check for a returned NULL either.
+* So the only reasonable thing to do is to panic.
+*/
+   panic("Failed to allocate memory despite GFP_WONTFAIL\n");
+   }
 got_pg:
return page;
 }


Re: [PATCH] tty:vt: Add check the return value of kzalloc to avoid oops

2019-09-19 Thread Nicolas Pitre
On Thu, 19 Sep 2019, Greg KH wrote:

> On Thu, Sep 19, 2019 at 05:18:15PM +0800, Xiaoming Ni wrote:
> > Using kzalloc() to allocate memory in function con_init(), but not
> > checking the return value, there is a risk of null pointer references
> > oops.
> > 
> > Signed-off-by: Xiaoming Ni 
> 
> We keep having this be "reported" :(

Something probably needs to be "communicated" about that.

> > vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
> > GFP_NOWAIT);
> > +   if (unlikely(!vc)) {
> > +   pr_warn("%s:failed to allocate memory for the %u vc\n",
> > +   __func__, currcons);
> > +   break;
> > +   }
> 
> At init, this really can not happen.  Have you see it ever happen?

This is maybe too subtle a fact. The "communication" could be done with 
some GFP_WONTFAIL flag, and have the allocator simply panic() if it 
ever fails.


Nicolas


Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)

2019-09-09 Thread Nicolas Pitre
On Tue, 10 Sep 2019, Masahiro Yamada wrote:

> On Tue, Sep 10, 2019 at 1:06 AM Nicolas Pitre  wrote:
> >
> > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> >
> > > Hi Nicolas,
> > >
> > > On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre  wrote:
> > > >
> > > > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> > > >
> > > > > This line was touched by commit f235541699bc ("export.h: allow for
> > > > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > > > not explain why.
> > > > >
> > > > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> > > >
> > > > I'm pretty sure it was needed back then so not to interfere with users
> > > > of this file. My fault for not documenting it.
> > >
> > > Hmm, I did not see a problem in my quick build test.
> > >
> > > Do you remember which file was causing the problem?
> >
> > If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the
> > defined(__KERNEL__) test removed then you'll get:
> >
> >   HOSTCC  scripts/mod/modpost.o
> > In file included from scripts/mod/modpost.c:24:
> > scripts/mod/../../include/linux/export.h:81:10: fatal error: 
> > linux/kconfig.h: No such file or directory
> >
> >
> > Nicolas
> 
> 
> Thanks for explaining this.
> 
> It is not the case any more.
> 
> 
> I will reword the commit message as follows:
> 
> >8---
> export.h: remove defined(__KERNEL__), which is no longer needed
> 
> The conditional define(__KERNEL__) was added by commit f235541699bc
> ("export.h: allow for per-symbol configurable EXPORT_SYMBOL()").
> 
> It was needed at that time to avoid the build error of modpost
> with CONFIG_TRIM_UNUSED_KSYMS=y.
> 
> Since commit b2c5cdcfd4bc ("modpost: remove symbol prefix support"),
> modpost no longer includes linux/export.h, thus the define(__KERNEL__)
> is unneeded.
> >8---
> 

Acked-by: Nicolas Pitre 


Nicolas


Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)

2019-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2019, Masahiro Yamada wrote:

> Hi Nicolas,
> 
> On Mon, Sep 9, 2019 at 10:48 PM Nicolas Pitre  wrote:
> >
> > On Mon, 9 Sep 2019, Masahiro Yamada wrote:
> >
> > > This line was touched by commit f235541699bc ("export.h: allow for
> > > per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> > > not explain why.
> > >
> > > CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).
> >
> > I'm pretty sure it was needed back then so not to interfere with users
> > of this file. My fault for not documenting it.
> 
> Hmm, I did not see a problem in my quick build test.
> 
> Do you remember which file was causing the problem?

If you build commit 7ec925701f5f with CONFIG_TRIM_UNUSED_KSYMS=y and the 
defined(__KERNEL__) test removed then you'll get:

  HOSTCC  scripts/mod/modpost.o
In file included from scripts/mod/modpost.c:24:
scripts/mod/../../include/linux/export.h:81:10: fatal error: linux/kconfig.h: 
No such file or directory


Nicolas


Re: [PATCH 1/2] export.h: remove defined(__KERNEL__)

2019-09-09 Thread Nicolas Pitre
On Mon, 9 Sep 2019, Masahiro Yamada wrote:

> This line was touched by commit f235541699bc ("export.h: allow for
> per-symbol configurable EXPORT_SYMBOL()"), but the commit log did
> not explain why.
> 
> CONFIG_TRIM_UNUSED_KSYMS works for me without defined(__KERNEL__).

I'm pretty sure it was needed back then so not to interfere with users 
of this file. My fault for not documenting it.


Nicolas


Re: [PATCH] __div64_const32(): improve the generic C version

2019-08-29 Thread Nicolas Pitre


Ping.

On Tue, 20 Aug 2019, Nicolas Pitre wrote:

> Let's rework that code to avoid large immediate values and convert some
> 64-bit variables to 32-bit ones when possible. This allows gcc to
> produce smaller and better code. This even produces optimal code on
> RISC-V.
> 
> Signed-off-by: Nicolas Pitre 
> 
> diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
> index dc9726fdac..33358245b4 100644
> --- a/include/asm-generic/div64.h
> +++ b/include/asm-generic/div64.h
> @@ -178,7 +178,8 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, 
> uint64_t n, bool bias)
>   uint32_t m_hi = m >> 32;
>   uint32_t n_lo = n;
>   uint32_t n_hi = n >> 32;
> - uint64_t res, tmp;
> + uint64_t res;
> + uint32_t res_lo, res_hi, tmp;
>  
>   if (!bias) {
>   res = ((uint64_t)m_lo * n_lo) >> 32;
> @@ -187,8 +188,9 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, 
> uint64_t n, bool bias)
>   res = (m + (uint64_t)m_lo * n_lo) >> 32;
>   } else {
>   res = m + (uint64_t)m_lo * n_lo;
> - tmp = (res < m) ? (1ULL << 32) : 0;
> - res = (res >> 32) + tmp;
> + res_lo = res >> 32;
> + res_hi = (res_lo < m_hi);
> + res = res_lo | ((uint64_t)res_hi << 32);
>   }
>  
>   if (!(m & ((1ULL << 63) | (1ULL << 31 {
> @@ -197,10 +199,12 @@ static inline uint64_t __arch_xprod_64(const uint64_t 
> m, uint64_t n, bool bias)
>   res += (uint64_t)m_hi * n_lo;
>   res >>= 32;
>   } else {
> - tmp = res += (uint64_t)m_lo * n_hi;
> + res += (uint64_t)m_lo * n_hi;
> + tmp = res >> 32;
>   res += (uint64_t)m_hi * n_lo;
> - tmp = (res < tmp) ? (1ULL << 32) : 0;
> - res = (res >> 32) + tmp;
> + res_lo = res >> 32;
> + res_hi = (res_lo < tmp);
> + res = res_lo | ((uint64_t)res_hi << 32);
>   }
>  
>   res += (uint64_t)m_hi * n_hi;
> 


[PATCH] __div64_const32(): improve the generic C version

2019-08-20 Thread Nicolas Pitre
Let's rework that code to avoid large immediate values and convert some
64-bit variables to 32-bit ones when possible. This allows gcc to
produce smaller and better code. This even produces optimal code on
RISC-V.

Signed-off-by: Nicolas Pitre 

diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index dc9726fdac..33358245b4 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -178,7 +178,8 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, 
uint64_t n, bool bias)
uint32_t m_hi = m >> 32;
uint32_t n_lo = n;
uint32_t n_hi = n >> 32;
-   uint64_t res, tmp;
+   uint64_t res;
+   uint32_t res_lo, res_hi, tmp;
 
if (!bias) {
res = ((uint64_t)m_lo * n_lo) >> 32;
@@ -187,8 +188,9 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, 
uint64_t n, bool bias)
res = (m + (uint64_t)m_lo * n_lo) >> 32;
} else {
res = m + (uint64_t)m_lo * n_lo;
-   tmp = (res < m) ? (1ULL << 32) : 0;
-   res = (res >> 32) + tmp;
+   res_lo = res >> 32;
+   res_hi = (res_lo < m_hi);
+   res = res_lo | ((uint64_t)res_hi << 32);
}
 
if (!(m & ((1ULL << 63) | (1ULL << 31 {
@@ -197,10 +199,12 @@ static inline uint64_t __arch_xprod_64(const uint64_t m, 
uint64_t n, bool bias)
res += (uint64_t)m_hi * n_lo;
res >>= 32;
} else {
-   tmp = res += (uint64_t)m_lo * n_hi;
+   res += (uint64_t)m_lo * n_hi;
+   tmp = res >> 32;
res += (uint64_t)m_hi * n_lo;
-   tmp = (res < tmp) ? (1ULL << 32) : 0;
-   res = (res >> 32) + tmp;
+   res_lo = res >> 32;
+   res_hi = (res_lo < tmp);
+   res = res_lo | ((uint64_t)res_hi << 32);
}
 
res += (uint64_t)m_hi * n_hi;


Re: [PATCH v2 RESEND] fs: cramfs_fs.h: Fix shifting signed 32-bit value by 31 bits problem

2019-06-21 Thread Nicolas Pitre
On Tue, 18 Jun 2019, Puranjay Mohan wrote:

> Fix CRAMFS_BLK_FLAG_UNCOMPRESSED to use "U" cast to avoid shifting signed
> 32-bit value by 31 bits problem. This isn't a problem for kernel builds
> with gcc.
> 
> This could be problem since this header is part of public API which
> could be included for builds using compilers that don't handle this
> condition safely resulting in undefined behavior.
> 
> Signed-off-by: Puranjay Mohan 
> Reviewed-by: Christoph Hellwig 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Shuah Khan 

Acked-by: Nicolas Pitre 
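
For what it's worth, the undefined behaviour being avoided here is easy to
show in isolation (generic example, assuming a 32-bit int):

    #define BAD_FLAG   (1 << 31)   /* undefined: 1 is a signed int and the
                                      result does not fit in an int */
    #define GOOD_FLAG  (1U << 31)  /* well defined: unsigned arithmetic */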



> ---
> V2 - use the unsigned constants for all flags, not just one
> RESEND - Added Nicolas Pitre to CC list, added reviewed by tags.
> 
>  include/uapi/linux/cramfs_fs.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/cramfs_fs.h b/include/uapi/linux/cramfs_fs.h
> index 6713669aa2ed..71cb602d4198 100644
> --- a/include/uapi/linux/cramfs_fs.h
> +++ b/include/uapi/linux/cramfs_fs.h
> @@ -98,8 +98,8 @@ struct cramfs_super {
>   *
>   * That leaves room for 3 flag bits in the block pointer table.
>   */
> -#define CRAMFS_BLK_FLAG_UNCOMPRESSED (1 << 31)
> -#define CRAMFS_BLK_FLAG_DIRECT_PTR   (1 << 30)
> +#define CRAMFS_BLK_FLAG_UNCOMPRESSED (1U << 31)
> +#define CRAMFS_BLK_FLAG_DIRECT_PTR   (1U << 30)
>  
>  #define CRAMFS_BLK_FLAGS ( CRAMFS_BLK_FLAG_UNCOMPRESSED \
>   | CRAMFS_BLK_FLAG_DIRECT_PTR )
> -- 
> 2.21.0
> 
> 


Re: [PATCH v4] vt: fix a missing-check bug in con_init()

2019-06-12 Thread Nicolas Pitre
On Wed, 12 Jun 2019, Gen Zhang wrote:

> On Wed, Jun 12, 2019 at 03:38:38PM +0200, Greg KH wrote:
> > On Wed, Jun 12, 2019 at 09:15:06PM +0800, Gen Zhang wrote:
> > > In function con_init(), the pointer variable vc_cons[currcons].d, vc and 
> > > vc->vc_screenbuf is allocated by kzalloc(). However, kzalloc() returns 
> > > NULL when fails. Therefore, we should check the return value and handle 
> > > the error.
> > > 
> > > Signed-off-by: Gen Zhang 
> > > ---
> > 
> > What changed from v1, v2, and v3?
> Thanks for your timely response. I am not a native English speaker, so
> I am not sure I understand this correctly. Does this mean that I should
> use "v5", rather than "v4"? 

Given that this is your v4 patch, you should list what has changed since 
earlier versions. Greg is asking you to produce a v5 with that 
information added. Obviously, the change from v4 to v5 would be 
something like "added version change information".

> > That always goes below the --- line.
> And I can't see what goes wrong with "---". Could you please make some
> explanation?

The version change information should be put below the --- line in your 
post, not above it. This way it won't be captured in commit log.
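
For instance, a hypothetical v5 posting would be laid out roughly like this,
so that git am drops the change log together with the rest of the --- section:

    Signed-off-by: ...
    ---
    v5: added version change information
    v4: renamed the error labels

     drivers/tty/vt/vt.c | 14 ++++++++++++++
     1 file changed, 14 insertions(+)

    diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
    ...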


Nicolas


Re: [PATCH v3] vt: Fix a missing-check bug in con_init()

2019-06-10 Thread Nicolas Pitre
On Sun, 9 Jun 2019, Gen Zhang wrote:

> On Sat, Jun 08, 2019 at 08:15:46PM -0400, Nicolas Pitre wrote:
> > On Sat, 8 Jun 2019, Greg KH wrote:
> > 
> > > And please provide "real" names for the
> > > labels, "fail1" and "fail2" do not tell anything here.
> > 
> > That I agree with.
> > 
> > 
> > Nicolas
> Thanks for your comments. Then am I supposed to revise the patch and
> send a v4 version?

Yes.


Nicolas


Re: [PATCH v3] vt: Fix a missing-check bug in con_init()

2019-06-08 Thread Nicolas Pitre
On Sat, 8 Jun 2019, Greg KH wrote:

> On Sun, Jun 09, 2019 at 12:01:38AM +0800, Gen Zhang wrote:
> > On Tue, May 28, 2019 at 08:45:29AM +0800, Gen Zhang wrote:
> > > In function con_init(), the pointer variable vc_cons[currcons].d, vc and
> > > vc->vc_screenbuf is allocated by kzalloc(). And they are used in the 
> > > following codes. However, kzalloc() returns NULL when fails, and null 
> > > pointer dereference may happen. And it will cause the kernel to crash. 
> > > Therefore, we should check the return value and handle the error.
> > > 
> > > Further, since the allocation is in a loop, we should free all the 
> > > allocated memory in a loop.
> > > 
> > > Signed-off-by: Gen Zhang 
> > > Reviewed-by: Nicolas Pitre 
> > > ---
> > > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> > > index fdd12f8..d50f68f 100644
> > > --- a/drivers/tty/vt/vt.c
> > > +++ b/drivers/tty/vt/vt.c
> > > @@ -3350,10 +3350,14 @@ static int __init con_init(void)
> > >  
> > >   for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++) {
> > >   vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
> > > GFP_NOWAIT);
> > > + if (!vc)
> > > + goto fail1;
> > >   INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
> > >   tty_port_init(&vc->port);
> > >   visual_init(vc, currcons, 1);
> > >   vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
> > > + if (!vc->vc_screenbuf)
> > > + goto fail2;
> > >   vc_init(vc, vc->vc_rows, vc->vc_cols,
> > >   currcons || !vc->vc_sw->con_save_screen);
> > >   }
> > > @@ -3375,6 +3379,16 @@ static int __init con_init(void)
> > >   register_console(&vt_console_driver);
> > >  #endif
> > >   return 0;
> > > +fail1:
> > > + while (currcons > 0) {
> > > + currcons--;
> > > + kfree(vc_cons[currcons].d->vc_screenbuf);
> > > +fail2:
> > > + kfree(vc_cons[currcons].d);
> > > + vc_cons[currcons].d = NULL;
> > > + }
> 
> Wait, will that even work?  You can jump into the middle of a while
> loop?

Absolutely.

> Ugh, that's beyond ugly.

That was me who suggested to do it like that. To me, this is nicer than 
the proposed alternatives. For an error path that is rather unlikely to 
happen, I think this is a very concise and elegant way to do it.
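
Stripped of the vt specifics, the construct boils down to this (hypothetical 
alloc_a()/alloc_b() helpers): the goto merely enters the loop body, and the 
while condition is only evaluated on the next pass:

    int init_all(void *a[], void *b[], int n)
    {
            int i;

            for (i = 0; i < n; i++) {
                    a[i] = alloc_a();
                    if (!a[i])
                            goto undo_previous;  /* nothing from this iteration */
                    b[i] = alloc_b();
                    if (!b[i])
                            goto undo_a;         /* a[i] must be undone too */
            }
            return 0;

    undo_previous:
            while (i > 0) {
                    i--;
                    free_b(b[i]);
    undo_a:
                    free_a(a[i]);
                    a[i] = NULL;
            }
            return -ENOMEM;
    }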

> And please provide "real" names for the
> labels, "fail1" and "fail2" do not tell anything here.

That I agree with.


Nicolas


Re: [PATCH] binfmt_flat: make load_flat_shared_library() work

2019-05-27 Thread Nicolas Pitre
On Mon, 27 May 2019, Jann Horn wrote:

> On Sat, May 25, 2019 at 11:43 PM Andrew Morton
>  wrote:
> > On Fri, 24 May 2019 22:18:17 +0200 Jann Horn  wrote:
> > > load_flat_shared_library() is broken: It only calls load_flat_file() if
> > > prepare_binprm() returns zero, but prepare_binprm() returns the number of
> > > bytes read - so this only happens if the file is empty.
> >
> > ouch.
> >
> > > Instead, call into load_flat_file() if the number of bytes read is
> > > non-negative. (Even if the number of bytes is zero - in that case,
> > > load_flat_file() will see nullbytes and return a nice -ENOEXEC.)
> > >
> > > In addition, remove the code related to bprm creds and stop using
> > > prepare_binprm() - this code is loading a library, not a main executable,
> > > and it only actually uses the members "buf", "file" and "filename" of the
> > > linux_binprm struct. Instead, call kernel_read() directly.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Fixes: 287980e49ffc ("remove lots of IS_ERR_VALUE abuses")
> > > Signed-off-by: Jann Horn 
> > > ---
> > > I only found the bug by looking at the code, I have not verified its
> > > existence at runtime.
> > > Also, this patch is compile-tested only.
> > > It would be nice if someone who works with nommu Linux could have a
> > > look at this patch.
> >
> > 287980e49ffc was three years ago!  Has it really been broken for all
> > that time?  If so, it seems a good source of freed disk space...
> 
> Maybe... but I didn't want to rip it out without having one of the
> maintainers confirm that this really isn't likely to be used anymore.

Last time I played with this, I couldn't figure out the toolchain to 
produce shared libs. Only static executables worked fine. If I recall, 
existing binfmt_flat distros don't use shared libs either.

For shared lib support on no-MMU target, binfmt_elf_fdpic is a much 
saner choice.


Nicolas


Re: [PATCH v3] vt: Fix a missing-check bug in drivers/tty/vt/vt.c

2019-05-22 Thread Nicolas Pitre
On Wed, 22 May 2019, Gen Zhang wrote:

> In function con_init(), the pointer variable vc_cons[currcons].d, vc and
> vc->vc_screenbuf is allocated a memory space via kzalloc(). And they are
> used in the following codes.
> However, when there is a memory allocation error, kzalloc() can fail.
> Thus null pointer (vc_cons[currcons].d, vc and vc->vc_screenbuf)
> dereference may happen. And it will cause the kernel to crash. Therefore,
> we should check return value and handle the error.
> Further, since the allocation is in a loop, we should free all the 
> allocated memory in a loop.
> 
> Signed-off-by: Gen Zhang 

Reviewed-by: Nicolas Pitre 


> 
> ---
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index fdd12f8..d50f68f 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -3350,10 +3350,14 @@ static int __init con_init(void)
>  
>   for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++) {
>   vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
> GFP_NOWAIT);
> + if (!vc)
> + goto fail1;
>   INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
>   tty_port_init(&vc->port);
>   visual_init(vc, currcons, 1);
>   vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
> + if (!vc->vc_screenbuf)
> + goto fail2;
>   vc_init(vc, vc->vc_rows, vc->vc_cols,
>   currcons || !vc->vc_sw->con_save_screen);
>   }
> @@ -3375,6 +3379,16 @@ static int __init con_init(void)
>   register_console(&vt_console_driver);
>  #endif
>   return 0;
> +fail1:
> + while (currcons > 0) {
> + currcons--;
> + kfree(vc_cons[currcons].d->vc_screenbuf);
> +fail2:
> + kfree(vc_cons[currcons].d);
> + vc_cons[currcons].d = NULL;
> + }
> + console_unlock();
> + return -ENOMEM;
>  }
>  console_initcall(con_init);
>  
> ---
> 


Re: [PATCH v2] vt: Fix a missing-check bug in drivers/tty/vt/vt.c

2019-05-21 Thread Nicolas Pitre
On Tue, 21 May 2019, Gen Zhang wrote:

> On Tue, May 21, 2019 at 12:30:38AM -0400, Nicolas Pitre wrote:
> > Now imagine that MIN_NR_CONSOLES is defined to 10 instead of 1.
> > 
> > What happens with allocated memory if the err_vc condition is met on the 
> > 5th loop?
> Yes, vc->vc_screenbuf from the last loop is still not freed, right? I
> don't have idea to solve this one. Could please give some advice? Since
> we have to consider the err_vc condition.
> 
> > If err_vc_screenbuf condition is encountered on the 5th loop (curcons = 
> > 4), what is the value of vc_cons[4].d? Isn't it the same as vc that you 
> > just freed?
> > 
> > 
> > Nicolas
> Thanks for your explanation! You mean a double free situation may 
> happen, right? But in vc_allocate() there is also such a kfree(vc) and 
> vc_cons[currcons].d = NULL operation. This situation is really confusing
> me.

What you could do is something that looks like:

for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++) {
vc_cons[currcons].d = vc = kzalloc(...);
if (!vc)
goto fail1;
...
vc->vc_screenbuf = kzalloc(...);
if (!vc->vc_screenbuf)
goto fail2;
...

return 0;

/* free already allocated memory on error */
fail1:
while (currcons > 0) {
currcons--;
kfree(vc_cons[currcons].d->vc_screenbuf);
fail2:
kfree(vc_cons[currcons].d);
vc_cons[currcons].d = NULL;
}
console_unlock();
return -ENOMEM;


Nicolas


Re: [PATCH v2] vt: Fix a missing-check bug in drivers/tty/vt/vt.c

2019-05-20 Thread Nicolas Pitre
On Tue, 21 May 2019, Gen Zhang wrote:

> On Mon, May 20, 2019 at 11:26:20PM -0400, Nicolas Pitre wrote:
> > On Tue, 21 May 2019, Gen Zhang wrote:
> > 
> > > On Mon, May 20, 2019 at 10:55:40PM -0400, Nicolas Pitre wrote:
> > > > On Tue, 21 May 2019, Gen Zhang wrote:
> > > > 
> > > > > In function con_init(), the pointer variable vc_cons[currcons].d, vc 
> > > > > and
> > > > > vc->vc_screenbuf is allocated a memory space via kzalloc(). And they 
> > > > > are
> > > > > used in the following codes.
> > > > > However, when there is a memory allocation error, kzalloc() can fail.
> > > > > Thus null pointer (vc_cons[currcons].d, vc and vc->vc_screenbuf)
> > > > > dereference may happen. And it will cause the kernel to crash. 
> > > > > Therefore,
> > > > > we should check return value and handle the error.
> > > > > Further, the loop condition MIN_NR_CONSOLES is defined as 1 in
> > > > > include/uapi/linux/vt.h. So there is no need to unwind the loop.
> > > > 
> > > > But what if someone changes that define? It won't be obvious that some 
> > > > code did rely on it to be defined to 1.
> > > I re-examine the source code. MIN_NR_CONSOLES is only defined once and
> > > no other changes to it.
> > 
> > Yes, that is true today.  But if someone changes that in the future, how 
> > will that person know that you relied on it to be 1 for not needing to 
> > unwind the loop?
> > 
> > 
> > Nicolas
> Hi Nicolas,
> Thanks for your explanation! And I got your point. And is this way 
> proper?

Not quite.

> err_vc_screenbuf:
> kfree(vc);
>   for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++)
>   vc_cons[currcons].d = NULL;
>   return -ENOMEM;
> err_vc:
>   console_unlock();
>   return -ENOMEM;

Now imagine that MIN_NR_CONSOLES is defined to 10 instead of 1.

What happens with allocated memory if the err_vc condition is met on the 
5th loop?

If err_vc_screenbuf condition is encountered on the 5th loop (curcons = 
4), what is the value of vc_cons[4].d? Isn't it the same as vc that you 
just freed?


Nicolas


Re: [PATCH v2] vt: Fix a missing-check bug in drivers/tty/vt/vt.c

2019-05-20 Thread Nicolas Pitre
On Tue, 21 May 2019, Gen Zhang wrote:

> On Mon, May 20, 2019 at 10:55:40PM -0400, Nicolas Pitre wrote:
> > On Tue, 21 May 2019, Gen Zhang wrote:
> > 
> > > In function con_init(), the pointer variable vc_cons[currcons].d, vc and
> > > vc->vc_screenbuf is allocated a memory space via kzalloc(). And they are
> > > used in the following codes.
> > > However, when there is a memory allocation error, kzalloc() can fail.
> > > Thus null pointer (vc_cons[currcons].d, vc and vc->vc_screenbuf)
> > > dereference may happen. And it will cause the kernel to crash. Therefore,
> > > we should check return value and handle the error.
> > > Further, the loop condition MIN_NR_CONSOLES is defined as 1 in
> > > include/uapi/linux/vt.h. So there is no need to unwind the loop.
> > 
> > But what if someone changes that define? It won't be obvious that some 
> > code did rely on it to be defined to 1.
> I re-examine the source code. MIN_NR_CONSOLES is only defined once and
> no other changes to it.

Yes, that is true today.  But if someone changes that in the future, how 
will that person know that you relied on it to be 1 for not needing to 
unwind the loop?


Nicolas


Re: [PATCH v2] vt: Fix a missing-check bug in drivers/tty/vt/vt.c

2019-05-20 Thread Nicolas Pitre
On Tue, 21 May 2019, Gen Zhang wrote:

> In function con_init(), the pointer variable vc_cons[currcons].d, vc and
> vc->vc_screenbuf is allocated a memory space via kzalloc(). And they are
> used in the following codes.
> However, when there is a memory allocation error, kzalloc() can fail.
> Thus null pointer (vc_cons[currcons].d, vc and vc->vc_screenbuf)
> dereference may happen. And it will cause the kernel to crash. Therefore,
> we should check return value and handle the error.
> Further, the loop condition MIN_NR_CONSOLES is defined as 1 in
> include/uapi/linux/vt.h. So there is no need to unwind the loop.

But what if someone changes that define? It won't be obvious that some 
code did rely on it to be defined to 1.

> Signed-off-by: Gen Zhang 
> 
> ---
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index fdd12f8..b756609 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -3350,10 +3350,14 @@ static int __init con_init(void)
>  
>   for (currcons = 0; currcons < MIN_NR_CONSOLES; currcons++) {
>   vc_cons[currcons].d = vc = kzalloc(sizeof(struct vc_data), 
> GFP_NOWAIT);
> + if (!vc_cons[currcons].d || !vc)

Both vc_cons[currcons].d and vc are assigned the same value on the 
previous line. You don't have to test them both.

> + goto err_vc;
>   INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
>   tty_port_init(&vc->port);
>   visual_init(vc, currcons, 1);
>   vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
> + if (!vc->vc_screenbuf)
> + goto err_vc_screenbuf;
>   vc_init(vc, vc->vc_rows, vc->vc_cols,
>   currcons || !vc->vc_sw->con_save_screen);
>   }
> @@ -3375,6 +3379,14 @@ static int __init con_init(void)
>   register_console(&vt_console_driver);
>  #endif
>   return 0;
> +err_vc:
> + console_unlock();
> + return -ENOMEM;
> +err_vc_screenbuf:
> + console_unlock();
> + kfree(vc);
> + vc_cons[currcons].d = NULL;
> + return -ENOMEM;

As soon as you release the lock, another thread could come along and 
start using the memory pointed by vc_cons[currcons].d you're about to 
free here. This is unlikely for an initcall, but still.

You should consider this ordering instead:

err_vc_screenbuf:
kfree(vc);
vc_cons[currcons].d = NULL;
err_vc:
console_unlock();
return -ENOMEM;


>  }
>  console_initcall(con_init);
>  
>  ---
> 


Re: [PATCH] update my email address

2019-04-03 Thread Nicolas Pitre
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 2 Apr 2019, Linus Torvalds wrote:

> On Tue, Apr 2, 2019 at 7:18 AM Nicolas Pitre  wrote:
> >
> > The @linaro version won't be valid much longer.
> 
> Generally I like seeing these come in from the old address just to verify.
> 
> But at least the old address is cc'd..

Fair enough. Here's a confirmation.

Nicolas

-BEGIN PGP SIGNATURE-

iQEcBAEBAgAGBQJcpLvrAAoJEH9KYoIL9GO3NlwH/iSnLw3VgKBJBTCbiOmgZTMu
+X5Pebwe9TlVJxqnFBMnszJT4GJTrg5oA4m1hAh8vxoQnwQ/HIYJs1ioAlGLl9wi
KQ2AsMiokO5LXLz2L0YjD5/5YY5G5c6o8qP9cVt6e3ep6jOJej0QSY1rc0FYu2sf
QOTal+HswoaEb8Nu/nir3aNaKJhBK7rf+t7KpvbfHXjH72C5VHe9ECCgsPOSgm3Z
d/jalyarPk237xDQ0la/hiXunnen/3yQrZ12ILQs5g426CAhykK5L9gIPoSnFT90
uIP1Wpe0pCy3v3UoF9etZmQD3uavLm5lEC0RzSESSaUAprV2pK/3uMebyQnrN3M=
=xJL1
-END PGP SIGNATURE-


[PATCH] update my email address

2019-04-02 Thread Nicolas Pitre
The @linaro version won't be valid much longer.

Signed-off-by: Nicolas Pitre 

diff --git a/.mailmap b/.mailmap
index b2cde8668d..ae2bcad06f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -156,6 +156,8 @@ Morten Welinder 
 Morten Welinder 
 Mythri P K 
 Nguyen Anh Quynh 
+Nicolas Pitre  
+Nicolas Pitre  
 Paolo 'Blaisorblade' Giarrusso 
 Patrick Mochel 
 Paul Burton  
diff --git a/MAINTAINERS b/MAINTAINERS
index 43b36dbed4..391405091c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4129,7 +4129,7 @@ F:drivers/cpuidle/*
 F: include/linux/cpuidle.h
 
 CRAMFS FILESYSTEM
-M: Nicolas Pitre 
+M: Nicolas Pitre 
 S: Maintained
 F: Documentation/filesystems/cramfs.txt
 F: fs/cramfs/


Re: [RFC PATCH 48/68] vfs: Convert cramfs to use the new mount API

2019-04-01 Thread Nicolas Pitre
On Wed, 27 Mar 2019, David Howells wrote:

> Convert the cramfs filesystem to the new internal mount API as the old
> one will be obsoleted and removed.  This allows greater flexibility in
> communication of mount parameters between userspace, the VFS and the
> filesystem.
> 
> See Documentation/filesystems/mount_api.txt for more information.
> 
> Signed-off-by: David Howells 
> cc: Nicolas Pitre 
> cc: linux-...@lists.infradead.org
> cc: linux-bl...@vger.kernel.org

Tested-by: Nicolas Pitre 
Acked-by: Nicolas Pitre 




> ---
> 
>  fs/cramfs/inode.c |   69 
> ++---
>  1 file changed, 39 insertions(+), 30 deletions(-)
> 
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 9352487bd0fc..2ee89a353d64 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -506,18 +507,19 @@ static void cramfs_kill_sb(struct super_block *sb)
>   kfree(sbi);
>  }
>  
> -static int cramfs_remount(struct super_block *sb, int *flags, char *data)
> +static int cramfs_reconfigure(struct fs_context *fc)
>  {
> - sync_filesystem(sb);
> - *flags |= SB_RDONLY;
> + sync_filesystem(fc->root->d_sb);
> + fc->sb_flags |= SB_RDONLY;
>   return 0;
>  }
>  
> -static int cramfs_read_super(struct super_block *sb,
> -  struct cramfs_super *super, int silent)
> +static int cramfs_read_super(struct super_block *sb, struct fs_context *fc,
> +  struct cramfs_super *super)
>  {
>   struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
>   unsigned long root_offset;
> + bool silent = fc->sb_flags & SB_SILENT;
>  
>   /* We don't know the real size yet */
>   sbi->size = PAGE_SIZE;
> @@ -532,7 +534,7 @@ static int cramfs_read_super(struct super_block *sb,
>   /* check for wrong endianness */
>   if (super->magic == CRAMFS_MAGIC_WEND) {
>   if (!silent)
> - pr_err("wrong endianness\n");
> + errorf(fc, "cramfs: wrong endianness");
>   return -EINVAL;
>   }
>  
> @@ -544,22 +546,22 @@ static int cramfs_read_super(struct super_block *sb,
>   mutex_unlock(&read_mutex);
>   if (super->magic != CRAMFS_MAGIC) {
>   if (super->magic == CRAMFS_MAGIC_WEND && !silent)
> - pr_err("wrong endianness\n");
> + errorf(fc, "cramfs: wrong endianness");
>   else if (!silent)
> - pr_err("wrong magic\n");
> + errorf(fc, "cramfs: wrong magic");
>   return -EINVAL;
>   }
>   }
>  
>   /* get feature flags first */
>   if (super->flags & ~CRAMFS_SUPPORTED_FLAGS) {
> - pr_err("unsupported filesystem features\n");
> + errorf(fc, "cramfs: unsupported filesystem features");
>   return -EINVAL;
>   }
>  
>   /* Check that the root inode is in a sane state */
>   if (!S_ISDIR(super->root.mode)) {
> - pr_err("root is not a directory\n");
> + errorf(fc, "cramfs: root is not a directory");
>   return -EINVAL;
>   }
>   /* correct strange, hard-coded permissions of mkcramfs */
> @@ -578,12 +580,12 @@ static int cramfs_read_super(struct super_block *sb,
>   sbi->magic = super->magic;
>   sbi->flags = super->flags;
>   if (root_offset == 0)
> - pr_info("empty filesystem");
> + infof(fc, "cramfs: empty filesystem");
>   else if (!(super->flags & CRAMFS_FLAG_SHIFTED_ROOT_OFFSET) &&
>((root_offset != sizeof(struct cramfs_super)) &&
> (root_offset != 512 + sizeof(struct cramfs_super
>   {
> - pr_err("bad root offset %lu\n", root_offset);
> + errorf(fc, "cramfs: bad root offset %lu", root_offset);
>   return -EINVAL;
>   }
>  
> @@ -607,8 +609,7 @@ static int cramfs_finalize_super(struct super_block *sb,
>   return 0;
>  }
>  
> -static int cramfs_blkdev_fill_super(struct super_block *sb, void *data,
> - int silent)
> +static int cramfs_blkdev_fill_super(struct super_block *sb, struct 
> fs_context *fc

Re: [PATCH 3/3] ARM: mvebu: prefix coprocessor operand with p

2019-03-23 Thread nicolas . pitre
On Sat, 23 Mar 2019, Stefan Agner wrote:

> In every other instance where mrc is used the coprocessor operand
> is prefixed with p (e.g. p15). Use the p prefix in this case too.
> This fixes a build issue when using LLVM's integrated assembler:
>   arch/arm/mach-mvebu/coherency_ll.S:69:6: error: invalid operand for 
> instruction
>mrc 15, 0, r3, cr0, cr0, 5
>^
>   arch/arm/mach-mvebu/pmsu_ll.S:19:6: error: invalid operand for instruction
>mrc 15, 0, r0, cr0, cr0, 5 @ get the CPU ID
>^
> 
> Signed-off-by: Stefan Agner 

Acked-by: Nicolas Pitre 


> ---
>  arch/arm/mach-mvebu/coherency_ll.S | 2 +-
>  arch/arm/mach-mvebu/pmsu_ll.S  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/mach-mvebu/coherency_ll.S 
> b/arch/arm/mach-mvebu/coherency_ll.S
> index 8b2fbc8b6bc6..2d962fe48821 100644
> --- a/arch/arm/mach-mvebu/coherency_ll.S
> +++ b/arch/arm/mach-mvebu/coherency_ll.S
> @@ -66,7 +66,7 @@ ENDPROC(ll_get_coherency_base)
>   * fabric registers
>   */
>  ENTRY(ll_get_coherency_cpumask)
> - mrc 15, 0, r3, cr0, cr0, 5
> + mrc p15, 0, r3, cr0, cr0, 5
>   and r3, r3, #15
>   mov r2, #(1 << 24)
>   lsl r3, r2, r3
> diff --git a/arch/arm/mach-mvebu/pmsu_ll.S b/arch/arm/mach-mvebu/pmsu_ll.S
> index c1fb713e9306..7aae9a25cfeb 100644
> --- a/arch/arm/mach-mvebu/pmsu_ll.S
> +++ b/arch/arm/mach-mvebu/pmsu_ll.S
> @@ -16,7 +16,7 @@
>  ENTRY(armada_38x_scu_power_up)
>   mrc p15, 4, r1, c15, c0 @ get SCU base address
>   orr r1, r1, #0x8@ SCU CPU Power Status Register
> - mrc 15, 0, r0, cr0, cr0, 5  @ get the CPU ID
> + mrc p15, 0, r0, cr0, cr0, 5 @ get the CPU ID
>   and r0, r0, #15
>   add r1, r1, r0
>   mov r0, #0x0
> -- 
> 2.21.0
> 
> 


Re: [PATCH 2/3] ARM: mvebu: drop unnecessary label

2019-03-23 Thread nicolas . pitre
On Sat, 23 Mar 2019, Stefan Agner wrote:

> The label mvebu_boot_wa_start is not necessary and causes a build
> issue when building with LLVM's integrated assembler:
> AS  arch/arm/mach-mvebu/pmsu_ll.o
>   arch/arm/mach-mvebu/pmsu_ll.S:59:1: error: invalid symbol redefinition
>   mvebu_boot_wa_start:
>   ^
> 
> Drop the label.
> 
> Signed-off-by: Stefan Agner 

Acked-by: Nicolas Pitre 

> ---
>  arch/arm/mach-mvebu/pmsu_ll.S | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/mach-mvebu/pmsu_ll.S b/arch/arm/mach-mvebu/pmsu_ll.S
> index 88651221dbdd..c1fb713e9306 100644
> --- a/arch/arm/mach-mvebu/pmsu_ll.S
> +++ b/arch/arm/mach-mvebu/pmsu_ll.S
> @@ -56,7 +56,6 @@ ENDPROC(armada_38x_cpu_resume)
>  
>  /* The following code will be executed from SRAM */
>  ENTRY(mvebu_boot_wa_start)
> -mvebu_boot_wa_start:
>  ARM_BE8(setend   be)
>   adr r0, 1f
>   ldr r0, [r0]@ load the address of the
> -- 
> 2.21.0
> 
> 


Re: [RFC 1/1] tty: vt.c: Fix TIOCL_BLANKSCREEN VT console blanking if blankinterval == 0

2019-02-25 Thread Nicolas Pitre
On Tue, 26 Feb 2019, Yifeng Li wrote:

> In vt.c, "blank_state" will be initialized to "blank_normal_wait" in
> con_init() if AND ONLY IF ("blankinterval" > 0). If "blankinterval" is 0,
> "blank_state" will be "blank_off" (== 0), and a call to do_blank_screen()
> will always abort. Even if a forced blanking is required from the user by
> calling TIOCL_BLANKSCREEN, the console won't be blanked.
> 
> In this commit,  we introduce a 4th "blank_state" - "blank_normal_notimer",
> it indicates the console can be blanked, but not automatically by a timer.
> Then, we made a change to "con_init()" - if (blankinterval == 0),
> "blank_state" will be set to "blank_normal_notimer" (we have similar
> changes to "do_unblank_screen()" and "poke_blanked_console()"). Finally,
> we change do_blank_screen() - now it will return if "blank_state" is
> neither "blank_normal_wait" nor "blank_normal_notimer", thus allowing
> the console to be blanked even if there's no timed autoblanking.
> 
> Signed-off-by: Yifeng Li 

Wouldn't this problem be fixed simply by the following:

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 9646ff63e7..41b17c4b5a 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -4151,8 +4151,6 @@ void do_blank_screen(int entering_gfx)
return;
}
 
-   if (blank_state != blank_normal_wait)
-   return;
blank_state = blank_off;
 
/* don't blank graphics */

I just can't find a logical reason for that conditional return.
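
For reference, the forced blanking being discussed here is what user space
requests through the TIOCLINUX ioctl. A minimal sketch (illustrative only:
the VT path is just an example, error handling is omitted, and the ioctl
may need to be issued on one's own VT or with sufficient privileges):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/tiocl.h>

int main(void)
{
        /* the TIOCLINUX subfunction code goes in the first byte of the argument */
        char arg = TIOCL_BLANKSCREEN;
        int fd = open("/dev/tty1", O_RDWR);

        if (fd < 0)
                return 1;
        return ioctl(fd, TIOCLINUX, &arg) ? 1 : 0;
}

Without that conditional return, such a request would blank the console
even when blankinterval is 0, which is the behavior being asked for.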


Nicolas


Re: [PATCH] ARM: pm: fix HYP/SVC mode mismatch when MCPM is used

2019-02-14 Thread Nicolas Pitre
On Thu, 14 Feb 2019, Marek Szyprowski wrote:

> MCPM does a soft reset of the CPUs and uses the common cpu_resume() routine to
> perform low-level platform initialization. This results in an attempt to
> install the HYP stubs a second time for each CPU and in a false HYP/SVC
> mode mismatch detection. The HYP stubs are already installed at the
> beginning of the kernel initialization on the boot CPU (head.S) or in
> secondary_startup() for other CPUs. To fix this issue, the MCPM code should
> use a cpu_resume() routine without HYP stub installation.
> 
> This change fixes HYP/SVC mode mismatch on Samsung Exynos5422-based Odroid
> XU3/XU4/HC1 boards.
> 
> Fixes: 3721924c8154 ("ARM: 8081/1: MCPM: provide infrastructure to allow for 
> MCPM loopback")
> Signed-off-by: Marek Szyprowski 

Acked-by: Nicolas Pitre 


> ---
>  arch/arm/common/mcpm_entry.c   |  2 +-
>  arch/arm/include/asm/suspend.h |  1 +
>  arch/arm/kernel/sleep.S| 11 +++
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
> index ad574d20415c..1b1b82b37ce0 100644
> --- a/arch/arm/common/mcpm_entry.c
> +++ b/arch/arm/common/mcpm_entry.c
> @@ -381,7 +381,7 @@ static int __init nocache_trampoline(unsigned long _arg)
>   unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>   phys_reset_t phys_reset;
>  
> - mcpm_set_entry_vector(cpu, cluster, cpu_resume);
> + mcpm_set_entry_vector(cpu, cluster, cpu_resume_no_hyp);
>   setup_mm_for_reboot();
>  
>   __mcpm_cpu_going_down(cpu, cluster);
> diff --git a/arch/arm/include/asm/suspend.h b/arch/arm/include/asm/suspend.h
> index 452bbdcbcc83..506314265c6f 100644
> --- a/arch/arm/include/asm/suspend.h
> +++ b/arch/arm/include/asm/suspend.h
> @@ -10,6 +10,7 @@ struct sleep_save_sp {
>  };
>  
>  extern void cpu_resume(void);
> +extern void cpu_resume_no_hyp(void);
>  extern void cpu_resume_arm(void);
>  extern int cpu_suspend(unsigned long, int (*)(unsigned long));
>  
> diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
> index a8257fc9cf2a..b856d183691e 100644
> --- a/arch/arm/kernel/sleep.S
> +++ b/arch/arm/kernel/sleep.S
> @@ -122,6 +122,11 @@ ENDPROC(cpu_resume_after_mmu)
>  
>  #ifdef CONFIG_MMU
>   .arm
> +#ifdef CONFIG_MCPM
> +ENTRY(cpu_resume_no_hyp)
> +ARM_BE8(setend be)   @ ensure we are in BE mode
> + b   0f
> +#endif
>  ENTRY(cpu_resume_arm)
>   THUMB(  badrr9, 1f  )   @ Kernel is entered in ARM.
>   THUMB(  bx  r9  )   @ If this is a Thumb-2 kernel,
> @@ -135,6 +140,9 @@ ARM_BE8(setend be)@ ensure we are 
> in BE mode
>   bl  __hyp_stub_install_secondary
>  #endif
>   safe_svcmode_maskall r1
> +#ifdef CONFIG_MCPM
> +0:
> +#endif
>   mov r1, #0
>   ALT_SMP(mrc p15, 0, r0, c0, c0, 5)
>   ALT_UP_B(1f)
> @@ -163,6 +171,9 @@ ENDPROC(cpu_resume)
>  
>  #ifdef CONFIG_MMU
>  ENDPROC(cpu_resume_arm)
> +#endif
> +#ifdef CONFIG_MCPM
> +ENDPROC(cpu_resume_no_hyp)
>  #endif
>  
>   .align 2
> -- 
> 2.17.1
> 
> 


[PATCH] vt: perform safe console erase in the right order

2019-02-11 Thread Nicolas Pitre
Commit 4b4ecd9cb853 ("vt: Perform safe console erase only once") removed
what appeared to be an extra call to scr_memsetw(). This missed the fact
that set_origin() must be called before clearing the screen otherwise
old screen content gets restored on the screen when using vgacon. Let's
fix that by moving all the scrollback handling to flush_scrollback()
where it logically belongs, and invoking it before the actual screen
clearing in csi_J(), making the code simpler in the end.

Reported-by: Matthew Whitehead 
Signed-off-by: Nicolas Pitre 
Tested-by: Matthew Whitehead 
Fixes: 4b4ecd9cb853 ("vt: Perform safe console erase only once")
Cc: sta...@vger.kernel.org

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index bba75560d1..9646ff63e7 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -935,8 +935,11 @@ static void flush_scrollback(struct vc_data *vc)
 {
WARN_CONSOLE_UNLOCKED();
 
+   set_origin(vc);
if (vc->vc_sw->con_flush_scrollback)
vc->vc_sw->con_flush_scrollback(vc);
+   else
+   vc->vc_sw->con_switch(vc);
 }
 
 /*
@@ -1503,8 +1506,10 @@ static void csi_J(struct vc_data *vc, int vpar)
count = ((vc->vc_pos - vc->vc_origin) >> 1) + 1;
start = (unsigned short *)vc->vc_origin;
break;
+   case 3: /* include scrollback */
+   flush_scrollback(vc);
+   /* fallthrough */
case 2: /* erase whole display */
-   case 3: /* (and scrollback buffer later) */
vc_uniscr_clear_lines(vc, 0, vc->vc_rows);
count = vc->vc_cols * vc->vc_rows;
start = (unsigned short *)vc->vc_origin;
@@ -1513,13 +1518,7 @@ static void csi_J(struct vc_data *vc, int vpar)
return;
}
scr_memsetw(start, vc->vc_video_erase_char, 2 * count);
-   if (vpar == 3) {
-   set_origin(vc);
-   flush_scrollback(vc);
-   if (con_is_visible(vc))
-   update_screen(vc);
-   } else if (con_should_update(vc))
-   do_update_region(vc, (unsigned long) start, count);
+   update_region(vc, (unsigned long) start, count);
vc->vc_need_wrap = 0;
 }
 


Re: [PATCH v2 0/5] ARM: convert to unified syntax

2019-02-10 Thread Nicolas Pitre
On Sun, 10 Feb 2019, Stefan Agner wrote:

> This patchset converts all assembly code to unified assembler
> language (UAL) compatible assembly code. From what I can tell,
> this mainly boils down to using conditional infixes instead of
> postfixes.
> 
> Most of the conversion has been done using the following regular
> expression:
>   find ./arch/arm/ -name "*.[hSc]" -exec sed -i -r \
> 
> "s/^((\s*[._a-zA-Z0-9]*[\:\(])?\s*)([a-z]{3})(eq|ne|cs|hs|cc|lo|mi|pl|vs|vc|hi|ls|ge|lt|gt|le|al)([a-z]{1,2})(\s)/\1\3\5\4\6/"
> \
> {} \;
> 
> The expression resulted in some false positives and missed some
> instances where infix conditionals have been used. With these
> changes applied, I compiled several kernel configurations
> successfully and without a warning.
> 
> The file arch/arm/probes/kprobes/test-arm.c is still using some
> divided syntax assembler.
> 
> This does not allow using LLVM's integrated assembler just yet;
> there is still some assembly which the integrated assembler does
> not like (yet). But it is a big step in that direction.

OK

Please document why you added .syntax unified in the commit log of each 
patch where this is added along with a link to the gcc PR.  Then you may 
add Acked-by: Nicolas Pitre  to the whole series.


Nicolas


Re: [PATCH 2/5] ARM: use unified assembler in headers

2019-02-07 Thread Nicolas Pitre
On Thu, 7 Feb 2019, Stefan Agner wrote:

> Use unified assembler syntax (UAL) in headers. Divided syntax is
> considered deprecated. This will also allow building the kernel
> using LLVM's integrated assembler.
> 
> Signed-off-by: Stefan Agner 
> ---
>  arch/arm/include/asm/assembler.h | 8 
>  arch/arm/include/asm/vfpmacros.h | 8 
>  arch/arm/lib/bitops.h| 8 
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm/include/asm/assembler.h 
> b/arch/arm/include/asm/assembler.h
> index 28a48e0d4cca..60465b55683c 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -376,7 +376,7 @@ THUMB(orr \reg , \reg , #PSR_T_BIT)
>   .macro  usraccoff, instr, reg, ptr, inc, off, cond, abort, t=TUSER()
>  :
>   .if \inc == 1
> - \instr\cond\()b\()\t\().w \reg, [\ptr, #\off]
> + \instr\()b\cond\()\t\().w \reg, [\ptr, #\off]

Similar comment here: you added a \() between \instr and b as needed, 
but the one between \cond and \t (which was already redundant before) 
may go.


Nicolas


Re: [PATCH 1/5] ARM: use unified assembler in macros

2019-02-07 Thread Nicolas Pitre
On Thu, 7 Feb 2019, Stefan Agner wrote:

> Use unified assembler syntax (UAL) in macros. Divided syntax is
> considered deprecated. This will also allow building the kernel
> using LLVM's integrated assembler.
> 
> Signed-off-by: Stefan Agner 
> ---
>  arch/arm/lib/copy_from_user.S | 2 +-
>  arch/arm/lib/copy_to_user.S   | 2 +-
>  arch/arm/lib/memcpy.S | 4 ++--
>  3 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
> index 0d4c189c7f4f..712ca399f559 100644
> --- a/arch/arm/lib/copy_from_user.S
> +++ b/arch/arm/lib/copy_from_user.S
> @@ -91,7 +91,7 @@
>   .endm
>  
>   .macro str1b ptr reg cond=al abort
> - str\cond\()b \reg, [\ptr], #1
> + strb\cond\() \reg, [\ptr], #1

You don't need the \() any longer. This was used simply to separate the 
b suffix from the cond parameter name. 


Nicolas


[PATCH] vgacon: unconfuse vc_origin when using soft scrollback

2019-01-20 Thread Nicolas Pitre
When CONFIG_VGACON_SOFT_SCROLLBACK is selected, the VGA display memory
index and vc_visible_origin don't change when scrollback is activated.
The actual screen content is saved away and the scrollback data is copied
over it. However, the vt code, and /dev/vcs devices in particular, still
expect vc_origin to always point at the actual screen content, not the
displayed scrollback content.

So adjust vc_origin to point at the saved screen content when scrollback
is active and set it back to vc_visible_origin when restoring the screen.

This fixes the /dev/vcsa devices that return scrollback content when
they shouldn't (only /dev/vcsa without a number should), and also fixes
/dev/vcsu which should return scrollback content when scrollback is
active but currently doesn't.

An unnecessary call to vga_set_mem_top() is also removed.

Signed-off-by: Nicolas Pitre 
Cc: sta...@vger.kernel.org # v4.19+

---

I tagged it for stable starting with v4.19 as this is the kernel that
introduced /dev/vcsu* which is directly affected. Users of earlier kernels
most likely won't care.

diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
index 09731b2f68..c6b3bdbbdb 100644
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -271,6 +271,7 @@ static void vgacon_scrollback_update(struct vc_data *c, int t, int count)
 
 static void vgacon_restore_screen(struct vc_data *c)
 {
+   c->vc_origin = c->vc_visible_origin;
vgacon_scrollback_cur->save = 0;
 
if (!vga_is_gfx && !vgacon_scrollback_cur->restore) {
@@ -287,8 +288,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int start, end, count, soff;
 
if (!lines) {
-   c->vc_visible_origin = c->vc_origin;
-   vga_set_mem_top(c);
+   vgacon_restore_screen(c);
return;
}
 
@@ -298,6 +298,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
if (!vgacon_scrollback_cur->save) {
vgacon_cursor(c, CM_ERASE);
vgacon_save_screen(c);
+   c->vc_origin = (unsigned long)c->vc_screenbuf;
vgacon_scrollback_cur->save = 1;
}
 
@@ -335,7 +336,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int copysize;
 
int diff = c->vc_rows - count;
-   void *d = (void *) c->vc_origin;
+   void *d = (void *) c->vc_visible_origin;
void *s = (void *) c->vc_screenbuf;
 
count *= c->vc_size_row;


Re: [PATCH 07/11] mmc: mvsdio: handle highmem pages

2019-01-14 Thread Nicolas Pitre
On Mon, 14 Jan 2019, Christoph Hellwig wrote:

> Instead of setting up a kernel pointer to track the current PIO address,
> track the offset in the current page, and do an atomic kmap for the page
> while doing the actual PIO operations.
> 
> Signed-off-by: Christoph Hellwig 

Just a small suggestion below


> ---
>  drivers/mmc/host/mvsdio.c | 48 +++
>  1 file changed, 29 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/mmc/host/mvsdio.c b/drivers/mmc/host/mvsdio.c
> index e22bbff89c8d..545e370d6dae 100644
> --- a/drivers/mmc/host/mvsdio.c
> +++ b/drivers/mmc/host/mvsdio.c
> @@ -12,6 +12,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -42,7 +43,8 @@ struct mvsd_host {
>   unsigned int intr_en;
>   unsigned int ctrl;
>   unsigned int pio_size;
> - void *pio_ptr;
> + struct scatterlist *pio_sg;
> + unsigned int pio_offset; /* offset in words into the segment */
>   unsigned int sg_frags;
>   unsigned int ns_per_clk;
>   unsigned int clock;
> @@ -96,9 +98,9 @@ static int mvsd_setup_data(struct mvsd_host *host, struct 
> mmc_data *data)
>   if (tmout_index > MVSD_HOST_CTRL_TMOUT_MAX)
>   tmout_index = MVSD_HOST_CTRL_TMOUT_MAX;
>  
> - dev_dbg(host->dev, "data %s at 0x%08x: blocks=%d blksz=%d tmout=%u 
> (%d)\n",
> + dev_dbg(host->dev, "data %s at 0x%08llx: blocks=%d blksz=%d tmout=%u 
> (%d)\n",
>   (data->flags & MMC_DATA_READ) ? "read" : "write",
> - (u32)sg_virt(data->sg), data->blocks, data->blksz,
> + (u64)sg_phys(data->sg), data->blocks, data->blksz,
>   tmout, tmout_index);
>  
>   host->ctrl &= ~MVSD_HOST_CTRL_TMOUT_MASK;
> @@ -118,10 +120,11 @@ static int mvsd_setup_data(struct mvsd_host *host, 
> struct mmc_data *data)
>* boundary.
>*/
>   host->pio_size = data->blocks * data->blksz;
> - host->pio_ptr = sg_virt(data->sg);
> + host->pio_sg = data->sg;
> + host->pio_offset = data->sg->offset / 2;
>   if (!nodma)
> - dev_dbg(host->dev, "fallback to PIO for data at 0x%p 
> size %d\n",
> - host->pio_ptr, host->pio_size);
> + dev_dbg(host->dev, "fallback to PIO for data at 0x%x 
> size %d\n",
> + host->pio_offset, host->pio_size);
>   return 1;
>   } else {
>   dma_addr_t phys_addr;
> @@ -291,8 +294,9 @@ static u32 mvsd_finish_data(struct mvsd_host *host, 
> struct mmc_data *data,
>  {
>   void __iomem *iobase = host->base;
>  
> - if (host->pio_ptr) {
> - host->pio_ptr = NULL;
> + if (host->pio_sg) {
> + host->pio_sg = NULL;
> + host->pio_offset = 0;
>   host->pio_size = 0;
>   } else {
>   dma_unmap_sg(mmc_dev(host->mmc), data->sg, host->sg_frags,
> @@ -376,11 +380,12 @@ static irqreturn_t mvsd_irq(int irq, void *dev)
>   if (host->pio_size &&
>   (intr_status & host->intr_en &
>(MVSD_NOR_RX_READY | MVSD_NOR_RX_FIFO_8W))) {
> - u16 *p = host->pio_ptr;
> + u16 *p = kmap_atomic(sg_page(host->pio_sg));
> + unsigned int o = host->pio_offset;

To minimize code change, you could do:

u16 *p0 = kmap_atomic(sg_page(host->pio_sg));
u16 *p = p0 + host->pio_offset;

and at the end:

host->pio_offset = p - p0;

leaving the middle code unchanged.

This might also produce slightly better assembly with the post increment 
instructions on ARM if the compiler doesn't figure it out on its own 
otherwise.
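
Spelled out, the suggested pattern would look something like this (just a
sketch reusing the names from the patch; the existing FIFO transfer loop in
the middle is left out):

        u16 *p0 = kmap_atomic(sg_page(host->pio_sg)); /* map the segment's page */
        u16 *p = p0 + host->pio_offset;               /* resume at the saved word offset */

        /* ... unchanged FIFO read/write loop operating on *p++ ... */

        host->pio_offset = p - p0;  /* remember how far we got, in words */
        kunmap_atomic(p0);          /* drop the atomic mapping */

That keeps the body of the interrupt handler identical while still confining
the highmem mapping to this function.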


Nicolas


Re: commit 83d817f410 broke my ability to use Linux with a braille display

2019-01-11 Thread Nicolas Pitre
On Fri, 11 Jan 2019, Greg Kroah-Hartman wrote:

> On Fri, Jan 11, 2019 at 01:33:09PM -0500, Nicolas Pitre wrote:
> > I use Linux with the help of a braille display and the brltty daemon. It 
> > turns out that the latest mainline kernel I can work with comes from 
> > commit 231f8fd0cc. Anything past that and I lose the ability to read the 
> > console barely a few seconds after the system has booted as brltty is 
> > thrown a wrench and the braille display becomes completely inoperable.
> 
> Please try the patch below, it was just queued up to my tree and should
> resolve the issue.  If not, please let us know.

Yes, it works. Thanks.

I also re-validated all the vt/vcs patches I sent you this week for 
which I started entertaining some doubts about their stability. But they 
all work fine again with the tty fix applied.


Nicolas


Re: commit 83d817f410 broke my ability to use Linux with a braille display

2019-01-11 Thread Nicolas Pitre
On Fri, 11 Jan 2019, Vito Caputo wrote:

> On Fri, Jan 11, 2019 at 01:33:09PM -0500, Nicolas Pitre wrote:
> > I use Linux with the help of a braille display and the brltty daemon. It 
> > turns out that the latest mainline kernel I can work with comes from 
> > commit 231f8fd0cc. Anything past that and I lose the ability to read the 
> > console barely a few seconds after the system has booted as brltty is 
> > thrown a wrench and the braille display becomes completely inoperable.
> > 
> > Things get somewhat better with commit c96cf923a9 as brltty is no
> > longer incapacitated, but some programs would randomly crash. Even the
> > very first login attempt won't work: as soon as I hit enter after my
> > user name, the password prompt is skipped over, just as if the enter
> > key had been hit twice. Then lynx (the text web browser) would crash as 
> > soon as I switch the virtual console with LeftAlt+FN. Mind you, this 
> > isn't easy to perform bisection in those conditions.
> > 
> > And the worst commit i.e. 83d817f410 is marked for stable!  :-(
> 
> This all sounds familiar, and I suspect this is the fix:
> 
> https://lkml.org/lkml/2019/1/8/1379

Yep, I confirm this patch did solve all my issues.


Nicolas


commit 83d817f410 broke my ability to use Linux with a braille display

2019-01-11 Thread Nicolas Pitre
I use Linux with the help of a braille display and the brltty daemon. It 
turns out that the latest mainline kernel I can work with comes from 
commit 231f8fd0cc. Anything past that and I lose the ability to read the 
console barely a few seconds after the system has booted as brltty is 
thrown a wrench and the braille display becomes completely inoperable.

Things get somewhat better with commit c96cf923a9 as brltty is no
longer incapacitated, but some programs would randomly crash. Even the
very first login attempt won't work: as soon as I hit enter after my
user name, the password prompt is skipped over, just as if the enter
key had been hit twice. Then lynx (the text web browser) would crash as 
soon as I switch the virtual console with LeftAlt+FN. Mind you, this 
isn't easy to perform bisection in those conditions.

And the worst commit i.e. 83d817f410 is marked for stable!  :-(

Some interaction with brltty must be at play here otherwise such 
breakage would never have survived up to the mainline kernel.

As far as latest mainline is concerned, I managed to reproduce at least 
one of the unwelcome behavior changes (hoping that's all there is to this 
issue) with a very simple test case so you won't have to learn braille 
to debug this:

# from any vt, make sure tty40 is allocated and empty
openvt -c 40 -f -- true

# open it and wait on read()
cat /dev/tty40

# from a second vt, simply open tty40 again
true < /dev/tty40

# come back to the first vt and watch cat bailing out with EAGAIN.

Please fix.


Nicolas


Re: [PATCH] vgacon: unconfuse vc_origin when using soft scrollback

2019-01-10 Thread Nicolas Pitre
On Thu, 10 Jan 2019, Nicolas Pitre wrote:

> The vt code is just a frigging mess of integer variables and pointer 
> variables referring to the same thing. It is therefore littered with 
> casts all over the place. And of course I missed one in this patch.  
> 
> Here's a revised patch to quiet a warning:

And of course I forgot to commit and reposted the same patch 
with the missing cast. So here it is again:

- >8

Subject: [PATCH] vgacon: unconfuse vc_origin when using soft scrollback

When CONFIG_VGACON_SOFT_SCROLLBACK is selected, the VGA display memory
index and vc_visible_origin don't change when scrollback is activated.
The actual screen content is saved away and the scrollback data is copied
over it. However, the vt code, and /dev/vcs devices in particular, still
expect vc_origin to always point at the actual screen content, not the
displayed scrollback content.

So adjust vc_origin to point at the saved screen content when scrollback
is active and set it back to vc_visible_origin when restoring the screen.

This fixes the /dev/vcsa devices that return scrollback content when
they shouldn't (only /dev/vcsa without a number should), and also fixes
/dev/vcsu which should return scrollback content when scrollback is
active but currently doesn't.

An unnecessary call to vga_set_mem_top() is also removed.

Signed-off-by: Nicolas Pitre 
Cc: sta...@vger.kernel.org # v4.19+

---

I tagged it for stable starting with v4.19 as this is the kernel that
introduced /dev/vcsu* which is directly affected. Users of earlier kernels
most likely won't care.

diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
index 09731b2f68..c6b3bdbbdb 100644
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -271,6 +271,7 @@ static void vgacon_scrollback_update(struct vc_data *c, int t, int count)
 
 static void vgacon_restore_screen(struct vc_data *c)
 {
+   c->vc_origin = c->vc_visible_origin;
vgacon_scrollback_cur->save = 0;
 
if (!vga_is_gfx && !vgacon_scrollback_cur->restore) {
@@ -287,8 +288,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int start, end, count, soff;
 
if (!lines) {
-   c->vc_visible_origin = c->vc_origin;
-   vga_set_mem_top(c);
+   vgacon_restore_screen(c);
return;
}
 
@@ -298,6 +298,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
if (!vgacon_scrollback_cur->save) {
vgacon_cursor(c, CM_ERASE);
vgacon_save_screen(c);
+   c->vc_origin = (unsigned long)c->vc_screenbuf;
vgacon_scrollback_cur->save = 1;
}
 
@@ -335,7 +336,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int copysize;
 
int diff = c->vc_rows - count;
-   void *d = (void *) c->vc_origin;
+   void *d = (void *) c->vc_visible_origin;
void *s = (void *) c->vc_screenbuf;
 
count *= c->vc_size_row;


Re: [PATCH] vgacon: unconfuse vc_origin when using soft scrollback

2019-01-10 Thread Nicolas Pitre
On Thu, 10 Jan 2019, Nicolas Pitre wrote:

> When CONFIG_VGACON_SOFT_SCROLLBACK is selected, the VGA display memory
> index and vc_visible_origin don't change when scrollback is activated.
> The actual screen content is saved away and the scrollback data is copied
> over it. However, the vt code, and /dev/vcs devices in particular, still
> expect vc_origin to always point at the actual screen content, not the
> displayed scrollback content.
> 
> So adjust vc_origin to point at the saved screen content when scrollback
> is active and set it back to vc_visible_origin when restoring the screen.
> 
> This fixes the /dev/vcsa devices that return scrollback content when
> they shouldn't (only /dev/vcsa without a number should), and also fixes
> /dev/vcsu which should return scrollback content when scrollback is
> active but currently doesn't.

The vt code is just a frigging mess of integer variables and pointer 
variables referring to the same thing. It is therefore littered with 
casts all over the place. And of course I missed one in this patch.  

Here's a revised patch to quiet a warning:

- >8
Subject: [PATCH] vgacon: unconfuse vc_origin when using soft scrollback

When CONFIG_VGACON_SOFT_SCROLLBACK is selected, the VGA display memory
index and vc_visible_origin don't change when scrollback is activated.
The actual screen content is saved away and the scrollback data is copied
over it. However, the vt code, and /dev/vcs devices in particular, still
expect vc_origin to always point at the actual screen content, not the
displayed scrollback content.

So adjust vc_origin to point at the saved screen content when scrollback
is active and set it back to vc_visible_origin when restoring the screen.

This fixes the /dev/vcsa devices that return scrollback content when
they shouldn't (only /dev/vcsa without a number should), and also fixes
/dev/vcsu which should return scrollback content when scrollback is
active but currently doesn't.

An unnecessary call to vga_set_mem_top() is also removed.

Signed-off-by: Nicolas Pitre 
Cc: sta...@vger.kernel.org # v4.19+

---

I tagged it for stable starting with v4.19 as this is the kernel that
introduced /dev/vcsu* which is directly affected. Users of earlier kernels
most likely won't care.

diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
index 09731b2f68..a3353a9970 100644
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -271,6 +271,7 @@ static void vgacon_scrollback_update(struct vc_data *c, int t, int count)
 
 static void vgacon_restore_screen(struct vc_data *c)
 {
+   c->vc_origin = c->vc_visible_origin;
vgacon_scrollback_cur->save = 0;
 
if (!vga_is_gfx && !vgacon_scrollback_cur->restore) {
@@ -287,8 +288,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int start, end, count, soff;
 
if (!lines) {
-   c->vc_visible_origin = c->vc_origin;
-   vga_set_mem_top(c);
+   vgacon_restore_screen(c);
return;
}
 
@@ -298,6 +298,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
if (!vgacon_scrollback_cur->save) {
vgacon_cursor(c, CM_ERASE);
vgacon_save_screen(c);
+   c->vc_origin = c->vc_screenbuf;
vgacon_scrollback_cur->save = 1;
}
 
@@ -335,7 +336,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int copysize;
 
int diff = c->vc_rows - count;
-   void *d = (void *) c->vc_origin;
+   void *d = (void *) c->vc_visible_origin;
void *s = (void *) c->vc_screenbuf;
 
count *= c->vc_size_row;



[PATCH] vgacon: unconfuse vc_origin when using soft scrollback

2019-01-10 Thread Nicolas Pitre
When CONFIG_VGACON_SOFT_SCROLLBACK is selected, the VGA display memory
index and vc_visible_origin don't change when scrollback is activated.
The actual screen content is saved away and the scrollback data is copied
over it. However, the vt code, and /dev/vcs devices in particular, still
expect vc_origin to always point at the actual screen content, not the
displayed scrollback content.

So adjust vc_origin to point at the saved screen content when scrollback
is active and set it back to vc_visible_origin when restoring the screen.

This fixes the /dev/vcsa devices that return scrollback content when
they shouldn't (only /dev/vcsa without a number should), and also fixes
/dev/vcsu which should return scrollback content when scrollback is
active but currently doesn't.

An unnecessary call to vga_set_mem_top() is also removed.

Signed-off-by: Nicolas Pitre 
Cc: sta...@vger.kernel.org # v4.19+

---

I tagged it for stable starting with v4.19 as this is the kernel that
introduced /dev/vcsu* which is directly affected. Users of earlier kernels
most likely won't care.

diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
index 09731b2f68..a3353a9970 100644
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -271,6 +271,7 @@ static void vgacon_scrollback_update(struct vc_data *c, int t, int count)
 
 static void vgacon_restore_screen(struct vc_data *c)
 {
+   c->vc_origin = c->vc_visible_origin;
vgacon_scrollback_cur->save = 0;
 
if (!vga_is_gfx && !vgacon_scrollback_cur->restore) {
@@ -287,8 +288,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int start, end, count, soff;
 
if (!lines) {
-   c->vc_visible_origin = c->vc_origin;
-   vga_set_mem_top(c);
+   vgacon_restore_screen(c);
return;
}
 
@@ -298,6 +298,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
if (!vgacon_scrollback_cur->save) {
vgacon_cursor(c, CM_ERASE);
vgacon_save_screen(c);
+   c->vc_origin = c->vc_screenbuf;
vgacon_scrollback_cur->save = 1;
}
 
@@ -335,7 +336,7 @@ static void vgacon_scrolldelta(struct vc_data *c, int lines)
int copysize;
 
int diff = c->vc_rows - count;
-   void *d = (void *) c->vc_origin;
+   void *d = (void *) c->vc_visible_origin;
void *s = (void *) c->vc_screenbuf;
 
count *= c->vc_size_row;


[PATCH 7/6] vcs: restore and document initial POLLPRI event

2019-01-09 Thread Nicolas Pitre
Restore and document the forced initial POLLPRI event reporting when
poll() is used for the first time. This used to be the implemented
behavior before recent changes. Because of the way poll() is implemented,
this prevents losing an event happening between the last read() and the
first poll() invocation.

Since poll() for /dev/vcs* was not always supported, user space probes
for its availability as follows:

int fd = open("/dev/vcsa", O_RDONLY);
struct pollfd p = { .fd = fd, .events = POLLPRI };
available = (poll(&p, 1, 0) == 1);

Semantically, it makes sense to signal the first event as such even if
it might be spurious. The screen could be modified, and modified back
to its initial state before we get to read it, so users must be prepared
for that anyway.
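
Put together, a minimal consumer of this interface might look like the
sketch below (illustrative only; a real user such as brltty does much more,
and error handling is omitted):

#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

int main(void)
{
        char buf[4096];
        int fd = open("/dev/vcsa", O_RDONLY);
        struct pollfd p = { .fd = fd, .events = POLLPRI };

        /* the very first poll() reports POLLPRI even without an update (see above) */
        while (poll(&p, 1, -1) > 0 && (p.revents & POLLPRI)) {
                lseek(fd, 0, SEEK_SET);         /* reading is what rearms the event */
                read(fd, buf, sizeof(buf));     /* grab the current screen snapshot */
                /* ... hand the snapshot to the screen reader ... */
        }
        close(fd);
        return 0;
}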

Signed-off-by: Nicolas Pitre 

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index 1d887113ff..160f46115a 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -140,6 +140,15 @@ vcs_poll_data_get(struct file *file)
poll->cons_num = console(file_inode(file));
init_waitqueue_head(&poll->waitq);
poll->notifier.notifier_call = vcs_notifier;
+   /*
+* In order not to lose any update event, we must pretend one might
+* have occurred before we have a chance to register our notifier.
+* This is also how user space has come to detect which kernels
+* support POLLPRI on /dev/vcs* devices i.e. using poll() with
+* POLLPRI and a zero timeout.
+*/
+   poll->event = VT_UPDATE;
+
if (register_vt_notifier(&poll->notifier) != 0) {
kfree(poll);
return NULL;


Re: [PATCH 5/6] vcs: poll(): cope with a deallocated vt

2019-01-09 Thread Nicolas Pitre
On Tue, 8 Jan 2019, Nicolas Pitre wrote:

> When VT_DISALLOCATE is used on a vt, user space waiting with poll() on
> the corresponding /dev/vcs device is not awakened. This is now fixed by
> returning POLLHUP|POLLERR to user space.
> 
> Also, in the normal screen update case, we don't set POLLERR anymore as
> POLLPRI alone is a much more logical response in a non-error situation,
> saving some confusion on the user space side. The only known user app
> making use of poll() on /dev/vcs* is BRLTTY which is known to cope with
> that change already, so the risk of breakage is pretty much nonexistent.
> 
> Signed-off-by: Nicolas Pitre 

That patch introduced a small unwanted behavior change that I intend to 
fix in a follow-up patch (it will be tagged [PATCH 7/6]). I prefer to go 
with a separate patch rather than respinning this one as this gives me 
the opportunity to separately document said behavior.

Nicolas


[PATCH 5/6] vcs: poll(): cope with a deallocated vt

2019-01-08 Thread Nicolas Pitre
When VT_DISALLOCATE is used on a vt, user space waiting with poll() on
the corresponding /dev/vcs device is not awakened. This is now fixed by
returning POLLHUP|POLLERR to user space.

Also, in the normal screen update case, we don't set POLLERR anymore as
POLLPRI alone is a much more logical response in a non-error situation,
saving some confusion on the user space side. The only known user app
making use of poll() on /dev/vcs* is BRLTTY which is known to cope with
that change already, so the risk of breakage is pretty much nonexistent.
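
With this in place, a poller can tell a plain screen update apart from the
vt going away, along these lines (sketch only, assuming fd is an already
open /dev/vcs* descriptor):

        struct pollfd p = { .fd = fd, .events = POLLPRI };

        if (poll(&p, 1, -1) > 0) {
                if (p.revents & POLLHUP) {
                        /* the vt was deallocated: stop using this descriptor */
                } else if (p.revents & POLLPRI) {
                        /* normal screen update: re-read the console content */
                }
        }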

Signed-off-by: Nicolas Pitre 
---
 drivers/tty/vt/vc_screen.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index 3dba60825c..1bbe2a30cd 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -80,7 +80,7 @@
 struct vcs_poll_data {
struct notifier_block notifier;
unsigned int cons_num;
-   bool seen_last_update;
+   int event;
wait_queue_head_t waitq;
struct fasync_struct *fasync;
 };
@@ -94,7 +94,7 @@ vcs_notifier(struct notifier_block *nb, unsigned long code, void *_param)
container_of(nb, struct vcs_poll_data, notifier);
int currcons = poll->cons_num;
 
-   if (code != VT_UPDATE)
+   if (code != VT_UPDATE && code != VT_DEALLOCATE)
return NOTIFY_DONE;
 
if (currcons == 0)
@@ -104,7 +104,7 @@ vcs_notifier(struct notifier_block *nb, unsigned long code, void *_param)
if (currcons != vc->vc_num)
return NOTIFY_DONE;
 
-   poll->seen_last_update = false;
+   poll->event = code;
wake_up_interruptible(&poll->waitq);
kill_fasync(&poll->fasync, SIGIO, POLL_IN);
return NOTIFY_OK;
@@ -261,7 +261,7 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 
poll = file->private_data;
if (count && poll)
-   poll->seen_last_update = true;
+   poll->event = 0;
read = 0;
ret = 0;
while (count) {
@@ -616,12 +616,21 @@ static __poll_t
 vcs_poll(struct file *file, poll_table *wait)
 {
struct vcs_poll_data *poll = vcs_poll_data_get(file);
-   __poll_t ret = DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
+   __poll_t ret = DEFAULT_POLLMASK|EPOLLERR;
 
if (poll) {
poll_wait(file, &poll->waitq, wait);
-   if (poll->seen_last_update)
+   switch (poll->event) {
+   case VT_UPDATE:
+   ret = DEFAULT_POLLMASK|EPOLLPRI;
+   break;
+   case VT_DEALLOCATE:
+   ret = DEFAULT_POLLMASK|EPOLLHUP|EPOLLERR;
+   break;
+   case 0:
ret = DEFAULT_POLLMASK;
+   break;
+   }
}
return ret;
 }
-- 
2.20.1



[PATCH 6/6] vcs: fasync(): make it consistent with poll()

2019-01-08 Thread Nicolas Pitre
We use POLLPRI not POLLIN to wait for data with poll() as there is
never any incoming data stream per se. Let's use the same semantics
with fasync() for consistency, including the fact that a vt may go away.

No known user space ever relied on the SIGIO reason so far, let alone
FASYNC, so the risk of breakage is pretty much nonexistent.
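
Should someone want to rely on it, the bands could be consumed roughly as
follows (a sketch under the assumption that a real-time signal is selected
with F_SETSIG so that si_band gets delivered; no existing program is known
to do this, as noted above):

#define _GNU_SOURCE
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

static void vcs_sig(int sig, siginfo_t *si, void *ctx)
{
        if (si->si_band & POLLPRI) {
                /* screen updated: schedule a re-read of /dev/vcs* */
        } else if (si->si_band & POLLHUP) {
                /* the vt was deallocated: drop this descriptor */
        }
}

int main(void)
{
        int fd = open("/dev/vcsa1", O_RDONLY);  /* example device */
        struct sigaction sa = { .sa_sigaction = vcs_sig, .sa_flags = SA_SIGINFO };

        sigaction(SIGRTMIN, &sa, NULL);
        fcntl(fd, F_SETOWN, getpid());
        fcntl(fd, F_SETSIG, SIGRTMIN);          /* have si_band filled in on delivery */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_ASYNC);
        for (;;)
                pause();
}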

Signed-off-by: Nicolas Pitre 
---
 drivers/tty/vt/vc_screen.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index 1bbe2a30cd..1d887113ff 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -93,9 +93,18 @@ vcs_notifier(struct notifier_block *nb, unsigned long code, void *_param)
struct vcs_poll_data *poll =
container_of(nb, struct vcs_poll_data, notifier);
int currcons = poll->cons_num;
-
-   if (code != VT_UPDATE && code != VT_DEALLOCATE)
+   int fa_band;
+
+   switch (code) {
+   case VT_UPDATE:
+   fa_band = POLL_PRI;
+   break;
+   case VT_DEALLOCATE:
+   fa_band = POLL_HUP;
+   break;
+   default:
return NOTIFY_DONE;
+   }
 
if (currcons == 0)
currcons = fg_console;
@@ -106,7 +115,7 @@ vcs_notifier(struct notifier_block *nb, unsigned long code, void *_param)
 
poll->event = code;
wake_up_interruptible(&poll->waitq);
-   kill_fasync(&poll->fasync, SIGIO, POLL_IN);
+   kill_fasync(&poll->fasync, SIGIO, fa_band);
return NOTIFY_OK;
 }
 
-- 
2.20.1



[PATCH 4/6] vcsa: clamp header values when they don't fit

2019-01-08 Thread Nicolas Pitre
The /dev/vcsa* devices have a fixed char-sized header that stores the
screen geometry and cursor location. Let's make sure it doesn't contain
random garbage when those values exceed 255. If ever it becomes necessary
to convey larger screen info to user space then a larger header in the
not-yet-implemented /dev/vcsua* devices should be considered.
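
For reference, the header in question is what user space sees at the very
beginning of /dev/vcsa* (a reading sketch only, includes and error handling
omitted; the values above 255 are what this patch clamps):

        unsigned char hdr[4];
        int fd = open("/dev/vcsa", O_RDONLY);   /* or a numbered /dev/vcsaN */

        if (fd >= 0 && read(fd, hdr, sizeof(hdr)) == sizeof(hdr)) {
                unsigned int rows = hdr[0];     /* vc_rows, clamped to 255 */
                unsigned int cols = hdr[1];     /* vc_cols, clamped to 255 */
                unsigned int x    = hdr[2];     /* cursor column, clamped to 255 */
                unsigned int y    = hdr[3];     /* cursor row, clamped to 255 */
                /* character/attribute pairs follow from here on */
        }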

Signed-off-by: Nicolas Pitre 
---
 drivers/tty/vt/vc_screen.c | 5 +++--
 drivers/tty/vt/vt.c| 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c
index 2384ea85ff..3dba60825c 100644
--- a/drivers/tty/vt/vc_screen.c
+++ b/drivers/tty/vt/vc_screen.c
@@ -335,8 +335,9 @@ vcs_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
if (p < HEADER_SIZE) {
size_t tmp_count;
 
-   con_buf0[0] = (char)vc->vc_rows;
-   con_buf0[1] = (char)vc->vc_cols;
+   /* clamp header values if they don't fit */
+   con_buf0[0] = min(vc->vc_rows, 0xFFu);
+   con_buf0[1] = min(vc->vc_cols, 0xFFu);
getconsxy(vc, con_buf0 + 2);
 
con_buf_start += p;
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index bba75560d1..f519c22e70 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -4591,8 +4591,9 @@ EXPORT_SYMBOL_GPL(screen_pos);
 
 void getconsxy(struct vc_data *vc, unsigned char *p)
 {
-   p[0] = vc->vc_x;
-   p[1] = vc->vc_y;
+   /* clamp values if they don't fit */
+   p[0] = min(vc->vc_x, 0xFFu);
+   p[1] = min(vc->vc_y, 0xFFu);
 }
 
 void putconsxy(struct vc_data *vc, unsigned char *p)
-- 
2.20.1



[PATCH 0/6] assorted console/vt/vcs fixes

2019-01-08 Thread Nicolas Pitre
Hello Greg,

Here's a few patches fixing various issues. The first 2 are most
certainly stable material. I don't know if the other issues are serious enough 
to be stable worthy though.
Please feel free to tag them as such if so.


Nicolas


