[PATCH RFC rebase 0/9] powerpc barrier_nospec

2018-03-15 Thread Michal Suchanek
Yes, it is a good idea to add some commit messages.

Also I rebased the patches on top of v3 of the series

Setup RFI flush after PowerVM LPM migration

Thanks

Michal

Michal Suchanek (9):
  powerpc: Add barrier_nospec
  powerpc: Use barrier_nospec in copy_from_user
  powerpc/64: Use barrier_nospec in syscall entry
  powerpc/64s: Use barrier_nospec in RFI_FLUSH_SLOT
  powerpc/64s: Add support for ori barrier_nospec patching
  powerpc/64: Patch barrier_nospec in modules
  powerpc/64: barrier_nospec: Add debugfs trigger
  powerpc/64s: barrier_nospec: Add hcall trigger
  powerpc/64: barrier_nospec: Add commandline trigger

 arch/powerpc/include/asm/barrier.h        |  9
 arch/powerpc/include/asm/exception-64s.h  |  2 +-
 arch/powerpc/include/asm/feature-fixups.h |  9
 arch/powerpc/include/asm/setup.h          | 11
 arch/powerpc/include/asm/uaccess.h        | 11 +++-
 arch/powerpc/kernel/entry_64.S            |  3 ++
 arch/powerpc/kernel/module.c              |  6 +++
 arch/powerpc/kernel/setup_64.c            | 87 +++
 arch/powerpc/kernel/vmlinux.lds.S         |  7 +++
 arch/powerpc/lib/feature-fixups.c         | 47 ++---
 arch/powerpc/platforms/pseries/mobility.c |  2 +-
 arch/powerpc/platforms/pseries/pseries.h  |  2 +-
 arch/powerpc/platforms/pseries/setup.c    | 37 +
 13 files changed, 213 insertions(+), 20 deletions(-)

-- 
2.13.6



[PATCH RFC rebase 3/9] powerpc/64: Use barrier_nospec in syscall entry

2018-03-15 Thread Michal Suchanek
On powerpc syscall entry is done in assembly so patch in an explicit
barrier_nospec.

Signed-off-by: Michal Suchanek <msucha...@suse.de>
---
 arch/powerpc/kernel/entry_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2cb5109a7ea3..7bfc4cf48af2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #ifdef CONFIG_PPC_BOOK3S
 #include 
@@ -159,6 +160,7 @@ system_call: /* label this so stack traces look sane */
andi.   r11,r10,_TIF_SYSCALL_DOTRACE
bne .Lsyscall_dotrace   /* does not return */
cmpldi  0,r0,NR_syscalls
+   barrier_nospec
bge-.Lsyscall_enosys
 
 .Lsyscall:
@@ -319,6 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld  r10,TI_FLAGS(r10)
 
cmpldi  r0,NR_syscalls
+   barrier_nospec
blt+.Lsyscall
 
/* Return code is already in r3 thanks to do_syscall_trace_enter() */
-- 
2.13.6
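
For readers less familiar with the assembly, the placement in the hunks above can be pictured in C. This is only a user-space sketch under stated assumptions: spec_barrier(), dispatch(), NR_CALLS and the handler table are hypothetical names, and the barrier is an empty placeholder rather than the real powerpc instruction sequence. In the patch the barrier is emitted between the compare and the conditional branch; C cannot express that placement, so the sketch puts it on the in-range path, before the table load that depends on the checked number.

```c
#include <errno.h>

#define NR_CALLS 3  /* hypothetical table size, stands in for NR_syscalls */

typedef long (*call_fn)(long arg);

static long call_double(long arg) { return arg * 2; }
static long call_negate(long arg) { return -arg; }
static long call_ident(long arg)  { return arg; }

static const call_fn call_table[NR_CALLS] = {
    call_double, call_negate, call_ident,
};

/*
 * Placeholder only (assumption): a real speculation barrier is an
 * arch-specific instruction sequence. This empty function just marks
 * where it would go.
 */
static inline void spec_barrier(void) { }

/* Bounds-check the call number, then index the table. */
static long dispatch(unsigned long nr, long arg)
{
    if (nr >= NR_CALLS)
        return -ENOSYS;          /* out of range: never touch the table */
    spec_barrier();              /* between the check and the dependent load */
    return call_table[nr](arg);
}
```

The ordering is the whole point of the sketch: nothing that depends on the call number runs until the range check and the barrier are behind it.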



[PATCH RFC rebase 7/9] powerpc/64: barrier_nospec: Add debugfs trigger

2018-03-15 Thread Michal Suchanek
Copied from the rfi_flush debugfs implementation.

Signed-off-by: Michal Suchanek <msucha...@suse.de>
---
 arch/powerpc/kernel/setup_64.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f60e0e3b5ad2..f6678a7b6114 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -963,6 +963,41 @@ static __init int rfi_flush_debugfs_init(void)
return 0;
 }
 device_initcall(rfi_flush_debugfs_init);
+
+static int barrier_nospec_set(void *data, u64 val)
+{
+   switch (val) {
+   case 0:
+   case 1:
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (!!val == !!barrier_nospec_enabled)
+   return 0;
+
+   barrier_nospec_enable(!!val);
+
+   return 0;
+}
+
+static int barrier_nospec_get(void *data, u64 *val)
+{
+   *val = barrier_nospec_enabled ? 1 : 0;
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec,
+   barrier_nospec_get, barrier_nospec_set, "%llu\n");
+
+static __init int barrier_nospec_debugfs_init(void)
+{
+   debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL,
+   &fops_barrier_nospec);
+   return 0;
+}
+device_initcall(barrier_nospec_debugfs_init);
 #endif
 
 ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
-- 
2.13.6
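
The setter logic in this patch can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel code: nospec_enabled, nospec_enable() and patch_count are hypothetical stand-ins for barrier_nospec_enabled and barrier_nospec_enable(), and the counter exists only to show that a write which does not change the state skips the patching step.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

static bool nospec_enabled;   /* stand-in for barrier_nospec_enabled */
static unsigned patch_count;  /* how often the (expensive) patching ran */

/* Stand-in for barrier_nospec_enable(): record state, count the call. */
static void nospec_enable(bool enable)
{
    nospec_enabled = enable;
    patch_count++;
}

/* Accept only 0 or 1; do nothing if the requested state is already set. */
static int nospec_set(uint64_t val)
{
    if (val > 1)
        return -EINVAL;
    if ((val != 0) == nospec_enabled)
        return 0;
    nospec_enable(val != 0);
    return 0;
}

/* Report the current state as 0 or 1. */
static int nospec_get(uint64_t *val)
{
    *val = nospec_enabled ? 1 : 0;
    return 0;
}
```

Writing the same value twice leaves patch_count unchanged, which mirrors the early return in barrier_nospec_set().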



Re: doing lots of disk writes causes oom killer to kill processes

2013-10-09 Thread Michal Suchanek
Hello,

On 19 September 2013 12:13, Jan Kara  wrote:
> On Wed 18-09-13 16:56:08, Michal Suchanek wrote:
>> On 17 September 2013 23:13, Jan Kara  wrote:
>> >   Hello,
>>
>> The default for dirty_ratio/dirty_background_ratio is 60/40. Setting
>   Ah, that's not upstream default. Upstream has 20/10. In SLES we use 40/10
> to better accommodate some workloads but 60/40 on 8 GB machines with
> SATA drive really seems too much. That is going to give memory management a
> headache.
>
> The problem is that a good SATA drive can do ~100 MB/s if we are
> lucky and IO is sequential. Thus if you have 5 GB of dirty data to write,
> it takes 50s at best to write it, with more random IO to image file it can
> well take several minutes to write. That may cause some increased latency
> when memory reclaim waits for writeback to clean some pages.
>
>> these to 5/2 gives about the same result as running the script that
>> syncs every 5s. Setting to 30/10 gives larger data chunks and
>> intermittent lockup before every chunk is written.
>>
>> It is quite possible to set kernel parameters that kill the kernel but
>>
>> 1) this is the default
>   Not upstream one so you should raise this with Debian I guess. 60/40
> looks way out of reasonable range for today's machines.
>
>> 2) the parameter is set in units that do not prevent the issue in
>> general (% RAM vs #blocks)
>   You can set the number of bytes instead of percentage -
> /proc/sys/vm/dirty_bytes / dirty_background_bytes. It's just that proper
> sizing depends on amount of memory, storage HW, workload. So it's more an
> administrative task to set this tunable properly.
>
>> 3) WTH is the system doing? It's 4core 3GHz cpu so it can handle
>> traversing a structure holding 800M data in the background. Something
>> is seriously rotten somewhere.
>   Likely processes are waiting in direct reclaim for IO to finish. But that
> is just guessing. Try running attached script (forgot to attach it to
> previous email). You will need systemtap and kernel debuginfo installed.
> The script doesn't work with all versions of systemtap (as it is sadly a
> moving target) so if it fails, tell me your version of systemtap and I'll
> update the script accordingly.
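
The arithmetic in the quoted explanation can be sketched in a few lines of C. This is a rough model, not what the kernel does: it treats the ratio as a percentage of total RAM (the kernel applies it to dirtyable memory, which is smaller), it uses decimal units, and the function names are made up for illustration.

```c
#include <stdint.h>

/*
 * Simplified dirty limit: ratio_percent of total RAM.
 * Assumption: the real limit is computed from dirtyable memory.
 */
static uint64_t dirty_limit_bytes(uint64_t ram_bytes, unsigned ratio_percent)
{
    return ram_bytes / 100 * ratio_percent;
}

/* Best-case seconds to write dirty_bytes sequentially, rounded up. */
static uint64_t flush_seconds(uint64_t dirty_bytes, uint64_t bytes_per_sec)
{
    return (dirty_bytes + bytes_per_sec - 1) / bytes_per_sec;
}
```

With 8 GB of RAM and a ratio of 60 the model gives a limit of 4.8 GB, and at 100 MB/s that is 48 seconds of purely sequential writeback, in line with the "50s at best" figure quoted for 5 GB. At a ratio of 5 the same model gives 0.4 GB and 4 seconds.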

This was fixed for me by the patch posted earlier by Hillf Danton so I
guess this answers what the system was (not) doing:

--- a/mm/vmscan.c Wed Sep 18 08:44:08 2013
+++ b/mm/vmscan.c Wed Sep 18 09:31:34 2013
@@ -1543,8 +1543,11 @@ shrink_inactive_list(unsigned long nr_to
  * implies that pages are cycling through the LRU faster than
  * they are written so also forcibly stall.
  */
- if (nr_unqueued_dirty == nr_taken || nr_immediate)
+ if (nr_unqueued_dirty == nr_taken || nr_immediate) {
+ if (current_is_kswapd())
+ wakeup_flusher_threads(0, WB_REASON_TRY_TO_FREE_PAGES);
  congestion_wait(BLK_RW_ASYNC, HZ/10);
+ }
  }

  /*

Also 75485363 is hopefully addressing this issue in mainline.

Thanks

Michal
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: doing lots of disk writes causes oom killer to kill processes

2013-09-17 Thread Michal Suchanek
On 5 September 2013 12:12, Michal Suchanek  wrote:
> Hello
>
> On 26 August 2013 15:51, Michal Suchanek  wrote:
>> On 12 March 2013 03:15, Hillf Danton  wrote:
>>>>On 11 March 2013 13:15, Michal Suchanek  wrote:
>>>>>On 8 February 2013 17:31, Michal Suchanek  wrote:
>>>>> Hello,
>>>>>
>>>>> I am dealing with VM disk images and performing something like wiping
>>>>> free space to prepare image for compressing and storing on server or
>>>>> copying it to external USB disk causes
>>>>>
>>>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>>>>> are 100% used by system and the machine is basically unusable
>>>>>
>>>>> 2) oom killer killing processes
>>>>>
>>>>> This all on system with 8G ram so there should be plenty space to work 
>>>>> with.
>>>>>
>>>>> This happens with kernels 3.6.4 or 3.7.1
>>>>>
>>>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>>>>> problem even with less ram.
>>>>>
>>>>> I have  vm.swappiness = 0 set for a long  time already.
>>>>>
>>>>>
>>>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>>>>kernel still causes all cores to loop somewhere in system when writing
>>>>lots of data to disk.
>>>>
>>>>With swappiness as much as 90 processes still get killed on large disk 
>>>>writes.
>>>>
>>>>Given that the max is 100 the interval in which mm works at all is
>>>>going to be very narrow, less than 10% of the parameter range. This is
>>>>a severe regression as is the cpu time consumed by the kernel.
>>>>
>>>>The io scheduler is the default cfq.
>>>>
>>>>If you have any idea what to try other than downgrading to an earlier
>>>>unaffected kernel I would like to hear.
>>>>
>>> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
>>> deadlock caused by too_many_isolated())?
>>>
>>> Or try 3.8 and/or 3.9, additionally?
>>>
>>
>> Hello,
>>
>> with deadline IO scheduler I experience this issue less often but it
>> still happens.
>>
>> I am on 3.9.6 Debian kernel so 3.8 did not fix this problem.
>>
>> Do you have some idea what to log so that useful information about the
>> lockup is gathered?
>>
>
> This appears to be fixed in vanilla 3.11 kernel.
>
> I still get short intermittent lockups and cpu usage spikes up to 20%
> on a core but nowhere near the minute+ long lockups with all cores
> 100% on earlier kernels.
>

So I did more testing on the 3.11 kernel and while it works OK with
tar you can get severe lockups with mc or kvm. The difference is
probably the fact that sane tools do fsync() on files they close
forcing the file to write out and the kernel returning possible write
errors before they move on to next file.
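
What "do fsync() on files they close" means in practice can be sketched as follows. This is an illustration using plain POSIX calls, not code from tar or any other tool mentioned here; write_file_synced() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes to path and force them to stable storage before
 * closing. Returns 0 on success or a negative errno value, so write
 * errors surface here instead of being lost in later writeback.
 */
static int write_file_synced(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int err = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -errno;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted: retry the write */
            err = -errno;
            break;
        }
        p += n;                  /* handle short writes */
        len -= (size_t)n;
    }

    if (!err && fsync(fd) < 0)   /* block until the data is written out */
        err = -errno;
    if (close(fd) < 0 && !err)
        err = -errno;
    return err;
}
```

A tool that writes each file this way never leaves more than one file's worth of dirty data behind it, which would explain why it puts less pressure on writeback than one that only calls write() and close().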

With kvm writing to a file used as virtual disk the system would stall
indefinitely until the disk driver in the emulated system would time
out, return disk IO error, and the emulated system would stop writing.
In top I see all CPU cores 90%+ in wait. System is unusable. With mc
the lockups would be indefinite, probably because there is no timeout
on writing a file in mc.

I tried tuning swappiness and elevators but the basic problem is
solved by neither: the dirty buffers fill up memory and system stalls
trying to resolve the situation.

Obviously the kernel puts off writing any dirty buffers until the
memory pressure is overwhelming and the vmm flops.

At least the OOM killer does not get invoked anymore since there is
lots of memory - just Linux does not know how to use it.

The solution to this problem is quite simple - use the ancient
userspace bdflushd or what it was called. I emulate it with
{ while true ; do sleep 5; sync ; done } &

The system performance suddenly increases - to the awesome Debian stable levels.
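
The same workaround can be written as a small C helper around sync(). This is a sketch only: periodic_sync() is a hypothetical name, and unlike the shell loop it takes a round count so that it terminates.

```c
#include <unistd.h>

/*
 * Sleep interval_sec seconds, then call sync(), repeated `rounds` times.
 * Returns the number of sync() calls made.
 */
static unsigned periodic_sync(unsigned interval_sec, unsigned rounds)
{
    unsigned done = 0;

    while (done < rounds) {
        sleep(interval_sec);
        sync();              /* flush all dirty buffers to disk */
        done++;
    }
    return done;
}
```

Calling periodic_sync(5, n) with a large n approximates the one-liner above.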

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-09-17 Thread Michal Suchanek
On 17 September 2013 23:13, Jan Kara  wrote:
>   Hello,
>
> On Tue 17-09-13 15:31:31, Michal Suchanek wrote:
>> On 5 September 2013 12:12, Michal Suchanek  wrote:
>> > On 26 August 2013 15:51, Michal Suchanek  wrote:
>> >> On 12 March 2013 03:15, Hillf Danton  wrote:
>> >>>>On 11 March 2013 13:15, Michal Suchanek  wrote:
>> >>>>>On 8 February 2013 17:31, Michal Suchanek  wrote:
>> >>>>> Hello,
>> >>>>>
>> >>>>> I am dealing with VM disk images and performing something like wiping
>> >>>>> free space to prepare image for compressing and storing on server or
>> >>>>> copying it to external USB disk causes
>> >>>>>
>> >>>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>> >>>>> are 100% used by system and the machine is basically unusable
>> >>>>>
>> >>>>> 2) oom killer killing processes
>> >>>>>
>> >>>>> This all on system with 8G ram so there should be plenty space to work 
>> >>>>> with.
>> >>>>>
>> >>>>> This happens with kernels 3.6.4 or 3.7.1
>> >>>>>
>> >>>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>> >>>>> problem even with less ram.
>> >>>>>
>> >>>>> I have  vm.swappiness = 0 set for a long  time already.
>> >>>>>
>> >>>>>
>> >>>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>> >>>>kernel still causes all cores to loop somewhere in system when writing
>> >>>>lots of data to disk.
>> >>>>
>> >>>>With swappiness as much as 90 processes still get killed on large disk 
>> >>>>writes.
>> >>>>
>> >>>>Given that the max is 100 the interval in which mm works at all is
>> >>>>going to be very narrow, less than 10% of the parameter range. This is
>> >>>>a severe regression as is the cpu time consumed by the kernel.
>> >>>>
>> >>>>The io scheduler is the default cfq.
>> >>>>
>> >>>>If you have any idea what to try other than downgrading to an earlier
>> >>>>unaffected kernel I would like to hear.
>> >>>>
>> >>> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
>> >>> deadlock caused by too_many_isolated())?
>> >>>
>> >>> Or try 3.8 and/or 3.9, additionally?
>> >>>
>> >>
>> >> Hello,
>> >>
>> >> with deadline IO scheduler I experience this issue less often but it
>> >> still happens.
>> >>
>> >> I am on 3.9.6 Debian kernel so 3.8 did not fix this problem.
>> >>
>> >> Do you have some idea what to log so that useful information about the
>> >> lockup is gathered?
>> >>
>> >
>> > This appears to be fixed in vanilla 3.11 kernel.
>> >
>> > I still get short intermittent lockups and cpu usage spikes up to 20%
>> > on a core but nowhere near the minute+ long lockups with all cores
>> > 100% on earlier kernels.
>> >
>>
>> So I did more testing on the 3.11 kernel and while it works OK with
>> tar you can get severe lockups with mc or kvm. The difference is
>> probably the fact that sane tools do fsync() on files they close
>> forcing the file to write out and the kernel returning possible write
>> errors before they move on to next file.
>   Sorry for chiming in a bit late. But is this really writing to a normal
> disk? SATA drive or something else?

It's a LVM volume on a SATA drive. I sometimes use USB disks as well
but most of the time it's SATA or eSATA.

>
>> With kvm writing to a file used as virtual disk the system would stall
>> indefinitely until the disk driver in the emulated system would time
>> out, return disk IO error, and the emulated system would stop writing.
>> In top I see all CPU cores 90%+ in wait. System is unusable. With mc
>> the lockups would be indefinite, probably because there is no timeout
>> on writing a file in mc.
>>
>> I tried tuning swappiness and elevators but the basic problem is
>> solved by neither: the dirty buffers fill up memory and system stalls
>> trying to resolve the situation.
>   This is really strange. There is /proc/sys/vm/dirty_ratio, which limits
> amount of dirty memory. By default it is set to 20% of memory which tends
> to be too much for 8 GB machine. Can you set it to something like 5% and
> /proc/sys/vm/dirty_background_ratio to 2%? That would be more appropriate
> sizing (assuming standard SATA drive). Does it change anything?

I can try that but I don't really mind if the kernel uses 2G ram for
buffers. The problem is it cannot manage those buffers. Does some
kernel structure grow out of proportion when the buffers reach this
size or something?

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-09-18 Thread Michal Suchanek
On 17 September 2013 23:13, Jan Kara  wrote:
>   Hello,
>
> On Tue 17-09-13 15:31:31, Michal Suchanek wrote:
>> On 5 September 2013 12:12, Michal Suchanek  wrote:
>> > On 26 August 2013 15:51, Michal Suchanek  wrote:
>> >> On 12 March 2013 03:15, Hillf Danton  wrote:
>> >>>>On 11 March 2013 13:15, Michal Suchanek  wrote:
>> >>>>>On 8 February 2013 17:31, Michal Suchanek  wrote:
>> >>>>> Hello,
>> >>>>>
>> >>>>> I am dealing with VM disk images and performing something like wiping
>> >>>>> free space to prepare image for compressing and storing on server or
>> >>>>> copying it to external USB disk causes
>> >>>>>
>> >>>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>> >>>>> are 100% used by system and the machine is basically unusable
>> >>>>>
>> >>>>> 2) oom killer killing processes
>> >>>>>
>> >>>>> This all on system with 8G ram so there should be plenty space to work 
>> >>>>> with.
>> >>>>>
>> >>>>> This happens with kernels 3.6.4 or 3.7.1
>> >>>>>
>> >>>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>> >>>>> problem even with less ram.
>> >>>>>
>> >>>>> I have  vm.swappiness = 0 set for a long  time already.
>> >>>>>
>> >>>>>
>> >>>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>> >>>>kernel still causes all cores to loop somewhere in system when writing
>> >>>>lots of data to disk.
>> >>>>
>> >>>>With swappiness as much as 90 processes still get killed on large disk 
>> >>>>writes.
>> >>>>
>> >>>>Given that the max is 100 the interval in which mm works at all is
>> >>>>going to be very narrow, less than 10% of the parameter range. This is
>> >>>>a severe regression as is the cpu time consumed by the kernel.
>> >>>>
>> >>>>The io scheduler is the default cfq.
>> >>>>
>> >>>>If you have any idea what to try other than downgrading to an earlier
>> >>>>unaffected kernel I would like to hear.
>> >>>>
>> >>> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
>> >>> deadlock caused by too_many_isolated())?
>> >>>
>> >>> Or try 3.8 and/or 3.9, additionally?
>> >>>
>> >>
>> >> Hello,
>> >>
>> >> with deadline IO scheduler I experience this issue less often but it
>> >> still happens.
>> >>
>> >> I am on 3.9.6 Debian kernel so 3.8 did not fix this problem.
>> >>
>> >> Do you have some idea what to log so that useful information about the
>> >> lockup is gathered?
>> >>
>> >
>> > This appears to be fixed in vanilla 3.11 kernel.
>> >
>> > I still get short intermittent lockups and cpu usage spikes up to 20%
>> > on a core but nowhere near the minute+ long lockups with all cores
>> > 100% on earlier kernels.
>> >
>>
>> So I did more testing on the 3.11 kernel and while it works OK with
>> tar you can get severe lockups with mc or kvm. The difference is
>> probably the fact that sane tools do fsync() on files they close
>> forcing the file to write out and the kernel returning possible write
>> errors before they move on to next file.
>   Sorry for chiming in a bit late. But is this really writing to a normal
> disk? SATA drive or something else?
>
>> With kvm writing to a file used as virtual disk the system would stall
>> indefinitely until the disk driver in the emulated system would time
>> out, return disk IO error, and the emulated system would stop writing.
>> In top I see all CPU cores 90%+ in wait. System is unusable. With mc
>> the lockups would be indefinite, probably because there is no timeout
>> on writing a file in mc.
>>
>> I tried tuning swappiness and elevators but the basic problem is
>> solved by neither: the dirty buffers fill up memory and system stalls
>> trying to resolve the situation.
>   This is really strange. There is /proc/sys/vm/dirty_ratio, which limits
> amount of dirty memory. By default it is set to 20% of memory which tends
> to be too much for 8 GB machine. Can you set it to something like 5% and
> /proc/sys/vm/dirty_background_ratio to 2%? That would be more appropriate
> sizing (assuming standard SATA drive). Does it change anything?

The default for dirty_ratio/dirty_background_ratio is 60/40. Setting
these to 5/2 gives about the same result as running the script that
syncs every 5s. Setting to 30/10 gives larger data chunks and
intermittent lockup before every chunk is written.

It is quite possible to set kernel parameters that kill the kernel but

1) this is the default
2) the parameter is set in units that do not prevent the issue in
general (% RAM vs #blocks)
3) WTH is the system doing? It's 4core 3GHz cpu so it can handle
traversing a structure holding 800M data in the background. Something
is seriously rotten somewhere.

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-10-15 Thread Michal Suchanek
On 9 October 2013 16:19, Michal Suchanek  wrote:
> Hello,
>
> On 19 September 2013 12:13, Jan Kara  wrote:
>> On Wed 18-09-13 16:56:08, Michal Suchanek wrote:
>>> On 17 September 2013 23:13, Jan Kara  wrote:
>>> >   Hello,
>>>
>>> The default for dirty_ratio/dirty_background_ratio is 60/40. Setting
>>   Ah, that's not upstream default. Upstream has 20/10. In SLES we use 40/10
>> to better accommodate some workloads but 60/40 on 8 GB machines with
>> SATA drive really seems too much. That is going to give memory management a
>> headache.
>>
>> The problem is that a good SATA drive can do ~100 MB/s if we are
>> lucky and IO is sequential. Thus if you have 5 GB of dirty data to write,
>> it takes 50s at best to write it, with more random IO to image file it can
>> well take several minutes to write. That may cause some increased latency
>> when memory reclaim waits for writeback to clean some pages.
>>
>>> these to 5/2 gives about the same result as running the script that
>>> syncs every 5s. Setting to 30/10 gives larger data chunks and
>>> intermittent lockup before every chunk is written.
>>>
>>> It is quite possible to set kernel parameters that kill the kernel but
>>>
>>> 1) this is the default
>>   Not upstream one so you should raise this with Debian I guess. 60/40
>> looks way out of reasonable range for today's machines.
>>
>>> 2) the parameter is set in units that do not prevent the issue in
>>> general (% RAM vs #blocks)
>>   You can set the number of bytes instead of percentage -
>> /proc/sys/vm/dirty_bytes / dirty_background_bytes. It's just that proper
>> sizing depends on amount of memory, storage HW, workload. So it's more an
>> administrative task to set this tunable properly.
>>
>>> 3) WTH is the system doing? It's 4core 3GHz cpu so it can handle
>>> traversing a structure holding 800M data in the background. Something
>>> is seriously rotten somewhere.
>>   Likely processes are waiting in direct reclaim for IO to finish. But that
>> is just guessing. Try running attached script (forgot to attach it to
>> previous email). You will need systemtap and kernel debuginfo installed.
>> The script doesn't work with all versions of systemtap (as it is sadly a
>> moving target) so if it fails, tell me your version of systemtap and I'll
>> update the script accordingly.
>
> This was fixed for me by the patch posted earlier by Hillf Danton so I
> guess this answers what the system was (not) doing:
>
> --- a/mm/vmscan.c Wed Sep 18 08:44:08 2013
> +++ b/mm/vmscan.c Wed Sep 18 09:31:34 2013
> @@ -1543,8 +1543,11 @@ shrink_inactive_list(unsigned long nr_to
>   * implies that pages are cycling through the LRU faster than
>   * they are written so also forcibly stall.
>   */
> - if (nr_unqueued_dirty == nr_taken || nr_immediate)
> + if (nr_unqueued_dirty == nr_taken || nr_immediate) {
> + if (current_is_kswapd())
> + wakeup_flusher_threads(0, WB_REASON_TRY_TO_FREE_PAGES);
>   congestion_wait(BLK_RW_ASYNC, HZ/10);
> + }
>   }
>
>   /*
>
> Also 75485363 is hopefully addressing this issue in mainline.
>

Actually, this was in 3.11 already and it did make the behaviour a bit
better but was not enough.

So is something like the vmscan.c patch going to make it into the
mainline kernel?

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-09-20 Thread Michal Suchanek
Hello,

On 19 September 2013 10:07, Hillf Danton  wrote:
> Hello Michal
>
> Take it easy please, the kernel is made by human hands.
>
> Can you please try the diff(and sorry if mail agent reformats it)?
>
> Best Regards
> Hillf
>
>
> --- a/mm/vmscan.c Wed Sep 18 08:44:08 2013
> +++ b/mm/vmscan.c Wed Sep 18 09:31:34 2013
> @@ -1543,8 +1543,11 @@ shrink_inactive_list(unsigned long nr_to
>   * implies that pages are cycling through the LRU faster than
>   * they are written so also forcibly stall.
>   */
> - if (nr_unqueued_dirty == nr_taken || nr_immediate)
> + if (nr_unqueued_dirty == nr_taken || nr_immediate) {
> + if (current_is_kswapd())
> + wakeup_flusher_threads(0, WB_REASON_TRY_TO_FREE_PAGES);
>   congestion_wait(BLK_RW_ASYNC, HZ/10);
> + }
>   }
>
>   /*
> --

I applied the patch and raised the dirty block ratios to 30/10 and the
default 60/40 while imaging a VM and did not observe any problems so I
guess this solves it.

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-09-05 Thread Michal Suchanek
Hello

On 26 August 2013 15:51, Michal Suchanek  wrote:
> On 12 March 2013 03:15, Hillf Danton  wrote:
>>>On 11 March 2013 13:15, Michal Suchanek  wrote:
>>>>On 8 February 2013 17:31, Michal Suchanek  wrote:
>>>> Hello,
>>>>
>>>> I am dealing with VM disk images and performing something like wiping
>>>> free space to prepare image for compressing and storing on server or
>>>> copying it to external USB disk causes
>>>>
>>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>>>> are 100% used by system and the machine is basically unusable
>>>>
>>>> 2) oom killer killing processes
>>>>
>>>> This all on system with 8G ram so there should be plenty space to work 
>>>> with.
>>>>
>>>> This happens with kernels 3.6.4 or 3.7.1
>>>>
>>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>>>> problem even with less ram.
>>>>
>>>> I have  vm.swappiness = 0 set for a long  time already.
>>>>
>>>>
>>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>>>kernel still causes all cores to loop somewhere in system when writing
>>>lots of data to disk.
>>>
>>>With swappiness as much as 90 processes still get killed on large disk 
>>>writes.
>>>
>>>Given that the max is 100 the interval in which mm works at all is
>>>going to be very narrow, less than 10% of the parameter range. This is
>>>a severe regression as is the cpu time consumed by the kernel.
>>>
>>>The io scheduler is the default cfq.
>>>
>>>If you have any idea what to try other than downgrading to an earlier
>>>unaffected kernel I would like to hear.
>>>
>> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
>> deadlock caused by too_many_isolated())?
>>
>> Or try 3.8 and/or 3.9, additionally?
>>
>
> Hello,
>
> with deadline IO scheduler I experience this issue less often but it
> still happens.
>
> I am on 3.9.6 Debian kernel so 3.8 did not fix this problem.
>
> Do you have some idea what to log so that useful information about the
> lockup is gathered?
>

This appears to be fixed in vanilla 3.11 kernel.

I still get short intermittent lockups and cpu usage spikes up to 20%
on a core but nowhere near the minute+ long lockups with all cores
100% on earlier kernels.

Thanks

Michal


Re: doing lots of disk writes causes oom killer to kill processes

2013-03-12 Thread Michal Suchanek
On 12 March 2013 03:15, Hillf Danton  wrote:
>>On 11 March 2013 13:15, Michal Suchanek  wrote:
>>>On 8 February 2013 17:31, Michal Suchanek  wrote:
>>> Hello,
>>>
>>> I am dealing with VM disk images and performing something like wiping
>>> free space to prepare image for compressing and storing on server or
>>> copying it to external USB disk causes
>>>
>>> 1) system lockup in order of a few tens of seconds when all CPU cores
>>> are 100% used by system and the machine is basically unusable
>>>
>>> 2) oom killer killing processes
>>>
>>> This all on system with 8G ram so there should be plenty space to work with.
>>>
>>> This happens with kernels 3.6.4 or 3.7.1
>>>
>>> With earlier kernel versions (some 3.0 or 3.2 kernels) this was not a
>>> problem even with less ram.
>>>
>>> I have  vm.swappiness = 0 set for a long  time already.
>>>
>>>
>>I did some testing with 3.7.1 and with swappiness as much as 75 the
>>kernel still causes all cores to loop somewhere in system when writing
>>lots of data to disk.
>>
>>With swappiness as much as 90 processes still get killed on large disk writes.
>>
>>Given that the max is 100 the interval in which mm works at all is
>>going to be very narrow, less than 10% of the parameter range. This is
>>a severe regression as is the cpu time consumed by the kernel.
>>
>>The io scheduler is the default cfq.
>>
>>If you have any idea what to try other than downgrading to an earlier
>>unaffected kernel I would like to hear.
>>
> Can you try commit 3cf23841b4b7(mm/vmscan.c: avoid possible
> deadlock caused by too_many_isolated())?
>
> Or try 3.8 and/or 3.9, additionally?

Hello,

in the meantime I tried setting io scheduler to deadline because I
remember using that one in my self-built kernels due to cfq breaking
some obscure block driver.

With the deadline io scheduler I can set swappiness back to 0 and the
system works normally even for moderate amount of IO - restoring disk
images from network. This would cause lockups and oom killer running
loose with the cfq scheduler.

So I guess I found what breaks the system and it is not so much the
kernel version. It's using pre-built kernels with the default
scheduler.

Thanks

Michal


Re: [linux-sunxi] Re: [RFC PATCH 0/9] mtd: nand: add sunxi NAND Flash Controller support

2014-01-29 Thread Michal Suchanek
On 13 January 2014 10:02, boris brezillon  wrote:
> Hi Henrik,
>
>
> On 11/01/2014 22:11, Henrik Nordström wrote:
>>
>>  thanks for pointing out your documents
>>  I'm trying to get the NAND driver with HW ECC (and HW RND)
>> without using DMA at all
>>
>> I tried many things but did not quite get the ECC reading command to
>> return meaningful results. But should work somehow.
>>
>>  do you have any other information I could use to do this ?
>>
>> Not really. There is no known code to look at using the nand controller
>> without DMA. All allwinner code uses DMA even the boot ROM (BROM).
>>
>>  For example, I wonder why there are 2 RAM sectors (the
>> driver I found only make use of RAM0)
>>
>> I think it's used during DMA to fetch next sector while the previous one
>> is transferred by DMA. But not sure.
>
>
> Some feedback on my tests:
>
> - I managed to get HW ECC working without any DMA transfer (using CMD = 01):
>   * I only tested the sequential ECC => ECC are stored between 2 data
>     blocks (1024 byte)
>   * Non sequential ECC should work if I store ECC bytes in the OOB area
>     too (I'll just have to send RANDOM_OUT commands to move to the OOB
>     area before sending the ECC cmd and another RANDOM_OUT to go back to
>     the DATA area)
>
> - The HW RND (randomizer) works too, I'll just have to figure out how this
>   could be mainlined:
>   * using a simple dt property to tell the controller it should enable
>     the randomizer
>   * provide an interface (like the nand_ecc_ctrl struct) for others to
>     add their own randomizer implementation (this was requested:
>     https://lkml.org/lkml/2013/12/13/154)
>
>
> The most complicated part is the boot0 partition.
>
> Tell me if I'm wrong, but here's what I understood from your work (and
> yuq's work too):
>
> boot 0 part properties:
> - uses sequential ECC
> - uses 1024 bytes ECC blocks
> - boot0 code is stored only on the first ECC block of each page (1024
>   bytes + ecc bytes)
> - boot0 code is stored on the first 64 pages of the first block
> - boot0 uses HW randomizer with a specific rnd seed (0x4a80)
>
> It's not that complicated to read/write from/to boot0, but it's a bit more
> complicated to mainline this implementation:
> - the nand chip must use the same ECC algorithm and ECC layout on the
>   whole flash (no partition specific config available)
> - you cannot mark some part of pages as unused => the nand driver will
>   write the whole page, not just the first ECC block (1024 bytes)
>
> I thought about manually creating an mtd device that fulfills these needs
> (in case we encounter the "allwinner,nandn-boot" property on a nand@X
> node), but I'm not sure this is the right approach.
>
> Any ideas ?

Maybe if varying parameters on one MTD device is not acceptable you
could export parts of the flash as different MTD devices each with its
own parameters. Since the boot0 part is fixed size this should not
really be an issue. There are existing MTD drivers that share hardware
with other devices, e.g. the MTD driver which exports part of RAM as an
MTD device.

I wonder if it would be a good idea to make it possible to use the NAND
only for storage without a boot0 area. If this is selected by a DT
parameter as suggested, changing the parameter will probably make the
NAND unreadable.

Thanks

Michal

>
>
> Best Regards,
>
> Boris
>
>>
>> Regards
>> Henrik
>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "linux-sunxi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to linux-sunxi+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.


Re: [linux-sunxi] Re: [RFC PATCH 0/9] mtd: nand: add sunxi NAND Flash Controller support

2014-01-29 Thread Michal Suchanek
On 29 January 2014 16:43, boris brezillon dev  wrote:
> Hello Michal,
>
>
> On 29/01/2014 16:11, Michal Suchanek wrote:
>>
>> On 13 January 2014 10:02, boris brezillon  wrote:
>>>
>>>
>>> boot 0 part properties:
>>> - uses sequential ECC
>>> - uses 1024 bytes ECC blocks
>>> - boot0 code is stored only on the first ECC block of each page (1024
>>> bytes
>>> + ecc bytes)
>>> - boot0 code is stored on the first 64 pages of the first block
>>> - boot0 uses HW randomizer with a specific rnd seed (0x4a80)
>>>
>>> It's not that complicated to read/write from/to boot0, but it's a bit
>>> more
>>> to mainline this
>>> implementation:
>>>   - the nand chip must use the same ECC algorithm and ECC layout on the
>>> whole
>>> flash
>>> (no partition specific config available)
>>> - you cannot mark some part of pages as unused => the nand driver will
>>> write
>>> the
>>>whole page, not just the first ECC block (1024 bytes)
>>>
>>> I thought about manually creating an mtd device that fullfils these needs
>>> (in case we
>>> encounter the "allwinner,nandn-boot" property on a nand@X node), but I'm
>>> not
>>> sure
>>> this is the right approach.
>>>
>>> Any ideas ?
>>
>> Maybe if varying parameters on one MTD device is not acceptable you
>> could export parts of the flash as different MTD devices each with its
>> own parameters. Since the boot0 part is fixed size this should not
>> really be an issue. Existing MTD drivers that share hardware with
>> other devices exist - eg. the MTD driver which exports part of RAM as
>> MTD device.
>
>
> I considered this option (exposing 2 mtd devices which use the
> same nand chip: one for the boot partition and the other one
> for the remaining space).
> I might give it a try.
>
> For the moment I'm trying to use standard partitions and then
> attach one of these partitions as a sunxi-nand-boot-interface.
> Something similar to what UBI is doing when attaching to an MTD
> device.
>
> This way we can use the NAND as a standard MTD dev and when one
> partition is attached as a sunxi-nand-boot-interface you can access
> the boot0 partition using a char dev (/dev/snbi0 ?).
> The sunxi-nand-boot-interface will provide the appropriate abstraction
> to hide the specific boot0 layout...
>
> What do you think ?

If it works with MTD, sure.

The problem the two devices avoid is that with uniform parameters
across the whole MTD device the boot0 partition is invalid.

>
>
>>
>> I wonder if it would be good idea to make it possible to use the NAND
>> only for storage without a boot0 area. If this is selected by a DT
>> parameter as suggested changing the parameter will probably make the
>> NAND unreadable.
>
> Actually the NAND controller supports up to 8 chips. I guess only the
> first one can be used as a boot device.
> Reserving space for the boot partition on all of these chips is kind of
> useless.

This actually depends on the BROM.

I did not read the BROM code so I don't know what it does.

> Moreover, we can't tell if the user wants to boot from the NAND or
> from another storage (MMC for example), in this case we don't need
> to expose the boot0 partition.

It's possible to use the NAND only for storage, sure.

However, a NAND on which the boot0 area is reserved would be unreadable
without reserving the boot0 area in the driver, right?

The best we can tell is whether the user asked to reserve the area in
the DT. It might be possible to verify the boot0 area the same way the
BROM does when booting from it. This might be a nice option when you
don't know what you have on the chip and want to read it, but most of
the time you will want to enforce a bootable or non-bootable format
when writing the NAND.

Thanks

Michal
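
The boot0 layout quoted above (only the first 1024-byte ECC block of each
page is used, on the first 64 pages of the first erase block) can be
sketched as an address translation. This is only an illustration: the names
boot0_pos and boot0_locate are hypothetical and do not come from any
existing driver, and ECC bytes and randomizer handling are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the boot0 description in this thread. */
#define BOOT0_ECC_BLOCK_SIZE 1024u /* payload bytes used in each page */
#define BOOT0_PAGES          64u   /* pages used in the first erase block */

/* Hypothetical helper type: where a boot0 byte lives on the NAND. */
struct boot0_pos {
	unsigned int page;   /* page index within the first erase block */
	unsigned int column; /* byte offset inside that page's first ECC block */
};

/*
 * Translate a byte offset inside the boot0 image into a page/column pair.
 * Returns false when the offset lies beyond the 64 KiB boot0 capacity.
 */
bool boot0_locate(size_t offset, struct boot0_pos *pos)
{
	if (offset >= (size_t)BOOT0_ECC_BLOCK_SIZE * BOOT0_PAGES)
		return false;

	pos->page = (unsigned int)(offset / BOOT0_ECC_BLOCK_SIZE);
	pos->column = (unsigned int)(offset % BOOT0_ECC_BLOCK_SIZE);
	return true;
}
```

With this layout boot0 can hold at most 64 * 1024 bytes, and the rest of
each page is unused, which is why a driver that always writes whole pages
with one uniform ECC layout cannot express it directly.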


Re: [linux-sunxi] [PATCH 00/10] net: stmmac: Add sun7i GMAC glue layer

2013-12-06 Thread Michal Suchanek
On 6 December 2013 18:29, Chen-Yu Tsai  wrote:
> Hi,
>
> This patch series adds Allwinner sun7i support to stmmac.
> The Allwinner sun7i SoC A20 integrates an early version of
> dwmac IP from Synopsys. On top of that is a hardware glue
> layer. This layer needs to be configured before the dwmac
> can be used.

...

> Comments?
>
> Thanks,
>
> wens
>
>
> Chen-Yu Tsai (10):
>   net: stmmac: Enable stmmac main clock when probing hardware
>   net: stmmac: Honor DT parameter to force DMA store and forward mode
>   net: stmmac: Use platform data tied with compatible strings
>   net: stmmac: sunxi platfrom extensions for GMAC in Allwinner A20 SoC's
>   ARM: dts: sun7i: Add GMAC controller node to sun7i DTSI
>   ARM: dts: sun7i: Add pin muxing options for the GMAC
>   ARM: dts: sun7i: cubietruck: Enable the GMAC
>   ARM: dts: sun7i: cubieboard2: Enable GMAC instead of EMAC
>   ARM: dts: sun7i: olinuxino-micro: Enable GMAC instead of EMAC
>   ARM: dts: sun7i: Add ethernet alias for GMAC

Tested-By: Michal Suchanek 

Works for me with RGMII and MII phy on top of 3.13rc3.

Thanks

Michal


Re: [linux-sunxi] Re: [PATCH 3/3] ARM: sunxi: dts: Add ahci support to a few A10 and A20 boards

2013-12-07 Thread Michal Suchanek
On 7 December 2013 12:47, Olliver Schinagl  wrote:
> Hey maxime,
>
> On 06-12-13 19:33, Maxime Ripard wrote:
>>
>> Hi Oliver,
>>
>> On Wed, Dec 04, 2013 at 01:10:55PM +0100, oli...@schinagl.nl wrote:
>>>
>>> From: Oliver Schinagl 
>>>
>>> This patch adds sunxi sata support to A10 and A20 boards that have such
>>> a connector. Some boards also feature a regulator via a GPIO and support
>>> for this is also added.
>>>
>>> Signed-off-by: Olliver Schinagl 
>>
>>
>> Your git setup seems to be pretty uncertain about how your first name is
>> spelled :)
>
> I should have formally mentioned it, to confuse fewer people.
>
> This is how my name is officially spelled (I left out any 'middle' letters).
> I never really used it as such, as it confuses people and they always write
> it wrong anyway. After years I decided that at least on these patches I
> should write it down properly (googleability etc. in the future). So formally
> it's Olliver 'oliver' M. Schinagl.
>
> And no, I won't share my middle name :p
>
> There! :)
>
>>
>>> ---
>>>   arch/arm/boot/dts/sun4i-a10-cubieboard.dts  | 26
>>> +
>>>   arch/arm/boot/dts/sun4i-a10.dtsi|  9 +
>>>   arch/arm/boot/dts/sun7i-a20-cubieboard2.dts | 26
>>> +
>>>   arch/arm/boot/dts/sun7i-a20-cubietruck.dts  | 26
>>> +
>>>   arch/arm/boot/dts/sun7i-a20-olinuxino-micro.dts | 26
>>> +
>>>   arch/arm/boot/dts/sun7i-a20.dtsi|  9 +
>>>   6 files changed, 122 insertions(+)
>>
>>
>> Could you split this into several patches please?
>
> Yes, apologies, will take care of this! Sorry,
>
> Oliver
>
>>
>> At least one per SoC.
>>
>>> diff --git a/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
>>> b/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
>>> index 425a7db..b620084 100644
>>> --- a/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
>>> +++ b/arch/arm/boot/dts/sun4i-a10-cubieboard.dts
>>> @@ -42,7 +42,18 @@
>>> };
>>> };
>>>
>>> +   sata: ahci@01c18000 {
>>> +   pwr-supply = <&reg_ahci_5v>;
>>> +   status = "okay";
>>> +   };
>>> +
>>> pinctrl@01c20800 {
>>> +   ahci_pwr_pin: ahci_pwr_pin@0 {
>>
>>
>> Please prefix it with name of the board.
>>
>>> +   allwinner,pins = "PB8";
>>> +   allwinner,function = "gpio_out";
>>> +   allwinner,driver = <0>;
>>> +   allwinner,pull = <0>;
>>> +   };
>>
>>
>> Please add a newline here.
>>
>>> led_pins_cubieboard: led_pins@0 {
>>> allwinner,pins = "PH20", "PH21";
>>> allwinner,function = "gpio_out";
>>> @@ -86,4 +97,19 @@
>>> linux,default-trigger = "heartbeat";
>>> };
>>> };
>>> +
>>> +   regulators {
>>> +   compatible = "simple-bus";
>>> +   pinctrl-names = "default";
>>> +
>>> +   reg_ahci_5v: ahci-5v {
>>> +   compatible = "regulator-fixed";
>>> +   regulator-name = "ahci-5v";
>>> +   regulator-min-microvolt = <5000000>;
>>> +   regulator-max-microvolt = <5000000>;
>>> +   pinctrl-0 = <&ahci_pwr_pin>;
>>> +   gpio = < 1 8 0>;
>>> +   enable-active-high;
>>> +   };
>>> +   };
>>>   };
>>> diff --git a/arch/arm/boot/dts/sun4i-a10.dtsi
>>> b/arch/arm/boot/dts/sun4i-a10.dtsi
>>> index 4dccdb0..53c6cdb 100644
>>> --- a/arch/arm/boot/dts/sun4i-a10.dtsi
>>> +++ b/arch/arm/boot/dts/sun4i-a10.dtsi
>>> @@ -306,6 +306,15 @@
>>> #size-cells = <0>;
>>> };
>>>
>>> +   sata: ahci@01c18000 {
>>> +   compatible = "allwinner,sun4i-a10-ahci";
>>
>>
>> Please use sun4i-ahci for consistency.
>>
>>> +   reg = <0x01c18000 0x1000>;
>>> +   interrupts = <0 56 1>;
>>
>>
>> The interrupt here doesn't seem right. Is it actually working at all?
>>
>>> +   clocks = <_gates 25>, < 0>;
>>> +   clock-names = "ahb_sata", "pll6_sata";
>>> +   status = "disabled";
>>> +   };
>>> +
>>> intc: interrupt-controller@01c20400 {
>>> compatible = "allwinner,sun4i-ic";
>>> reg = <0x01c20400 0x400>;
>>> diff --git a/arch/arm/boot/dts/sun7i-a20-cubieboard2.dts
>>> b/arch/arm/boot/dts/sun7i-a20-cubieboard2.dts
>>> index 5c51cb8..99c5e78 100644
>>> --- a/arch/arm/boot/dts/sun7i-a20-cubieboard2.dts
>>> +++ b/arch/arm/boot/dts/sun7i-a20-cubieboard2.dts
>>> @@ -34,7 +34,18 @@
>>> };
>>> };
>>>

Re: doing lots of disk writes causes oom killer to kill processes

2014-07-07 Thread Michal Suchanek
On 9 October 2013 16:19, Michal Suchanek  wrote:
> Hello,
>
> On 19 September 2013 12:13, Jan Kara  wrote:
>> On Wed 18-09-13 16:56:08, Michal Suchanek wrote:
>>> On 17 September 2013 23:13, Jan Kara  wrote:
>>> >   Hello,
>>>
>>> The default for dirty_ratio/dirty_background_ratio is 60/40. Setting
>>   Ah, that's not the upstream default. Upstream has 20/10. In SLES we use 40/10
>> to better accommodate some workloads but 60/40 on 8 GB machines with
>> SATA drive really seems too much. That is going to give memory management a
>> headache.
>>
>> The problem is that a good SATA drive can do ~100 MB/s if we are
>> lucky and IO is sequential. Thus if you have 5 GB of dirty data to write,
>> it takes 50s at best to write it, with more random IO to image file it can
>> well take several minutes to write. That may cause some increased latency
>> when memory reclaim waits for writeback to clean some pages.
>>
>>> these to 5/2 gives about the same result as running the script that
>>> syncs every 5s. Setting to 30/10 gives larger data chunks and
>>> intermittent lockup before every chunk is written.
>>>
>>> It is quite possible to set kernel parameters that kill the kernel but
>>>
>>> 1) this is the default
>>   Not upstream one so you should raise this with Debian I guess. 60/40
>> looks way out of reasonable range for todays machines.
>>
>>> 2) the parameter is set in units that do not prevent the issue in
>>> general (% RAM vs #blocks)
>>   You can set the number of bytes instead of percentage -
>> /proc/sys/vm/dirty_bytes / dirty_background_bytes. It's just that proper
>> sizing depends on amount of memory, storage HW, workload. So it's more an
>> administrative task to set this tunable properly.
>>
>>> 3) WTH is the system doing? It's 4core 3GHz cpu so it can handle
>>> traversing a structure holding 800M data in the background. Something
>>> is seriously rotten somewhere.
>>   Likely processes are waiting in direct reclaim for IO to finish. But that
>> is just guessing. Try running attached script (forgot to attach it to
>> previous email). You will need systemtap and kernel debuginfo installed.
>> The script doesn't work with all versions of systemtap (as it is sadly a
>> moving target) so if it fails, tell me your version of systemtap and I'll
>> update the script accordingly.
>
> This was fixed for me by the patch posted earlier by Hillf Danton so I
> guess this answers what the system was (not) doing:
>
> --- a/mm/vmscan.c Wed Sep 18 08:44:08 2013
> +++ b/mm/vmscan.c Wed Sep 18 09:31:34 2013
> @@ -1543,8 +1543,11 @@ shrink_inactive_list(unsigned long nr_to
>   * implies that pages are cycling through the LRU faster than
>   * they are written so also forcibly stall.
>   */
> - if (nr_unqueued_dirty == nr_taken || nr_immediate)
> + if (nr_unqueued_dirty == nr_taken || nr_immediate) {
> + if (current_is_kswapd())
> + wakeup_flusher_threads(0, WB_REASON_TRY_TO_FREE_PAGES);
>   congestion_wait(BLK_RW_ASYNC, HZ/10);
> + }
>   }
>
>   /*
>

Hello,

Is this being addressed somehow?

It seems the 3.15 kernel still has this issue, unless it happens to
lock up for some other reason in similar situations.

Thanks

Michal
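
As a rough illustration of the arithmetic quoted above (several GB of dirty
data against a disk doing about 100 MB/s), here is a minimal sketch. It
assumes the limit is simply dirty_ratio percent of total RAM, whereas the
kernel applies the ratio to available memory, so the result is only an
upper-bound estimate. The function names are made up for this example.

```c
#include <stdint.h>

/*
 * Approximate number of bytes that may be dirty before writers are
 * throttled, assuming the limit is dirty_ratio percent of total RAM.
 */
uint64_t dirty_limit_bytes(uint64_t ram_bytes, unsigned int dirty_ratio)
{
	return ram_bytes / 100 * dirty_ratio;
}

/*
 * Best-case time in seconds (rounded up) to write back that much data at
 * a given sequential throughput. Random IO will take considerably longer.
 */
uint64_t flush_seconds(uint64_t dirty_bytes, uint64_t bytes_per_sec)
{
	return (dirty_bytes + bytes_per_sec - 1) / bytes_per_sec;
}
```

For an 8 GB machine and a 100 MB/s disk this gives 48 seconds at a ratio of
60 and 16 seconds at the upstream ratio of 20, which matches the order of
magnitude discussed in the thread.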


[PATCH RFC 8/8] powerpc/64: barrier_nospec: Add commandline trigger

2018-03-13 Thread Michal Suchanek
Copypasta from rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/setup_64.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4b67b7b877d9..257f0e6be107 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -840,6 +840,14 @@ static int __init handle_no_pti(char *p)
 }
 early_param("nopti", handle_no_pti);
 
+static int __init handle_no_nospec(char *p)
+{
+   pr_info("barrier_nospec: disabled on command line.");
+   no_nospec = true;
+   return 0;
+}
+early_param("no_nospec", handle_no_nospec);
+
 static void do_nothing(void *unused)
 {
/*
-- 
2.13.6
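
The effect of the new parameter can be modelled in user space as below. This
is a sketch under stated assumptions, not the kernel's parameter parser: the
helper names are hypothetical, parameters are treated as plain
space-separated words, and the combination rule follows
setup_barrier_nospec() from this series, where the barrier is only enabled
when no_nospec was not given.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if param appears in cmdline as a whole space-separated word. */
bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t len = strlen(param);
	const char *p = cmdline;

	if (len == 0)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		bool starts_word = (p == cmdline) || (p[-1] == ' ');
		bool ends_word = (p[len] == '\0') || (p[len] == ' ');

		if (starts_word && ends_word)
			return true;
		p += len;
	}
	return false;
}

/*
 * Model of the boot-time decision: the firmware may ask for the barrier,
 * but "no_nospec" on the command line vetoes it.
 */
bool barrier_enabled_at_boot(const char *cmdline, bool firmware_wants_barrier)
{
	bool no_nospec = cmdline_has_param(cmdline, "no_nospec");

	return !no_nospec && firmware_wants_barrier;
}
```

The real early_param() machinery does more than this word match, so the
sketch only shows how the flag feeds into the enable decision.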



[PATCH RFC 7/8] powerpc/64s: barrier_nospec: Add hcall trigger

2018-03-13 Thread Michal Suchanek
Copypasta from rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/platforms/pseries/setup.c | 38 ++
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 1a527625acf7..b779ddb8e250 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,38 +459,50 @@ static void __init find_and_init_phbs(void)
of_pci_check_probe_only();
 }
 
-static void pseries_setup_rfi_flush(void)
+static void pseries_setup_rfi_nospec(void)
 {
struct h_cpu_char_result result;
-   enum l1d_flush_type types;
-   bool enable;
+   enum l1d_flush_type flush_types;
+   enum spec_barrier_type barrier_type;
+   bool flush_enable;
+   bool barrier_enable;
long rc;
 
/* Enable by default */
-   enable = true;
+   flush_enable = true;
+   barrier_enable = true;
+   /* no fallback if the firmware does not tell us */
+   barrier_type = SPEC_BARRIER_NONE;
 
rc = plpar_get_cpu_characteristics(&result);
if (rc == H_SUCCESS) {
-   types = L1D_FLUSH_NONE;
+   flush_types = L1D_FLUSH_NONE;
 
if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
-   types |= L1D_FLUSH_MTTRIG;
+   flush_types |= L1D_FLUSH_MTTRIG;
if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
-   types |= L1D_FLUSH_ORI;
+   flush_types |= L1D_FLUSH_ORI;
+   if (result.character & H_CPU_CHAR_SPEC_BAR_ORI31)
+   barrier_type |= SPEC_BARRIER_ORI;
 
/* Use fallback if nothing set in hcall */
-   if (types == L1D_FLUSH_NONE)
-   types = L1D_FLUSH_FALLBACK;
+   if (flush_types == L1D_FLUSH_NONE)
+   flush_types = L1D_FLUSH_FALLBACK;
 
if ((!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) ||
(!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)))
-   enable = false;
+   flush_enable = false;
+
+   if ((!(result.behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR)) ||
+   (!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)))
+   barrier_enable = false;
} else {
/* Default to fallback if case hcall is not available */
-   types = L1D_FLUSH_FALLBACK;
+   flush_types = L1D_FLUSH_FALLBACK;
}
 
-   setup_rfi_flush(types, enable);
+   setup_barrier_nospec(barrier_type, barrier_enable);
+   setup_rfi_flush(flush_types, flush_enable);
 }
 
 #ifdef CONFIG_PCI_IOV
@@ -666,7 +678,7 @@ static void __init pSeries_setup_arch(void)
 
fwnmi_init();
 
-   pseries_setup_rfi_flush();
+   pseries_setup_rfi_nospec();
 
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
-- 
2.13.6
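
The decision logic added to pseries_setup_rfi_nospec() can be condensed into
a pure function, which makes the defaults easier to see. The sketch below
mirrors the patch, but the bit positions are placeholders chosen for this
example only (the real H_CPU_CHAR_* and H_CPU_BEHAV_* values live in the
kernel headers), and choose_barrier and barrier_choice are hypothetical
names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the real hcall values. */
#define CHAR_SPEC_BAR_ORI31     (1ull << 0)
#define BEHAV_FAVOUR_SECURITY   (1ull << 1)
#define BEHAV_BNDS_CHK_SPEC_BAR (1ull << 2)

/* Bit flags, as in the series. */
enum spec_barrier_type {
	SPEC_BARRIER_NONE = 0x1,
	SPEC_BARRIER_ORI  = 0x2,
};

struct barrier_choice {
	unsigned int type; /* OR of spec_barrier_type flags */
	bool enable;
};

struct barrier_choice choose_barrier(bool hcall_ok, uint64_t character,
				     uint64_t behaviour)
{
	/* Enabled by default, with no fallback barrier type. */
	struct barrier_choice c = { SPEC_BARRIER_NONE, true };

	if (!hcall_ok)
		return c;

	if (character & CHAR_SPEC_BAR_ORI31)
		c.type |= SPEC_BARRIER_ORI;

	/* Firmware must both request the barrier and favour security. */
	if (!(behaviour & BEHAV_BNDS_CHK_SPEC_BAR) ||
	    !(behaviour & BEHAV_FAVOUR_SECURITY))
		c.enable = false;

	return c;
}
```

Because SPEC_BARRIER_NONE is itself a flag (0x1), a firmware that reports
the ori barrier ends up with type 0x3 rather than 0x2, so callers have to
test for SPEC_BARRIER_ORI with a mask, as setup_barrier_nospec() does.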



[PATCH RFC 3/8] powerpc/64: Use barrier_nospec in syscall entry

2018-03-13 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/entry_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2cb5109a7ea3..7bfc4cf48af2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #ifdef CONFIG_PPC_BOOK3S
 #include 
@@ -159,6 +160,7 @@ system_call:			/* label this so stack traces look sane */
andi.   r11,r10,_TIF_SYSCALL_DOTRACE
bne .Lsyscall_dotrace   /* does not return */
cmpldi  0,r0,NR_syscalls
+   barrier_nospec
bge-.Lsyscall_enosys
 
 .Lsyscall:
@@ -319,6 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld  r10,TI_FLAGS(r10)
 
cmpldi  r0,NR_syscalls
+   barrier_nospec
blt+.Lsyscall
 
/* Return code is already in r3 thanks to do_syscall_trace_enter() */
-- 
2.13.6
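
In C terms, the pattern this patch protects is a bounds check followed by an
indexed load from the syscall table. The sketch below is a user-space model,
not the kernel code: the table, the two dummy handlers and dispatch() are
invented for illustration, the inline asm assumes GCC or Clang, and outside
powerpc64 the macro degrades to a plain compiler barrier. In the assembly
above the barrier sits between the compare and the conditional branch, which
C cannot express exactly, so here it is placed after the check and before
the table access.

```c
#include <errno.h>
#include <stddef.h>

#if defined(__powerpc64__)
/* Same instruction the series uses as a speculation barrier. */
#define barrier_nospec() __asm__ __volatile__("ori 31,31,0" ::: "memory")
#else
/* Stand-in for other architectures: compiler barrier only. */
#define barrier_nospec() __asm__ __volatile__("" ::: "memory")
#endif

typedef long (*syscall_fn)(long arg);

/* Dummy handlers standing in for real system calls. */
static long sys_double(long arg) { return 2 * arg; }
static long sys_negate(long arg) { return -arg; }

static const syscall_fn syscall_table[] = { sys_double, sys_negate };
#define NR_SYSCALLS (sizeof(syscall_table) / sizeof(syscall_table[0]))

long dispatch(unsigned long nr, long arg)
{
	if (nr >= NR_SYSCALLS)
		return -ENOSYS;

	/* Do not let the table load run ahead of the bounds check. */
	barrier_nospec();

	return syscall_table[nr](arg);
}
```

The intent is that a mispredicted bounds check cannot be followed by a
speculative load from an attacker-chosen table index.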



[PATCH RFC 6/8] powerpc/64: barrier_nospec: Add debugfs trigger

2018-03-13 Thread Michal Suchanek
Copypasta from rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/setup_64.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index d1d9f047161e..4b67b7b877d9 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -955,6 +955,41 @@ static __init int rfi_flush_debugfs_init(void)
return 0;
 }
 device_initcall(rfi_flush_debugfs_init);
+
+static int barrier_nospec_set(void *data, u64 val)
+{
+   switch (val) {
+   case 0:
+   case 1:
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (!!val == !!barrier_nospec_enabled)
+   return 0;
+
+   barrier_nospec_enable(!!val);
+
+   return 0;
+}
+
+static int barrier_nospec_get(void *data, u64 *val)
+{
+   *val = barrier_nospec_enabled ? 1 : 0;
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec,
+   barrier_nospec_get, barrier_nospec_set, "%llu\n");
+
+static __init int barrier_nospec_debugfs_init(void)
+{
+   debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL,
+   &fops_barrier_nospec);
+   return 0;
+}
+device_initcall(barrier_nospec_debugfs_init);
 #endif
 
 ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
-- 
2.13.6
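
The behaviour of the debugfs setter can be modelled in user space as
follows. This is a sketch: the model_* names and the patch counter are
hypothetical and exist only to make the "do nothing if the state is
unchanged" path observable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

static bool barrier_enabled;     /* models barrier_nospec_enabled */
static unsigned int patch_count; /* times the code would be re-patched */

/* Stand-in for barrier_nospec_enable(): record state, count the patching. */
static void model_barrier_enable(bool enable)
{
	barrier_enabled = enable;
	patch_count++;
}

/* Mirrors barrier_nospec_set(): accept only 0 or 1, skip no-op writes. */
int model_barrier_set(uint64_t val)
{
	if (val > 1)
		return -EINVAL;

	if ((val != 0) == barrier_enabled)
		return 0;

	model_barrier_enable(val != 0);
	return 0;
}

/* Mirrors barrier_nospec_get(). */
uint64_t model_barrier_get(void)
{
	return barrier_enabled ? 1 : 0;
}

unsigned int model_patch_count(void)
{
	return patch_count;
}
```

Writing the current value again returns success without re-patching, so
repeated writes of the same value to the debugfs file are cheap.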



[PATCH RFC 5/8] powerpc/64: Patch barrier_nospec in modules

2018-03-13 Thread Michal Suchanek
Copypasta from lwsync patching.

Note that, unlike RFI, which is patched only in the kernel, the nospec state
in a module reflects the settings at the time the module was loaded.

Iterating all modules and re-patching every time the settings change is
not implemented.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/setup.h  |  5 -
 arch/powerpc/kernel/module.c  |  6 ++
 arch/powerpc/kernel/setup_64.c|  4 ++--
 arch/powerpc/lib/feature-fixups.c | 17 ++---
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 486d02e4a310..7e3a41248810 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -58,7 +58,10 @@ enum spec_barrier_type {
 void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
 void do_rfi_flush_fixups(enum l1d_flush_type types);
 void __init setup_barrier_nospec(enum spec_barrier_type, bool enable);
-void do_barrier_nospec_fixups(enum spec_barrier_type type);
+void do_barrier_nospec_fixups_kernel(enum spec_barrier_type type);
+void do_barrier_nospec_fixups(enum spec_barrier_type type,
+ void *start, void *end);
+extern enum spec_barrier_type powerpc_barrier_nospec;
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 3f7ba0f5bf29..7b6d0ec06a21 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -72,6 +72,12 @@ int module_finalize(const Elf_Ehdr *hdr,
do_feature_fixups(powerpc_firmware_features,
  (void *)sect->sh_addr,
  (void *)sect->sh_addr + sect->sh_size);
+
+   sect = find_section(hdr, sechdrs, "__spec_barrier_fixup");
+   if (sect != NULL)
+   do_barrier_nospec_fixups(powerpc_barrier_nospec,
+ (void *)sect->sh_addr,
+ (void *)sect->sh_addr + sect->sh_size);
 #endif
 
sect = find_section(hdr, sechdrs, "__lwsync_fixup");
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 09f21a954bfc..d1d9f047161e 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -909,11 +909,11 @@ void barrier_nospec_enable(bool enable)
 
if (enable) {
powerpc_barrier_nospec = barrier_nospec_type;
-   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   do_barrier_nospec_fixups_kernel(powerpc_barrier_nospec);
on_each_cpu(do_nothing, NULL, 1);
} else {
powerpc_barrier_nospec = SPEC_BARRIER_NONE;
-   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   do_barrier_nospec_fixups_kernel(powerpc_barrier_nospec);
}
 }
 
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 000e153184ad..b59ebc2215e8 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -156,14 +156,15 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
 }
 
-void do_barrier_nospec_fixups(enum spec_barrier_type type)
+void do_barrier_nospec_fixups(enum spec_barrier_type type,
+ void *fixup_start, void *fixup_end)
 {
unsigned int instr, *dest;
long *start, *end;
int i;
 
-   start = PTRRELOC(&__start___spec_barrier_fixup),
-   end = PTRRELOC(&__stop___spec_barrier_fixup);
+   start = fixup_start;
+   end = fixup_end;
 
instr = 0x60000000; /* nop */
 
@@ -182,6 +183,16 @@ void do_barrier_nospec_fixups(enum spec_barrier_type type)
printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
 }
 
+void do_barrier_nospec_fixups_kernel(enum spec_barrier_type type)
+{
+   void *start, *end;
+
+   start = PTRRELOC(&__start___spec_barrier_fixup),
+   end = PTRRELOC(&__stop___spec_barrier_fixup);
+
+   do_barrier_nospec_fixups(type, start, end);
+}
+
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
-- 
2.13.6
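
The reason the same routine can serve both the kernel image and a module is
that each fixup entry is self-relative: it stores the distance from the
entry itself to the instruction to patch, so only the bounds of the table
need to be passed in. Below is a user-space model of that idea. It is a
sketch with assumptions: the names are hypothetical, instructions are
written directly rather than through the kernel's patching helper, and the
two encodings are the standard ones for "ori 0,0,0" and "ori 31,31,0".

```c
#include <stdint.h>

#define INSN_NOP         0x60000000u /* ori 0,0,0 */
#define INSN_ORI_31_31_0 0x63ff0000u /* ori 31,31,0 */

/* Record a patch site in a table slot as an offset relative to the slot. */
void record_site(intptr_t *entry, uint32_t *site)
{
	*entry = (intptr_t)site - (intptr_t)entry;
}

/*
 * Walk the table [start, end) and overwrite every recorded site with
 * either the barrier instruction or a nop. Returns the number of sites
 * patched.
 */
int patch_barriers(intptr_t *start, intptr_t *end, int use_ori)
{
	uint32_t insn = use_ori ? INSN_ORI_31_31_0 : INSN_NOP;
	int i = 0;

	for (intptr_t *entry = start; entry < end; entry++, i++) {
		uint32_t *dest = (uint32_t *)((intptr_t)entry + *entry);

		*dest = insn;
	}
	return i;
}
```

A module's __spec_barrier_fixup section can be handed to the same walker
with its own start and end addresses, which is what the module_finalize()
hunk above does with the section header fields.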



[PATCH RFC 1/8] powerpc: Add barrier_nospec

2018-03-13 Thread Michal Suchanek
Copypasta from original gmb() and rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/barrier.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index 10daa1d56e0a..8e47b3abe405 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -75,6 +75,15 @@ do { \
___p1;  \
 })
 
+/* TODO: add patching so this can be disabled */
+/* Prevent speculative execution past this barrier. */
+#define barrier_nospec_asm ori 31,31,0
+#ifdef __ASSEMBLY__
+#define barrier_nospec barrier_nospec_asm
+#else
+#define barrier_nospec() __asm__ __volatile__ (stringify_in_c(barrier_nospec_asm) : : :)
+#endif
+
 #include 
 
 #endif /* _ASM_POWERPC_BARRIER_H */
-- 
2.13.6



[PATCH RFC 4/8] powerpc/64s: Add support for ori barrier_nospec

2018-03-13 Thread Michal Suchanek
Copypasta from rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/barrier.h|  4 ++--
 arch/powerpc/include/asm/feature-fixups.h |  9 +
 arch/powerpc/include/asm/setup.h  |  8 
 arch/powerpc/kernel/setup_64.c| 29 +
 arch/powerpc/kernel/vmlinux.lds.S |  7 +++
 arch/powerpc/lib/feature-fixups.c | 27 +++
 6 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index 8e47b3abe405..4079a95e84c2 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -75,9 +75,9 @@ do { \
___p1;  \
 })
 
-/* TODO: add patching so this can be disabled */
 /* Prevent speculative execution past this barrier. */
-#define barrier_nospec_asm ori 31,31,0
+#define barrier_nospec_asm SPEC_BARRIER_FIXUP_SECTION; \
+   nop
 #ifdef __ASSEMBLY__
 #define barrier_nospec barrier_nospec_asm
 #else
diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h
index 1e82eb3caabd..9d3382618ffd 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -195,11 +195,20 @@ label##3: \
FTR_ENTRY_OFFSET 951b-952b; \
.popsection;
 
+#define SPEC_BARRIER_FIXUP_SECTION \
+953:   \
+   .pushsection __spec_barrier_fixup,"a";  \
+   .align 2;   \
+954:   \
+   FTR_ENTRY_OFFSET 953b-954b; \
+   .popsection;
+
 
 #ifndef __ASSEMBLY__
 #include 
 
 extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+extern long __start___spec_barrier_fixup, __stop___spec_barrier_fixup;
 
 void apply_feature_fixups(void);
 void setup_feature_keys(void);
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 469b7fdc9be4..486d02e4a310 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -49,8 +49,16 @@ enum l1d_flush_type {
L1D_FLUSH_MTTRIG= 0x8,
 };
 
+/* These are bit flags */
+enum spec_barrier_type {
+   SPEC_BARRIER_NONE   = 0x1,
+   SPEC_BARRIER_ORI= 0x2,
+};
+
 void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
 void do_rfi_flush_fixups(enum l1d_flush_type types);
+void __init setup_barrier_nospec(enum spec_barrier_type, bool enable);
+void do_barrier_nospec_fixups(enum spec_barrier_type type);
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index c388cc3357fa..09f21a954bfc 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -815,6 +815,10 @@ static enum l1d_flush_type enabled_flush_types;
 static void *l1d_flush_fallback_area;
 static bool no_rfi_flush;
 bool rfi_flush;
+enum spec_barrier_type powerpc_barrier_nospec;
+static enum spec_barrier_type barrier_nospec_type;
+static bool no_nospec;
+bool barrier_nospec_enabled;
 
 static int __init handle_no_rfi_flush(char *p)
 {
@@ -899,6 +903,31 @@ void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
rfi_flush_enable(enable);
 }
 
+void barrier_nospec_enable(bool enable)
+{
+   barrier_nospec_enabled = enable;
+
+   if (enable) {
+   powerpc_barrier_nospec = barrier_nospec_type;
+   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   on_each_cpu(do_nothing, NULL, 1);
+   } else {
+   powerpc_barrier_nospec = SPEC_BARRIER_NONE;
+   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   }
+}
+
+void __init setup_barrier_nospec(enum spec_barrier_type type, bool enable)
+{
+   if (type & SPEC_BARRIER_ORI)
+   pr_info("barrier_nospec: Using ori type flush\n");
+
+   barrier_nospec_type = type;
+
+   if (!no_nospec)
+   barrier_nospec_enable(enable);
+}
+
 #ifdef CONFIG_DEBUG_FS
 static int rfi_flush_set(void *data, u64 val)
 {
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index c8af90ff49f0..744b58ff77f1 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -139,6 +139,13 @@ SECTIONS
*(__rfi_flush_fixup)
__stop___rfi_flush_fixup = .;
}
+
+   . = ALIGN(8);
+   __spec_barrier_fixup : AT(ADDR(__spec_barrier_fixup) - LOAD_OFFSET) {
+   __start___spec_barrier_fixup = .;
+   *(__spec_barrier_fixup)
+   __s

[PATCH RFC 0/8] powerpc barrier_nospec

2018-03-13 Thread Michal Suchanek
Hello,

this is a patchset adding barrier_nospec on powerpc. It is based on the
out-of-tree gmb() patch and the existing rfi patches.

I do not have the tests for the Spectre/Meltdown issues available so this is
untested.

Feedback on the general approach as well as on actual effectiveness is welcome.

Thanks

Michal


Michal Suchanek (8):
  powerpc: Add barrier_nospec
  powerpc: Use barrier_nospec in copy_from_user
  powerpc/64: Use barrier_nospec in syscall entry
  powerpc/64s: Add support for ori barrier_nospec
  powerpc/64: Patch barrier_nospec in modules
  powerpc/64: barrier_nospec: Add debugfs trigger
  powerpc/64s: barrier_nospec: Add hcall trigger
  powerpc/64: barrier_nospec: Add commandline trigger

 arch/powerpc/include/asm/barrier.h|  9 
 arch/powerpc/include/asm/feature-fixups.h |  9 
 arch/powerpc/include/asm/setup.h  | 11 +
 arch/powerpc/include/asm/uaccess.h| 11 -
 arch/powerpc/kernel/entry_64.S|  3 ++
 arch/powerpc/kernel/module.c  |  6 +++
 arch/powerpc/kernel/setup_64.c| 72 +++
 arch/powerpc/kernel/vmlinux.lds.S |  7 +++
 arch/powerpc/lib/feature-fixups.c | 38 
 arch/powerpc/platforms/pseries/setup.c| 38 ++--
 10 files changed, 190 insertions(+), 14 deletions(-)

-- 
2.13.6



[PATCH RFC 2/8] powerpc: Use barrier_nospec in copy_from_user

2018-03-13 Thread Michal Suchanek
Copypasta from x86.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/uaccess.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 51bfeb8777f0..af9b0e731f46 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -248,6 +248,7 @@ do {
\
__chk_user_ptr(ptr);\
if (!is_kernel_addr((unsigned long)__gu_addr))  \
might_fault();  \
+   barrier_nospec();   \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__typeof__(*(ptr)))__gu_val; \
__gu_err;   \
@@ -258,8 +259,10 @@ do {   
\
long __gu_err = -EFAULT;\
unsigned long  __gu_val = 0;\
const __typeof__(*(ptr)) __user *__gu_addr = (ptr); \
+   int can_access = access_ok(VERIFY_READ, __gu_addr, (size)); \
might_fault();  \
-   if (access_ok(VERIFY_READ, __gu_addr, (size)))  \
+   barrier_nospec();   \
+   if (can_access) \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; 
\
__gu_err;   \
@@ -271,6 +274,7 @@ do {
\
unsigned long __gu_val; \
const __typeof__(*(ptr)) __user *__gu_addr = (ptr); \
__chk_user_ptr(ptr);\
+   barrier_nospec();   \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err;   \
@@ -298,15 +302,19 @@ static inline unsigned long raw_copy_from_user(void *to,
 
switch (n) {
case 1:
+   barrier_nospec();
__get_user_size(*(u8 *)to, from, 1, ret);
break;
case 2:
+   barrier_nospec();
__get_user_size(*(u16 *)to, from, 2, ret);
break;
case 4:
+   barrier_nospec();
__get_user_size(*(u32 *)to, from, 4, ret);
break;
case 8:
+   barrier_nospec();
__get_user_size(*(u64 *)to, from, 8, ret);
break;
}
@@ -314,6 +322,7 @@ static inline unsigned long raw_copy_from_user(void *to,
return 0;
}
 
+   barrier_nospec();
return __copy_tofrom_user((__force void __user *)to, from, n);
 }
 
-- 
2.13.6



[PATCH 1/2] mmc: bcm2835: reset host on timeout

2018-02-14 Thread Michal Suchanek
The bcm2835 mmc host tends to lock up for an unknown reason, so reset it on
timeout. The upper mmc block layer tries retransmitting with single
blocks, which tends to work out after a long wait.

This is better than giving up and leaving the machine broken for no
obvious reason.

Signed-off-by: Michal Suchanek 
---
 drivers/mmc/host/bcm2835.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/mmc/host/bcm2835.c b/drivers/mmc/host/bcm2835.c
index 229dc18f0581..ce05fe72f865 100644
--- a/drivers/mmc/host/bcm2835.c
+++ b/drivers/mmc/host/bcm2835.c
@@ -286,6 +286,7 @@ static void bcm2835_reset(struct mmc_host *mmc)
 
if (host->dma_chan)
dmaengine_terminate_sync(host->dma_chan);
+   host->dma_chan = NULL;
bcm2835_reset_internal(host);
 }
 
@@ -837,6 +838,8 @@ static void bcm2835_timeout(struct work_struct *work)
dev_err(dev, "timeout waiting for hardware interrupt.\n");
bcm2835_dumpregs(host);
 
+   bcm2835_reset(host->mmc);
+
if (host->data) {
host->data->error = -ETIMEDOUT;
bcm2835_finish_data(host);
-- 
2.13.6



[PATCH 2/2] mmc: bcm2835: print some informational messages during reset

2018-02-14 Thread Michal Suchanek
The previous patch resets the host on hardware error, so make the reset
progress more visible.

Signed-off-by: Michal Suchanek 
---
 drivers/mmc/host/bcm2835.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/bcm2835.c b/drivers/mmc/host/bcm2835.c
index ce05fe72f865..4dde8b2b62a9 100644
--- a/drivers/mmc/host/bcm2835.c
+++ b/drivers/mmc/host/bcm2835.c
@@ -283,10 +283,14 @@ static void bcm2835_reset_internal(struct bcm2835_host 
*host)
 static void bcm2835_reset(struct mmc_host *mmc)
 {
struct bcm2835_host *host = mmc_priv(mmc);
+   struct device *dev = &host->pdev->dev;
 
-   if (host->dma_chan)
+   if (host->dma_chan) {
+   dev_info(dev, "tearing down dma");
dmaengine_terminate_sync(host->dma_chan);
+   }
host->dma_chan = NULL;
+   dev_info(dev, "resetting");
bcm2835_reset_internal(host);
 }
 
-- 
2.13.6



[PATCH RFC rebase 7/9] powerpc/64: barrier_nospec: Add debugfs trigger

2018-03-15 Thread Michal Suchanek
Copypasta from rfi implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/setup_64.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f60e0e3b5ad2..f6678a7b6114 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -963,6 +963,41 @@ static __init int rfi_flush_debugfs_init(void)
return 0;
 }
 device_initcall(rfi_flush_debugfs_init);
+
+static int barrier_nospec_set(void *data, u64 val)
+{
+   switch (val) {
+   case 0:
+   case 1:
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (!!val == !!barrier_nospec_enabled)
+   return 0;
+
+   barrier_nospec_enable(!!val);
+
+   return 0;
+}
+
+static int barrier_nospec_get(void *data, u64 *val)
+{
+   *val = barrier_nospec_enabled ? 1 : 0;
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec,
+   barrier_nospec_get, barrier_nospec_set, "%llu\n");
+
+static __init int barrier_nospec_debugfs_init(void)
+{
+   debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL,
+   &fops_barrier_nospec);
+   return 0;
+}
+device_initcall(barrier_nospec_debugfs_init);
 #endif
 
 ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, 
char *buf)
-- 
2.13.6



[PATCH RFC rebase 8/9] powerpc/64s: barrier_nospec: Add hcall triggerr

2018-03-15 Thread Michal Suchanek
Adapted from the RFI implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/platforms/pseries/mobility.c |  2 +-
 arch/powerpc/platforms/pseries/pseries.h  |  2 +-
 arch/powerpc/platforms/pseries/setup.c| 37 ++-
 3 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 8a8033a249c7..9d506be1580e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -349,7 +349,7 @@ void post_mobility_fixup(void)
"failed: %d\n", rc);
 
/* Possibly switch to a new RFI flush type */
-   pseries_setup_rfi_flush();
+   pseries_setup_rfi_nospec();
 
return;
 }
diff --git a/arch/powerpc/platforms/pseries/pseries.h 
b/arch/powerpc/platforms/pseries/pseries.h
index 27cdcb69fd18..d49670c67686 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -100,6 +100,6 @@ static inline unsigned long cmo_get_page_size(void)
 
 int dlpar_workqueue_init(void);
 
-void pseries_setup_rfi_flush(void);
+void pseries_setup_rfi_nospec(void);
 
 #endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 9877c3dfcdc8..4b899a4db6dd 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,30 +459,47 @@ static void __init find_and_init_phbs(void)
of_pci_check_probe_only();
 }
 
-void pseries_setup_rfi_flush(void)
+void pseries_setup_rfi_nospec(void)
 {
struct h_cpu_char_result result;
-   enum l1d_flush_type types;
-   bool enable;
+   enum l1d_flush_type flush_types;
+   enum spec_barrier_type barrier_type;
+   bool flush_enable;
+   bool barrier_enable;
long rc;
 
/* Enable by default */
-   enable = true;
-   types = L1D_FLUSH_FALLBACK;
+   flush_enable = true;
+   flush_types = L1D_FLUSH_FALLBACK;
+   barrier_enable = true;
+   /* no fallback available if the firmware does not tell us */
+   barrier_type = SPEC_BARRIER_NONE;
 
rc = plpar_get_cpu_characteristics(&result);
if (rc == H_SUCCESS) {
if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
-   types |= L1D_FLUSH_MTTRIG;
+   flush_types |= L1D_FLUSH_MTTRIG;
if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
-   types |= L1D_FLUSH_ORI;
+   flush_types |= L1D_FLUSH_ORI;
+   if (result.character & H_CPU_CHAR_SPEC_BAR_ORI31)
+   barrier_type |= SPEC_BARRIER_ORI;
 
if ((!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) ||
(!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)))
-   enable = false;
+   flush_enable = false;
+   /*
+* Do not check H_CPU_BEHAV_BNDS_CHK_SPEC_BAR - the ORI does
+* nothing anyway when not supported.
+*/
+   if ((!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)))
+   barrier_enable = false;
+   } else {
+   /* Default to fallback if case hcall is not available */
+   flush_types = L1D_FLUSH_FALLBACK;
}
 
-   setup_rfi_flush(types, enable);
+   setup_barrier_nospec(barrier_type, barrier_enable);
+   setup_rfi_flush(flush_types, flush_enable);
 }
 
 #ifdef CONFIG_PCI_IOV
@@ -658,7 +675,7 @@ static void __init pSeries_setup_arch(void)
 
fwnmi_init();
 
-   pseries_setup_rfi_flush();
+   pseries_setup_rfi_nospec();
 
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
-- 
2.13.6
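
The decision made from the hcall result can be summarised in a small sketch. The bit values and the names `choose_nospec`, `cpu_char_result` and `nospec_choice` below are assumptions made for illustration; the real H_CPU_CHAR_* and H_CPU_BEHAV_* constants are defined by the platform headers and are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define CHAR_SPEC_BAR_ORI31   (UINT64_C(1) << 0)
#define BEHAV_FAVOUR_SECURITY (UINT64_C(1) << 1)

struct cpu_char_result {
	uint64_t character;
	uint64_t behaviour;
};

struct nospec_choice {
	bool ori_supported;	/* firmware advertises the ori 31,31,0 barrier */
	bool enable;		/* barrier should be patched in */
};

/*
 * hcall_ok models rc == H_SUCCESS. When the hcall fails, the defaults
 * are kept: barrier enabled, no barrier type reported by firmware.
 */
static struct nospec_choice choose_nospec(bool hcall_ok,
					  const struct cpu_char_result *res)
{
	struct nospec_choice c = { .ori_supported = false, .enable = true };

	if (!hcall_ok)
		return c;
	if (res->character & CHAR_SPEC_BAR_ORI31)
		c.ori_supported = true;
	if (!(res->behaviour & BEHAV_FAVOUR_SECURITY))
		c.enable = false;
	return c;
}
```

As in the patch, only the "favour security" behaviour bit can turn the barrier off in this sketch; a failed hcall leaves it enabled.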



[PATCH RFC rebase 6/9] powerpc/64: Patch barrier_nospec in modules

2018-03-15 Thread Michal Suchanek
Note that, unlike RFI, which is patched only in the kernel, the nospec state
of a module reflects the settings at the time the module was loaded.

Iterating all modules and re-patching every time the settings change is
not implemented.

Based on lwsync patching.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/setup.h  |  5 -
 arch/powerpc/kernel/module.c  |  6 ++
 arch/powerpc/kernel/setup_64.c|  4 ++--
 arch/powerpc/lib/feature-fixups.c | 17 ++---
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index c7e9e66c2a38..92520d2483b8 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -58,7 +58,10 @@ enum spec_barrier_type {
 void setup_rfi_flush(enum l1d_flush_type, bool enable);
 void do_rfi_flush_fixups(enum l1d_flush_type types);
 void setup_barrier_nospec(enum spec_barrier_type, bool enable);
-void do_barrier_nospec_fixups(enum spec_barrier_type type);
+void do_barrier_nospec_fixups_kernel(enum spec_barrier_type type);
+void do_barrier_nospec_fixups(enum spec_barrier_type type,
+ void *start, void *end);
+extern enum spec_barrier_type powerpc_barrier_nospec;
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 3f7ba0f5bf29..7b6d0ec06a21 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -72,6 +72,12 @@ int module_finalize(const Elf_Ehdr *hdr,
do_feature_fixups(powerpc_firmware_features,
  (void *)sect->sh_addr,
  (void *)sect->sh_addr + sect->sh_size);
+
+   sect = find_section(hdr, sechdrs, "__spec_barrier_fixup");
+   if (sect != NULL)
+   do_barrier_nospec_fixups(powerpc_barrier_nospec,
+ (void *)sect->sh_addr,
+ (void *)sect->sh_addr + sect->sh_size);
 #endif
 
sect = find_section(hdr, sechdrs, "__lwsync_fixup");
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 767240074cad..f60e0e3b5ad2 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -910,11 +910,11 @@ void barrier_nospec_enable(bool enable)
 
if (enable) {
powerpc_barrier_nospec = barrier_nospec_type;
-   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   do_barrier_nospec_fixups_kernel(powerpc_barrier_nospec);
on_each_cpu(do_nothing, NULL, 1);
} else {
powerpc_barrier_nospec = SPEC_BARRIER_NONE;
-   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   do_barrier_nospec_fixups_kernel(powerpc_barrier_nospec);
}
 }
 
diff --git a/arch/powerpc/lib/feature-fixups.c 
b/arch/powerpc/lib/feature-fixups.c
index dfeb7feeccef..a529ac6b2a5d 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -160,14 +160,15 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
: "unknown");
 }
 
-void do_barrier_nospec_fixups(enum spec_barrier_type type)
+void do_barrier_nospec_fixups(enum spec_barrier_type type,
+ void *fixup_start, void *fixup_end)
 {
unsigned int instr, *dest;
long *start, *end;
int i;
 
-   start = PTRRELOC(&__start___spec_barrier_fixup),
-   end = PTRRELOC(&__stop___spec_barrier_fixup);
+   start = fixup_start;
+   end = fixup_end;
 
instr = 0x60000000; /* nop */
 
@@ -186,6 +187,16 @@ void do_barrier_nospec_fixups(enum spec_barrier_type type)
printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
 }
 
+void do_barrier_nospec_fixups_kernel(enum spec_barrier_type type)
+{
+   void *start, *end;
+
+   start = PTRRELOC(&__start___spec_barrier_fixup),
+   end = PTRRELOC(&__stop___spec_barrier_fixup);
+
+   do_barrier_nospec_fixups(type, start, end);
+}
+
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
-- 
2.13.6
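
The fixup walk that this patch parameterises can be illustrated in userspace. The sketch below assumes, as the SPEC_BARRIER_FIXUP_SECTION macro suggests, that each table entry holds the distance from the entry itself to the instruction to patch. `struct image`, `record_site` and `patch_sites` are hypothetical names, and a plain store stands in for patch_instruction().

```c
#include <stddef.h>
#include <stdint.h>

#define INSN_NOP      UINT32_C(0x60000000) /* ori 0,0,0 */
#define INSN_SPEC_BAR UINT32_C(0x63ff0000) /* ori 31,31,0 */

/*
 * Toy image: code and fixup table live in one object so that the
 * self-relative offsets stored in the table stay valid.
 */
struct image {
	uint32_t code[8];
	long fixups[2];
};

/* Record that code[idx] is a barrier site (the assembler macro's job). */
static void record_site(struct image *img, size_t slot, size_t idx)
{
	img->fixups[slot] = (long)((char *)&img->code[idx] -
				   (char *)&img->fixups[slot]);
}

/* Walk [start, end) and rewrite every recorded site; returns the count. */
static int patch_sites(long *start, long *end, uint32_t insn)
{
	int n = 0;

	for (; start < end; start++, n++) {
		uint32_t *dest = (uint32_t *)((char *)start + *start);

		*dest = insn;
	}
	return n;
}
```

Because the walk only needs a start and an end pointer, the same routine can serve the kernel's own table and the table found in a module's `__spec_barrier_fixup` section, which is the point of the split into two functions.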



[PATCH RFC rebase 5/9] powerpc/64s: Add support for ori barrier_nospec patching

2018-03-15 Thread Michal Suchanek
Based on the RFI patching. This is required to be able to disable the
speculation barrier.

Only one barrier type is supported, and it does nothing when the firmware
does not enable it. Also, re-patching modules is not supported, so the
only meaningful thing that can be done is patching out the speculation
barrier at boot when the user says it is not wanted.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/barrier.h|  4 ++--
 arch/powerpc/include/asm/feature-fixups.h |  9 +
 arch/powerpc/include/asm/setup.h  |  8 
 arch/powerpc/kernel/setup_64.c| 30 ++
 arch/powerpc/kernel/vmlinux.lds.S |  7 +++
 arch/powerpc/lib/feature-fixups.c | 27 +++
 6 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 8e47b3abe405..4079a95e84c2 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -75,9 +75,9 @@ do {  
\
___p1;  \
 })
 
-/* TODO: add patching so this can be disabled */
 /* Prevent speculative execution past this barrier. */
-#define barrier_nospec_asm ori 31,31,0
+#define barrier_nospec_asm SPEC_BARRIER_FIXUP_SECTION; \
+   nop
 #ifdef __ASSEMBLY__
 #define barrier_nospec barrier_nospec_asm
 #else
diff --git a/arch/powerpc/include/asm/feature-fixups.h 
b/arch/powerpc/include/asm/feature-fixups.h
index 1e82eb3caabd..9d3382618ffd 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -195,11 +195,20 @@ label##3: \
FTR_ENTRY_OFFSET 951b-952b; \
.popsection;
 
+#define SPEC_BARRIER_FIXUP_SECTION \
+953:   \
+   .pushsection __spec_barrier_fixup,"a";  \
+   .align 2;   \
+954:   \
+   FTR_ENTRY_OFFSET 953b-954b; \
+   .popsection;
+
 
 #ifndef __ASSEMBLY__
 #include 
 
 extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+extern long __start___spec_barrier_fixup, __stop___spec_barrier_fixup;
 
 void apply_feature_fixups(void);
 void setup_feature_keys(void);
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index bbcdf929be54..c7e9e66c2a38 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -49,8 +49,16 @@ enum l1d_flush_type {
L1D_FLUSH_MTTRIG= 0x8,
 };
 
+/* These are bit flags */
+enum spec_barrier_type {
+   SPEC_BARRIER_NONE   = 0x1,
+   SPEC_BARRIER_ORI= 0x2,
+};
+
 void setup_rfi_flush(enum l1d_flush_type, bool enable);
 void do_rfi_flush_fixups(enum l1d_flush_type types);
+void setup_barrier_nospec(enum spec_barrier_type, bool enable);
+void do_barrier_nospec_fixups(enum spec_barrier_type type);
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 4ec4a27b36a9..767240074cad 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -815,6 +815,10 @@ static enum l1d_flush_type enabled_flush_types;
 static void *l1d_flush_fallback_area;
 static bool no_rfi_flush;
 bool rfi_flush;
+enum spec_barrier_type powerpc_barrier_nospec;
+static enum spec_barrier_type barrier_nospec_type;
+static bool no_nospec;
+bool barrier_nospec_enabled;
 
 static int __init handle_no_rfi_flush(char *p)
 {
@@ -900,6 +904,32 @@ void setup_rfi_flush(enum l1d_flush_type types, bool 
enable)
rfi_flush_enable(enable);
 }
 
+void barrier_nospec_enable(bool enable)
+{
+   barrier_nospec_enabled = enable;
+
+   if (enable) {
+   powerpc_barrier_nospec = barrier_nospec_type;
+   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   on_each_cpu(do_nothing, NULL, 1);
+   } else {
+   powerpc_barrier_nospec = SPEC_BARRIER_NONE;
+   do_barrier_nospec_fixups(powerpc_barrier_nospec);
+   }
+}
+
+void setup_barrier_nospec(enum spec_barrier_type type, bool enable)
+{
+   /*
+* Only one barrier type is supported and it does nothing when the
+* firmware does not enable it. So the only meaningful thing to do
+* here is check the user preference.
+*/
+   barrier_nospec_type = SPEC_BARRIER_ORI;
+
+   barrier_nospec_enable(!no_nospec && enable);
+}
+
 #ifdef CONFIG_DEBUG_FS
 static int rfi_flush_set(void *data, u64 val)
 {
diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
b/arch/powerpc/kernel/vmlinux.lds.S
index c8af90ff49f0..744b58ff77f1 100644
--- a/arch/powerpc/kernel/v

[PATCH RFC rebase 9/9] powerpc/64: barrier_nospec: Add commandline trigger

2018-03-15 Thread Michal Suchanek
Add commandline options spectre_v2 and nospectre_v2.

These are named the same as the similar x86 options, regardless of actual
effect, so as not to require platform-specific configuration.

Supported options:
nospectre_v2 or spectre_v2=off - speculation barrier not used
spectre_v2=on or spectre_v2=auto - speculation barrier used

Changing the settings after boot is not supported and VM migration may
change requirements, so auto is the same as on.

Based on s390 implementation

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/setup_64.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index f6678a7b6114..c74e656265df 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -840,6 +840,28 @@ static int __init handle_no_pti(char *p)
 }
 early_param("nopti", handle_no_pti);
 
+static int __init nospectre_v2_setup_early(char *str)
+{
+   no_nospec = true;
+   return 0;
+}
+early_param("nospectre_v2", nospectre_v2_setup_early);
+
+static int __init spectre_v2_setup_early(char *str)
+{
+   if (str && !strncmp(str, "on", 2))
+   no_nospec = false;
+
+   if (str && !strncmp(str, "off", 3))
+   no_nospec = true;
+
+   if (str && !strncmp(str, "auto", 4))
+   no_nospec = false;
+
+   return 0;
+}
+early_param("spectre_v2", spectre_v2_setup_early);
+
 static void do_nothing(void *unused)
 {
/*
-- 
2.13.6
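
The option handling described in this patch boils down to a single flag. A minimal userspace sketch follows, assuming the parameter value is passed in without the `spectre_v2=` prefix; `parse_spectre_v2` and `parse_nospectre_v2` are hypothetical names for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the kernel's no_nospec flag; false means barrier in use. */
static bool no_nospec;

/* Models the nospectre_v2 early parameter. */
static void parse_nospectre_v2(void)
{
	no_nospec = true;
}

/* Models the spectre_v2= early parameter; str is the text after '='. */
static void parse_spectre_v2(const char *str)
{
	if (!str)
		return;
	if (!strncmp(str, "on", 2) || !strncmp(str, "auto", 4))
		no_nospec = false;
	else if (!strncmp(str, "off", 3))
		no_nospec = true;
}
```

Like the patch, the sketch compares prefixes only, so a value such as "only" would be treated as "on"; unknown values leave the flag unchanged.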



[PATCH RFC rebase 1/9] powerpc: Add barrier_nospec

2018-03-15 Thread Michal Suchanek
When the firmware supports it, an otherwise useless combination of ORI
instruction arguments is interpreted as a speculation barrier. Implement
barrier_nospec using this instruction.

Based on the out-of-tree gmb() implementation.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/barrier.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 10daa1d56e0a..8e47b3abe405 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -75,6 +75,15 @@ do { 
\
___p1;  \
 })
 
+/* TODO: add patching so this can be disabled */
+/* Prevent speculative execution past this barrier. */
+#define barrier_nospec_asm ori 31,31,0
+#ifdef __ASSEMBLY__
+#define barrier_nospec barrier_nospec_asm
+#else
+#define barrier_nospec() __asm__ __volatile__ (stringify_in_c(barrier_nospec_asm) : : :)
+#endif
+
 #include 
 
 #endif /* _ASM_POWERPC_BARRIER_H */
-- 
2.13.6
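
To make the "otherwise useless combination of ORI arguments" concrete, the sketch below encodes `ori RA,RS,UI` as a 32-bit instruction word (D-form, primary opcode 24). `ppc_ori` is a hypothetical helper written for illustration, not a kernel function.

```c
#include <stdint.h>

/* Encode PowerPC "ori RA,RS,UI" (D-form, primary opcode 24). */
static uint32_t ppc_ori(unsigned int ra, unsigned int rs, unsigned int ui)
{
	return (UINT32_C(24) << 26) |
	       ((uint32_t)(rs & 31) << 21) |
	       ((uint32_t)(ra & 31) << 16) |
	       (uint32_t)(ui & 0xffff);
}
```

With this encoding, `ppc_ori(0, 0, 0)` gives 0x60000000 (the usual nop) and `ppc_ori(31, 31, 0)` gives 0x63ff0000. Architecturally, ORing a register with zero into itself changes nothing, which is why the form is free to carry a hint.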



[PATCH RFC rebase 4/9] powerpc/64s: Use barrier_nospec in RFI_FLUSH_SLOT

2018-03-15 Thread Michal Suchanek
The RFI flush support patches the speculation barrier into
RFI_FLUSH_SLOT as part of the RFI flush. Use separate barrier_nospec
instead.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/exception-64s.h | 2 +-
 arch/powerpc/lib/feature-fixups.c| 9 +++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 471b2274fbeb..bb5a3052b29b 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -81,9 +81,9 @@
  * L1-D cache when returning to userspace or a guest.
  */
 #define RFI_FLUSH_SLOT \
+   barrier_nospec_asm; \
RFI_FLUSH_FIXUP_SECTION;\
nop;\
-   nop;\
nop
 
 #define RFI_TO_KERNEL  \
diff --git a/arch/powerpc/lib/feature-fixups.c 
b/arch/powerpc/lib/feature-fixups.c
index 35f80ab7cbd8..4cc2f0c5c863 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -119,7 +119,7 @@ void do_feature_fixups(unsigned long value, void 
*fixup_start, void *fixup_end)
 #ifdef CONFIG_PPC_BOOK3S_64
 void do_rfi_flush_fixups(enum l1d_flush_type types)
 {
-   unsigned int instrs[3], *dest;
+   unsigned int instrs[2], *dest;
long *start, *end;
int i;
 
@@ -128,15 +128,13 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
 
instrs[0] = 0x60000000; /* nop */
instrs[1] = 0x60000000; /* nop */
-   instrs[2] = 0x60000000; /* nop */
 
if (types & L1D_FLUSH_FALLBACK)
-   /* b .+16 to fallback flush */
-   instrs[0] = 0x48000010;
+   /* b .+12 to fallback flush */
+   instrs[0] = 0x4800000c;
 
i = 0;
if (types & L1D_FLUSH_ORI) {
-   instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
}
 
@@ -150,7 +148,6 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
 
patch_instruction(dest, instrs[0]);
patch_instruction(dest + 1, instrs[1]);
-   patch_instruction(dest + 2, instrs[2]);
}
 
printk(KERN_DEBUG "rfi-flush: patched %d locations (%s flush)\n", i,
-- 
2.13.6
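
Since the patched slot is one instruction shorter after this change, the relative branch to the fallback flush shrinks from 16 to 12 bytes. The sketch below shows how such a branch is encoded (I-form, primary opcode 18, AA=0, LK=0); `ppc_b_rel` is a hypothetical helper written for illustration.

```c
#include <stdint.h>

/* Encode a relative PowerPC branch "b .+offset" (I-form, opcode 18). */
static uint32_t ppc_b_rel(int32_t offset)
{
	/* The low two bits hold AA and LK, both zero for a plain "b". */
	return (UINT32_C(18) << 26) |
	       ((uint32_t)offset & UINT32_C(0x03fffffc));
}
```

Here `ppc_b_rel(16)` yields 0x48000010 and `ppc_b_rel(12)` yields 0x4800000c, matching the two comments in the hunk above.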



[PATCH RFC rebase 3/9] powerpc/64: Use barrier_nospec in syscall entry

2018-03-15 Thread Michal Suchanek
On powerpc syscall entry is done in assembly so patch in an explicit
barrier_nospec.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/entry_64.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2cb5109a7ea3..7bfc4cf48af2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #ifdef CONFIG_PPC_BOOK3S
 #include 
@@ -159,6 +160,7 @@ system_call:/* label this so stack 
traces look sane */
andi.   r11,r10,_TIF_SYSCALL_DOTRACE
bne .Lsyscall_dotrace   /* does not return */
cmpldi  0,r0,NR_syscalls
+   barrier_nospec
bge-.Lsyscall_enosys
 
 .Lsyscall:
@@ -319,6 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld  r10,TI_FLAGS(r10)
 
cmpldi  r0,NR_syscalls
+   barrier_nospec
blt+.Lsyscall
 
/* Return code is already in r3 thanks to do_syscall_trace_enter() */
-- 
2.13.6



[PATCH RFC rebase 2/9] powerpc: Use barrier_nospec in copy_from_user

2018-03-15 Thread Michal Suchanek
This is based on the x86 patch doing the same.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/uaccess.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 51bfeb8777f0..af9b0e731f46 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -248,6 +248,7 @@ do {
\
__chk_user_ptr(ptr);\
if (!is_kernel_addr((unsigned long)__gu_addr))  \
might_fault();  \
+   barrier_nospec();   \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__typeof__(*(ptr)))__gu_val; \
__gu_err;   \
@@ -258,8 +259,10 @@ do {   
\
long __gu_err = -EFAULT;\
unsigned long  __gu_val = 0;\
const __typeof__(*(ptr)) __user *__gu_addr = (ptr); \
+   int can_access = access_ok(VERIFY_READ, __gu_addr, (size)); \
might_fault();  \
-   if (access_ok(VERIFY_READ, __gu_addr, (size)))  \
+   barrier_nospec();   \
+   if (can_access) \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; 
\
__gu_err;   \
@@ -271,6 +274,7 @@ do {
\
unsigned long __gu_val; \
const __typeof__(*(ptr)) __user *__gu_addr = (ptr); \
__chk_user_ptr(ptr);\
+   barrier_nospec();   \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err;   \
@@ -298,15 +302,19 @@ static inline unsigned long raw_copy_from_user(void *to,
 
switch (n) {
case 1:
+   barrier_nospec();
__get_user_size(*(u8 *)to, from, 1, ret);
break;
case 2:
+   barrier_nospec();
__get_user_size(*(u16 *)to, from, 2, ret);
break;
case 4:
+   barrier_nospec();
__get_user_size(*(u32 *)to, from, 4, ret);
break;
case 8:
+   barrier_nospec();
__get_user_size(*(u64 *)to, from, 8, ret);
break;
}
@@ -314,6 +322,7 @@ static inline unsigned long raw_copy_from_user(void *to,
return 0;
}
 
+   barrier_nospec();
return __copy_tofrom_user((__force void __user *)to, from, n);
 }
 
-- 
2.13.6
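
The ordering this patch establishes is: compute the access check, issue the barrier, then touch the data. A minimal userspace sketch of that ordering follows. It assumes GCC or Clang inline asm syntax, and the barrier macro here is only a compiler barrier that marks the position; it does not stop CPU speculation. `range_ok`, `checked_copy` and `region` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for barrier_nospec(). A compiler barrier only: it does NOT
 * stop CPU speculation and merely marks where the real barrier goes.
 */
#define barrier_nospec_sketch() __asm__ __volatile__("" : : : "memory")

#define REGION_SIZE 64
static unsigned char region[REGION_SIZE]; /* hypothetical "user" region */

/* Hypothetical range check playing the role of access_ok(). */
static int range_ok(size_t off, size_t len)
{
	return off <= REGION_SIZE && len <= REGION_SIZE - off;
}

/* Returns 0 on success, -1 if the range is rejected (dst untouched). */
static int checked_copy(void *dst, size_t off, size_t len)
{
	int can_access = range_ok(off, len);

	barrier_nospec_sketch();	/* between the check and the access */
	if (!can_access)
		return -1;
	memcpy(dst, region + off, len);
	return 0;
}
```

Evaluating the check into a local before the barrier mirrors the `can_access` variable introduced in the second hunk of the patch.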



[PATCH] powerpc/xmon: really enable xmon when a breakpoint is set

2018-05-21 Thread Michal Suchanek
When single-stepping kernel code from xmon without a debug hook enabled,
the kernel crashes. This can happen when the kernel starts with xmon on
crash disabled but xmon is entered using sysrq.

Commit e1368d0c9edb ("powerpc/xmon: Setup debugger hooks when first
break-point is set") adds force_enable_xmon function that prints
"xmon: Enabling debugger hooks" but does not enable them.

Add the call to xmon_init to install the debugger hooks in
force_enable_xmon and also call force_enable_xmon when single-stepping
in xmon.

Fixes: e1368d0c9edb ("powerpc/xmon: Setup debugger hooks when first
break-point is set")

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/xmon/xmon.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index a0842f1ff72c..504bd1c3d8b0 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -179,6 +179,9 @@ static const char *getvecname(unsigned long vec);
 
 static int do_spu_cmd(void);
 
+static void xmon_init(int enable);
+static inline void force_enable_xmon(void);
+
 #ifdef CONFIG_44x
 static void dump_tlb_44x(void);
 #endif
@@ -1094,6 +1097,7 @@ static int do_step(struct pt_regs *regs)
unsigned int instr;
int stepped;
 
+   force_enable_xmon();
/* check we are in 64-bit kernel mode, translation enabled */
if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
if (mread(regs->nip, &instr, 4) == 4) {
@@ -1275,6 +1279,7 @@ static inline void force_enable_xmon(void)
if (!xmon_on) {
printf("xmon: Enabling debugger hooks\n");
xmon_on = 1;
+   xmon_init(1);
}
 }
 
-- 
2.13.6



[PATCH v7 3/4] lib/cmdline.c Remove quotes symmetrically.

2017-08-17 Thread Michal Suchanek
Remove quotes from the argument value only if there is a quote on both sides.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 6 ++
 lib/cmdline.c| 7 ++-
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index a1614d9b8a21..d7da4ce9f7ae 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -489,10 +489,8 @@ static void __init fadump_update_params(struct param_info 
*param_info,
*tgt++ = ' ';
 
/* next_arg removes one leading and one trailing '"' */
-   if (*tgt == '"')
-   shortening += 1;
-   if (*(tgt + vallen + shortening) == '"')
-   shortening += 1;
+   if ((*tgt == '"') && (*(tgt + vallen + shortening) == '"'))
+   shortening += 2;
 
/* remove one leading and one trailing quote if both are present */
if ((val[0] == '"') && (val[vallen - 1] == '"')) {
diff --git a/lib/cmdline.c b/lib/cmdline.c
index 4c0888c4a68d..01e701b2afe8 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -227,14 +227,11 @@ char *next_arg(char *args, char **param, char **val)
*val = args + equals + 1;
 
/* Don't include quotes in value. */
-   if (**val == '"') {
+   if ((**val == '"') && (args[i-1] == '"')) {
(*val)++;
-   if (args[i-1] == '"')
-   args[i-1] = '\0';
+   args[i-1] = '\0';
}
}
-   if (quoted && args[i-1] == '"')
-   args[i-1] = '\0';
 
if (args[i]) {
args[i] = '\0';
-- 
2.10.2
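
The symmetric rule from this patch can be shown in isolation: quotes are dropped only when the value both starts and ends with one. The helper below is a hypothetical userspace sketch, not the next_arg() code itself.

```c
#include <string.h>

/*
 * Strip one leading and one trailing '"' from val, in place, but only
 * when both are present. Returns a pointer to the resulting value.
 */
static char *strip_quotes_symmetric(char *val)
{
	size_t len = strlen(val);

	if (len >= 2 && val[0] == '"' && val[len - 1] == '"') {
		val[len - 1] = '\0';
		return val + 1;
	}
	return val;
}
```

A value with a quote on only one side is returned unchanged, so a caller that needs to know how much the string was shortened only has to consider two cases: zero characters or two.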



[PATCH v7 1/4] powerpc/fadump: reduce memory consumption for capture kernel

2017-08-17 Thread Michal Suchanek
From: Hari Bathini 

With fadump (dump capture) kernel booting like a regular kernel, it needs
almost the same amount of memory to boot as the production kernel, which is
unwarranted for a dump capture kernel. But with no option to disable some
of the unnecessary subsystems in fadump kernel, that much memory is wasted
on fadump, depriving the production kernel of that memory.

Introduce kernel parameter 'fadump_extra_args=' that would take regular
parameters as a space separated quoted string, to be enforced when fadump
is active. This 'fadump_extra_args=' parameter can be leveraged to pass
parameters like nr_cpus=1, cgroup_disable=memory and numa=off, to disable
unwarranted resources/subsystems.

Also, ensure the log "Firmware-assisted dump is active" is printed early
in the boot process to put the subsequent fadump messages in context.

Suggested-by: Michael Ellerman 
Signed-off-by: Hari Bathini 
Signed-off-by: Michal Suchanek 
---
Changes from v6:
Correct and simplify quote handling. Ideally I would like to extend
parse_args to give the length of the original quoted value to callback.
However, parse_args removes at most one double-quote from the start and
one from the end so that is easy to detect. Otherwise all other users
will have to be updated to trash the new argument.
---
 arch/powerpc/include/asm/fadump.h |   2 +
 arch/powerpc/kernel/fadump.c  | 109 --
 arch/powerpc/kernel/prom.c|   7 +++
 3 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h 
b/arch/powerpc/include/asm/fadump.h
index ce88bbe1d809..98ae00943fb3 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -208,11 +208,13 @@ extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
 extern int fadump_reserve_mem(void);
 extern int setup_fadump(void);
+extern void enforce_fadump_extra_args(char *cmdline);
 extern int is_fadump_active(void);
 extern void crash_fadump(struct pt_regs *, const char *);
 extern void fadump_cleanup(void);
 
 #else  /* CONFIG_FA_DUMP */
+static inline void enforce_fadump_extra_args(char *cmdline) { }
 static inline int is_fadump_active(void) { return 0; }
 static inline void crash_fadump(struct pt_regs *regs, const char *str) { }
 #endif
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index dc0c49cfd90a..a1614d9b8a21 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -78,8 +78,10 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 * dump data waiting for us.
 */
fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump", NULL);
-   if (fdm_active)
+   if (fdm_active) {
+   pr_info("Firmware-assisted dump is active.\n");
fw_dump.dump_active = 1;
+   }
 
/* Get the sizes required to store dump data for the firmware provided
 * dump sections.
@@ -332,8 +334,11 @@ int __init fadump_reserve_mem(void)
 {
unsigned long base, size, memory_boundary;
 
-   if (!fw_dump.fadump_enabled)
+   if (!fw_dump.fadump_enabled) {
+   if (fw_dump.dump_active)
+   pr_warn("Firmware-assisted dump was active but kernel 
booted with fadump disabled!\n");
return 0;
+   }
 
if (!fw_dump.fadump_supported) {
printk(KERN_INFO "Firmware-assisted dump is not supported on"
@@ -373,7 +378,6 @@ int __init fadump_reserve_mem(void)
memory_boundary = memblock_end_of_DRAM();
 
if (fw_dump.dump_active) {
-   printk(KERN_INFO "Firmware-assisted dump is active.\n");
/*
 * If last boot has crashed then reserve all the memory
 * above boot_memory_size so that we don't touch it until
@@ -460,6 +464,105 @@ static int __init early_fadump_reserve_mem(char *p)
 }
 early_param("fadump_reserve_mem", early_fadump_reserve_mem);
 
+#define FADUMP_EXTRA_ARGS_PARAM "fadump_extra_args="
+#define FADUMP_EXTRA_ARGS_LEN  (strlen(FADUMP_EXTRA_ARGS_PARAM) - 1)
+
+struct param_info {
+   char*cmdline;
+   char*tmp_cmdline;
+   int  shortening;
+};
+
+static void __init fadump_update_params(struct param_info *param_info,
+   char *param, char *val)
+{
+   ptrdiff_t param_offset = param - param_info->tmp_cmdline;
+   size_t vallen = val ? strlen(val) : 0;
+   char *tgt = param_info->cmdline + param_offset +
+   FADUMP_EXTRA_ARGS_LEN - param_info->shortening;
+   int shortening = 0;
+
+   if (!val)
+   return;
+
+   /* remove '=' */
+   *tgt++ = ' ';
+
+   /* next_arg removes one leading and one trailing '"' */
+  

[PATCH v7 4/4] boot/param: add pointer to next argument to unknown parameter callback

2017-08-17 Thread Michal Suchanek
The fadump parameter processing re-does the logic of next_arg quote
stripping to determine where the argument ends. Pass pointer to the
next argument instead to make this more robust.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c  | 13 +
 arch/powerpc/mm/hugetlbpage.c |  4 ++--
 include/linux/moduleparam.h   |  2 +-
 init/main.c   | 12 ++--
 kernel/module.c   |  4 ++--
 kernel/params.c   | 19 +++
 lib/dynamic_debug.c   |  2 +-
 7 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d7da4ce9f7ae..6ef96711ee9a 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -474,13 +474,14 @@ struct param_info {
 };
 
 static void __init fadump_update_params(struct param_info *param_info,
-   char *param, char *val)
+   char *param, char *val, char *next)
 {
ptrdiff_t param_offset = param - param_info->tmp_cmdline;
size_t vallen = val ? strlen(val) : 0;
char *tgt = param_info->cmdline + param_offset +
FADUMP_EXTRA_ARGS_LEN - param_info->shortening;
-   int shortening = 0;
+   int shortening = ((next - 1) - (param))
+   - (FADUMP_EXTRA_ARGS_LEN + 1 + vallen);
 
if (!val)
return;
@@ -488,10 +489,6 @@ static void __init fadump_update_params(struct param_info 
*param_info,
/* remove '=' */
*tgt++ = ' ';
 
-   /* next_arg removes one leading and one trailing '"' */
-   if ((*tgt == '"') && (*(tgt + vallen + shortening) == '"'))
-   shortening += 2;
-
/* remove one leading and one trailing quote if both are present */
if ((val[0] == '"') && (val[vallen - 1] == '"')) {
shortening += 2;
@@ -517,7 +514,7 @@ static void __init fadump_update_params(struct param_info 
*param_info,
  * to enforce the parameters passed through it
  */
 static int __init fadump_rework_cmdline_params(char *param, char *val,
-  const char *unused, void *arg)
+   char *next, const char *unused, void *arg)
 {
struct param_info *param_info = (struct param_info *)arg;
 
@@ -525,7 +522,7 @@ static int __init fadump_rework_cmdline_params(char *param, 
char *val,
 strlen(FADUMP_EXTRA_ARGS_PARAM) - 1))
return 0;
 
-   fadump_update_params(param_info, param, val);
+   fadump_update_params(param_info, param, val, next);
 
return 0;
 }
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index e1bf5ca397fe..3a4cce552906 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -268,8 +268,8 @@ int alloc_bootmem_huge_page(struct hstate *hstate)
 
 unsigned long gpage_npages[MMU_PAGE_COUNT];
 
-static int __init do_gpage_early_setup(char *param, char *val,
-  const char *unused, void *arg)
+static int __init do_gpage_early_setup(char *param, char *val, char *unused1,
+  const char *unused2, void *arg)
 {
static phys_addr_t size;
unsigned long npages;
diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 1ee7b30dafec..fec05a186c08 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -326,7 +326,7 @@ extern char *parse_args(const char *name,
  s16 level_min,
  s16 level_max,
  void *arg,
- int (*unknown)(char *param, char *val,
+ int (*unknown)(char *param, char *val, char *next,
 const char *doing, void *arg));
 
 /* Called by module remove. */
diff --git a/init/main.c b/init/main.c
index 052481fbe363..920c3564b2f0 100644
--- a/init/main.c
+++ b/init/main.c
@@ -239,7 +239,7 @@ static int __init loglevel(char *str)
 early_param("loglevel", loglevel);
 
 /* Change NUL term back to "=", to make "param" the whole string. */
-static int __init repair_env_string(char *param, char *val,
+static int __init repair_env_string(char *param, char *val, char *unused2,
const char *unused, void *arg)
 {
if (val) {
@@ -257,7 +257,7 @@ static int __init repair_env_string(char *param, char *val,
 }
 
 /* Anything after -- gets handed straight to init. */
-static int __init set_init_arg(char *param, char *val,
+static int __init set_init_arg(char *param, char *val, char *unused2,
   const char *unused, void *arg)
 {
unsigned int i;
@@ -265,7 +265,7 @@ static int __init set_init_arg(char *param, char *val,
if (panic_later)
return 0;
 
-   

[PATCH v7 2/4] powerpc/fadump: update documentation about 'fadump_extra_args=' parameter

2017-08-17 Thread Michal Suchanek
From: Hari Bathini 

With the introduction of 'fadump_extra_args=' parameter to pass additional
parameters to fadump (capture) kernel, update documentation about it.

Signed-off-by: Hari Bathini 
Signed-off-by: Michal Suchanek 
---
 Documentation/powerpc/firmware-assisted-dump.txt | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/powerpc/firmware-assisted-dump.txt 
b/Documentation/powerpc/firmware-assisted-dump.txt
index bdd344aa18d9..2df88524d2c7 100644
--- a/Documentation/powerpc/firmware-assisted-dump.txt
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -162,7 +162,19 @@ How to enable firmware-assisted dump (fadump):
 
 1. Set config option CONFIG_FA_DUMP=y and build kernel.
 2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
-3. Optionally, user can also set 'crashkernel=' kernel cmdline
+3. A user can pass additional command line parameters as a space
+   separated quoted list through 'fadump_extra_args=' parameter,
+   to be enforced when fadump is active. For example, parameter
+   'fadump_extra_args="nr_cpus=1 numa=off udev.children-max=2"'
+   will be changed to 'fadump_extra_args nr_cpus=1  numa=off
+   udev.children-max=2' in-place when fadump is active. This
+   parameter has no effect when fadump is not active. Multiple
+   instances of 'fadump_extra_args=' can be passed. This provision
+   can be used to reduce memory consumption during dump capture by
+   disabling unwarranted resources/subsystems like CPUs, NUMA
+   and such. Value with spaces can be passed as
+   'fadump_extra_args=""parameter="value with spaces"""'
+4. Optionally, user can also set 'crashkernel=' kernel cmdline
to specify size of the memory to reserve for boot memory dump
preservation.
 
@@ -172,6 +184,12 @@ NOTE: 1. 'fadump_reserve_mem=' parameter has been 
deprecated. Instead
   2. If firmware-assisted dump fails to reserve memory then it
  will fallback to existing kdump mechanism if 'crashkernel='
  option is set at kernel cmdline.
+  3. Special parameters like '--' passed inside fadump_extra_args are also
+ just left in-place. So, the user is advised to consider this while
+ specifying such parameters. It may be required to quote the argument
+ to fadump_extra_args when the bootloader uses double-quotes as
+ argument delimiter as well. eg
+append = " fadump_extra_args=\"nr_cpus=1 numa=off 
udev.children-max=2\""
 
 Sysfs/debugfs files:
 
-- 
2.10.2
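
The in-place rewrite described in the documentation hunk above can be illustrated with a small userspace sketch. This is not the kernel's fadump_update_params(): the helper name expose_extra_args() is hypothetical, and as a simplification it overwrites the '=' and one pair of enclosing double quotes with spaces instead of shifting the text, so the command line keeps its length.

```c
#include <string.h>

#define EXTRA_ARGS "fadump_extra_args="

/*
 * Hypothetical helper (assumption, not kernel code): for every
 * fadump_extra_args="..." word, blank out the '=' and the enclosing
 * double quotes so the quoted parameters become ordinary words.
 */
static void expose_extra_args(char *cmdline)
{
	size_t keylen = strlen(EXTRA_ARGS);
	char *p = cmdline;

	while ((p = strstr(p, EXTRA_ARGS)) != NULL) {
		char *val = p + keylen;

		/* only accept a match at the start of a word */
		if (p != cmdline && p[-1] != ' ') {
			p = val;
			continue;
		}

		p[keylen - 1] = ' ';		/* overwrite the '=' */
		if (*val == '"') {
			char *end = strchr(val + 1, '"');

			if (end) {
				*val = ' ';	/* leading quote */
				*end = ' ';	/* trailing quote */
				p = end + 1;
				continue;
			}
		}
		p = val;
	}
}
```

Called on a writable copy of the command line, this leaves every other parameter at its original offset, which is the property the real patch also has to preserve.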



[PATCH] bootwrapper: mpsc.c: fix pointer-to-int-cast warnings

2017-10-05 Thread Michal Suchanek
I get these warnings:

../arch/powerpc/boot/mpsc.c: In function 'mpsc_get_virtreg_of_phandle':
../arch/powerpc/boot/mpsc.c:113:35: warning: cast from pointer to
integer of different size [-Wpointer-to-int-cast]

../arch/powerpc/boot/mpsc.c: In function 'mpsc_console_init':
../arch/powerpc/boot/mpsc.c:147:12: warning: cast from pointer to
integer of different size [-Wpointer-to-int-cast]

Presumably the patch below fixes these, and presumably the DT defines
that pointers and integers have the same size in the DT, so this is fine
regardless of 32bit/64bit target. I have not found a DT definition for
PowerPC, however, so any bugs in the property sizing and resulting
failures to read the properties are left as before.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/boot/mpsc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/mpsc.c b/arch/powerpc/boot/mpsc.c
index 425ad88cce8d..ea740493277a 100644
--- a/arch/powerpc/boot/mpsc.c
+++ b/arch/powerpc/boot/mpsc.c
@@ -110,7 +110,7 @@ static volatile char *mpsc_get_virtreg_of_phandle(void 
*devp, char *prop)
if (n != sizeof(v))
goto err_out;
 
-   devp = find_node_by_linuxphandle((u32)v);
+   devp = find_node_by_linuxphandle((intptr_t)v);
if (devp == NULL)
goto err_out;
 
@@ -144,7 +144,7 @@ int mpsc_console_init(void *devp, struct 
serial_console_data *scdp)
n = getprop(devp, "cell-index", &v, sizeof(v));
if (n != sizeof(v))
goto err_out;
-   reg_set = (int)v;
+   reg_set = (intptr_t)v;
 
mpscintr_base += (reg_set == 0) ? 0x4 : 0xc;
 
-- 
2.10.2
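
For readers unfamiliar with the warning, the following is a minimal sketch of the cast pattern. The helper name cell_to_u32() is hypothetical, and it uses uintptr_t where the patch uses intptr_t; either pointer-sized integer type silences -Wpointer-to-int-cast, because the narrowing then happens between two integer types.

```c
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not from mpsc.c): convert a
 * pointer-sized cell to 32 bits by going through a pointer-sized
 * integer first, then narrowing explicitly.
 */
static uint32_t cell_to_u32(void *cell)
{
	return (uint32_t)(uintptr_t)cell;
}
```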



[PATCH 0/6] Fix cdrom autoclose

2017-12-14 Thread Michal Suchanek
Hello,

there is a cdrom autoclose feature that is supposed to close the tray, wait
for the disc to become ready, and then open the device.

This used to work in ancient times. Then in old times there was a hack in
util-linux which worked around the breakage which probably resulted from
switching to scsi emulation.

Currently the util-linux maintainer refuses to merge another hack on the
basis that the kernel still has the feature so it should be fixed there.
Indeed, to implement this feature effectively from userspace one would need
to know when the CD-ROM is in the "drive becoming ready" state, which is
knowledge that never leaves the hardware-specific driver and is passed
neither to userspace nor to the generic cdrom driver.

So this patchset fixes the kernel autoclose implementation in cdrom.c and,
to do so, reports the "drive becoming ready" state from the
hardware-specific drivers.

Michal Suchanek (6):
  delay: add poll_event_interruptible
  cdrom: factor out common open_for_* code
  cdrom: wait for tray to close
  cdrom: introduce CDS_DRIVE_ERROR
  Documentation: cdrom: introduce CDS_DRIVE_ERROR
  cdrom: wait for drive to become ready

 Documentation/cdrom/cdrom-standard.tex |   8 ++-
 Documentation/cdrom/ide-cd |   6 ++
 Documentation/ioctl/cdrom.txt  |   1 +
 drivers/block/paride/pcd.c |   2 +-
 drivers/cdrom/cdrom.c  | 124 -
 drivers/cdrom/gdrom.c  |   2 +-
 drivers/ide/ide-cd_ioctl.c |  12 ++--
 drivers/scsi/sr_ioctl.c|   2 +-
 include/linux/delay.h  |  12 
 include/uapi/linux/cdrom.h |   1 +
 10 files changed, 99 insertions(+), 71 deletions(-)

-- 
2.13.6



[PATCH 3/6] cdrom: wait for tray to close

2017-12-14 Thread Michal Suchanek
The scsi command to close tray only starts the motor and does not wait
for the tray to close. Wait until the state changes from TRAY_OPEN so
users do not race with the tray closing.

This looks like an infinite wait, but unless the drive is broken it either
closes the tray within a few seconds or reports an error when it detects
that the tray is blocked. At worst the wait can be interrupted by the user.

Signed-off-by: Michal Suchanek 
---
 drivers/cdrom/cdrom.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index e976d3d0180d..040d3d466cd7 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -281,7 +281,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 
 /* used to tell the module to turn on full debugging messages */
@@ -1030,6 +1032,18 @@ static void cdrom_count_tracks(struct cdrom_device_info 
*cdi, tracktype *tracks)
   tracks->cdi, tracks->xa);
 }
 
+static int tray_close(struct cdrom_device_info *cdi)
+{
+   int ret;
+
+   ret = cdi->ops->tray_move(cdi, 0);
+   if (ret)
+   return ret;
+
+   return poll_event_interruptible(CDS_TRAY_OPEN !=
+   cdi->ops->drive_status(cdi, CDSL_CURRENT), 500);
+}
+
 static
 int open_for_common(struct cdrom_device_info *cdi, tracktype *tracks)
 {
@@ -1048,7 +1062,9 @@ int open_for_common(struct cdrom_device_info *cdi, 
tracktype *tracks)
if (CDROM_CAN(CDC_CLOSE_TRAY) &&
cdi->options & CDO_AUTO_CLOSE) {
cd_dbg(CD_OPEN, "trying to close the tray\n");
-   ret = cdo->tray_move(cdi, 0);
+   ret = tray_close(cdi);
+   if (ret == -ERESTARTSYS)
+   return ret;
if (ret) {
cd_dbg(CD_OPEN, "bummer. tried to close 
the tray but failed.\n");
/* Ignore the error from the low
@@ -2312,7 +2328,8 @@ static int cdrom_ioctl_closetray(struct cdrom_device_info 
*cdi)
 
if (!CDROM_CAN(CDC_CLOSE_TRAY))
return -ENOSYS;
-   return cdi->ops->tray_move(cdi, 0);
+
+   return tray_close(cdi);
 }
 
 static int cdrom_ioctl_eject_sw(struct cdrom_device_info *cdi,
-- 
2.13.6



[PATCH 4/6] cdrom: introduce CDS_DRIVE_ERROR

2017-12-14 Thread Michal Suchanek
CDS_DRIVE_NOT_READY is used for the state in which CDROM is 'becoming
ready' (typically analyzing the disc) but also as the fallback when
nothing else applies. Introduce CDS_DRIVE_ERROR for the fallback case.

Signed-off-by: Michal Suchanek 
---
 drivers/block/paride/pcd.c |  2 +-
 drivers/cdrom/gdrom.c  |  2 +-
 drivers/ide/ide-cd_ioctl.c | 12 
 drivers/scsi/sr_ioctl.c|  2 +-
 include/uapi/linux/cdrom.h |  1 +
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c
index 7b8c6368beb7..6e00093ff34e 100644
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@ -605,7 +605,7 @@ static int pcd_drive_status(struct cdrom_device_info *cdi, 
int slot_nr)
struct pcd_unit *cd = cdi->handle;
 
if (pcd_ready_wait(cd, PCD_READY_TMO))
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
if (pcd_atapi(cd, rc_cmd, 8, pcd_scratch, DBMSG("check media")))
return CDS_NO_DISC;
return CDS_DISC_OK;
diff --git a/drivers/cdrom/gdrom.c b/drivers/cdrom/gdrom.c
index 6495b03f576c..702f255bbe42 100644
--- a/drivers/cdrom/gdrom.c
+++ b/drivers/cdrom/gdrom.c
@@ -390,7 +390,7 @@ static int gdrom_drivestatus(struct cdrom_device_info 
*cd_info, int ignore)
if (sense == 0)
return CDS_DISC_OK;
if (sense == 0x20)
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
/* default */
return CDS_NO_INFO;
 }
diff --git a/drivers/ide/ide-cd_ioctl.c b/drivers/ide/ide-cd_ioctl.c
index 2acca12b9c94..9a26f50a2092 100644
--- a/drivers/ide/ide-cd_ioctl.c
+++ b/drivers/ide/ide-cd_ioctl.c
@@ -62,9 +62,13 @@ int ide_cdrom_drive_status(struct cdrom_device_info *cdi, 
int slot_nr)
return CDS_NO_DISC;
}
 
-   if (sense.sense_key == NOT_READY && sense.asc == 0x04
-   && sense.ascq == 0x04)
-   return CDS_DISC_OK;
+   if (sense.sense_key == NOT_READY && sense.asc == 0x04)
+   switch (sense.ascq) {
+   case 0x01:
+   return CDS_DRIVE_NOT_READY;
+   case 0x04:
+   return CDS_DISC_OK;
+   }
 
/*
 * If not using Mt Fuji extended media tray reports,
@@ -77,7 +81,7 @@ int ide_cdrom_drive_status(struct cdrom_device_info *cdi, int 
slot_nr)
else
return CDS_TRAY_OPEN;
}
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
 }
 
 /*
diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
index 2a21f2d48592..7c93f12a9cb8 100644
--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -333,7 +333,7 @@ int sr_drive_status(struct cdrom_device_info *cdi, int slot)
else
return CDS_TRAY_OPEN;
 
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
 }
 
 int sr_disk_status(struct cdrom_device_info *cdi)
diff --git a/include/uapi/linux/cdrom.h b/include/uapi/linux/cdrom.h
index 2817230148fd..339b1435f44e 100644
--- a/include/uapi/linux/cdrom.h
+++ b/include/uapi/linux/cdrom.h
@@ -398,6 +398,7 @@ struct cdrom_generic_command
 #define CDS_TRAY_OPEN  2
 #define CDS_DRIVE_NOT_READY 3
 #define CDS_DISC_OK 4
+#define CDS_DRIVE_ERROR 5
 
 /* return values for the CDROM_DISC_STATUS ioctl */
 /* can also return CDS_NO_[INFO|DISC], from above */
-- 
2.13.6
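
To show how a caller might tell the two states apart once the fallback case has its own code, here is a small sketch. The DEMO_ constants mirror the values in include/uapi/linux/cdrom.h, with DEMO_CDS_DRIVE_ERROR being the value proposed by this patch; the function names are hypothetical and chosen only for illustration.

```c
/* Values as in include/uapi/linux/cdrom.h; 5 is the value proposed here. */
#define DEMO_CDS_NO_INFO		0
#define DEMO_CDS_NO_DISC		1
#define DEMO_CDS_TRAY_OPEN		2
#define DEMO_CDS_DRIVE_NOT_READY	3
#define DEMO_CDS_DISC_OK		4
#define DEMO_CDS_DRIVE_ERROR		5

/* Hypothetical helper: human-readable name for a drive status code. */
static const char *drive_status_name(int status)
{
	switch (status) {
	case DEMO_CDS_NO_INFO:		return "no information";
	case DEMO_CDS_NO_DISC:		return "no disc";
	case DEMO_CDS_TRAY_OPEN:	return "tray open";
	case DEMO_CDS_DRIVE_NOT_READY:	return "becoming ready";
	case DEMO_CDS_DISC_OK:		return "disc ok";
	case DEMO_CDS_DRIVE_ERROR:	return "drive error";
	default:			return "unknown";
	}
}

/*
 * Hypothetical helper: only the "becoming ready" state is worth polling
 * again; an error or a missing disc will not resolve by waiting.
 */
static int drive_status_is_transient(int status)
{
	return status == DEMO_CDS_DRIVE_NOT_READY;
}
```

With the fallback split out, a waiting loop can keep polling on the transient state and give up immediately on everything else.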



[PATCH 6/6] cdrom: wait for drive to become ready

2017-12-14 Thread Michal Suchanek
When the drive closes it can take tens of seconds until the disc is
analyzed. Wait for the drive to become ready or report an error.

Signed-off-by: Michal Suchanek 
---
 drivers/cdrom/cdrom.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 040d3d466cd7..a483f34b7648 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -1087,6 +1087,15 @@ int open_for_common(struct cdrom_device_info *cdi, 
tracktype *tracks)
}
cd_dbg(CD_OPEN, "the tray is now closed\n");
}
+   /* the door should be closed now, check for the disc */
+   if (ret == CDS_DRIVE_NOT_READY) {
+   int poll_res = poll_event_interruptible(
+   CDS_DRIVE_NOT_READY !=
+   (ret = cdo->drive_status(cdi, CDSL_CURRENT)),
+   500);
+   if (poll_res == -ERESTARTSYS)
+   return poll_res;
+   }
if (ret != CDS_DISC_OK)
return -ENOMEDIUM;
}
-- 
2.13.6



[PATCH 5/6] Documentation: cdrom: introduce CDS_DRIVE_ERROR

2017-12-14 Thread Michal Suchanek

CDS_DRIVE_NOT_READY is used for the state in which CDROM is 'becoming
ready' (typically analyzing the disc) but also as the fallback when
nothing else applies. Introduce CDS_DRIVE_ERROR for the fallback case.

Signed-off-by: Michal Suchanek 
---
 Documentation/cdrom/cdrom-standard.tex | 8 +++-
 Documentation/cdrom/ide-cd | 6 ++
 Documentation/ioctl/cdrom.txt  | 1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/cdrom/cdrom-standard.tex 
b/Documentation/cdrom/cdrom-standard.tex
index 8f85b0e41046..018284ba696a 100644
--- a/Documentation/cdrom/cdrom-standard.tex
+++ b/Documentation/cdrom/cdrom-standard.tex
@@ -371,11 +371,17 @@ $$
 CDS_NO_INFO& no information available\cr
 CDS_NO_DISC& no disc is inserted, tray is closed\cr
 CDS_TRAY_OPEN& tray is opened\cr
-CDS_DRIVE_NOT_READY& something is wrong, tray is moving?\cr
+CDS_DRIVE_NOT_READY& tray just closed?\cr
 CDS_DISC_OK& a disc is loaded and everything is fine\cr
+CDS_DRIVE_ERROR& something is wrong\cr
 }
 $$
 
+Note: The IDE and SCSI cdroms have a status code 'drive becoming ready' which
+is typically returned when the drive has just closed and is analyzing the disc.
+For other cdrom types this state is not reported by the hardware or not
+implemented by the driver.
+
 \subsection{$Int\ media_changed(struct\ cdrom_device_info * cdi, int\ 
disc_nr)$}
 
 This function is very similar to the original function in $struct\ 
diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd
index a5f2a7f1ff46..9324a8fd9a39 100644
--- a/Documentation/cdrom/ide-cd
+++ b/Documentation/cdrom/ide-cd
@@ -455,6 +455,9 @@ main (int argc, char **argv)
case CDS_DRIVE_NOT_READY:
printf ("Drive Not Ready.\n");
break;
+   case CDS_DRIVE_ERROR:
+   printf ("Drive problem.\n");
+   break;
default:
printf ("This Should not happen!\n");
break;
@@ -481,6 +484,9 @@ main (int argc, char **argv)
case CDS_NO_INFO:
printf ("No Information available.");
break;
+   case CDS_DRIVE_ERROR:
+   printf ("Drive problem.\n");
+   break;
default:
printf ("This Should not happen!\n");
break;
diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt
index a4d62a9d6771..7720d11807c3 100644
--- a/Documentation/ioctl/cdrom.txt
+++ b/Documentation/ioctl/cdrom.txt
@@ -700,6 +700,7 @@ CDROM_DRIVE_STATUS  Get tray position, etc.
CDS_TRAY_OPEN
CDS_DRIVE_NOT_READY
CDS_DISC_OK
+   CDS_DRIVE_ERROR
-1  error
 
error returns:
-- 
2.13.6



[PATCH 1/6] delay: add poll_event_interruptible

2017-12-14 Thread Michal Suchanek
Add convenience macro for polling an event that does not have a
waitqueue.

Signed-off-by: Michal Suchanek 
---
 include/linux/delay.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index b78bab4395d8..3ae9fa395628 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -64,4 +64,16 @@ static inline void ssleep(unsigned int seconds)
msleep(seconds * 1000);
 }
 
+#define poll_event_interruptible(event, interval) ({ \
+   int __ret = 0; \
+   while (!(event)) { \
+   if (signal_pending(current)) { \
+   __ret = -ERESTARTSYS; \
+   break; \
+   } \
+   msleep_interruptible(interval); \
+   } \
+   __ret; \
+})
+
 #endif /* defined(_LINUX_DELAY_H) */
-- 
2.13.6
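
The polling pattern in this patch can be tried out in userspace with a rough analogue. Everything below is an assumption made for illustration: poll_event(), the poll_interrupted flag (standing in for signal_pending()) and the countdown_done() example event are hypothetical names, and nanosleep() replaces msleep_interruptible(). A function with a callback is used instead of a statement-expression macro, which also avoids any clash between the helper's local variables and identifiers used in the caller's event expression.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Would be set from a signal handler; stands in for signal_pending(). */
static volatile sig_atomic_t poll_interrupted;

/*
 * Hypothetical userspace analogue: call event(ctx) until it returns
 * nonzero, sleeping interval_ms between attempts. Returns 0 once the
 * event is true, or -EINTR if the interrupt flag was raised first.
 */
static int poll_event(int (*event)(void *), void *ctx, long interval_ms)
{
	struct timespec ts = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	while (!event(ctx)) {
		if (poll_interrupted)
			return -EINTR;
		nanosleep(&ts, NULL);
	}
	return 0;
}

/* Example event: becomes true once the counter has reached zero. */
static int countdown_done(void *ctx)
{
	int *left = ctx;

	if (*left == 0)
		return 1;
	(*left)--;
	return 0;
}
```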



[PATCH 2/6] cdrom: factor out common open_for_* code

2017-12-14 Thread Michal Suchanek
The open_for_audio and open_for_data copies have already bitrotted in
different ways, and the autoclose logic would need to be updated in both.

Signed-off-by: Michal Suchanek 
---
 drivers/cdrom/cdrom.c | 100 ++
 1 file changed, 36 insertions(+), 64 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index e36d160c458f..e976d3d0180d 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -1031,12 +1031,12 @@ static void cdrom_count_tracks(struct cdrom_device_info 
*cdi, tracktype *tracks)
 }
 
 static
-int open_for_data(struct cdrom_device_info *cdi)
+int open_for_common(struct cdrom_device_info *cdi, tracktype *tracks)
 {
int ret;
const struct cdrom_device_ops *cdo = cdi->ops;
-   tracktype tracks;
-   cd_dbg(CD_OPEN, "entering open_for_data\n");
+
+   cd_dbg(CD_OPEN, "entering %s\n", __func__);
/* Check if the driver can report drive status.  If it can, we
   can do clever things.  If it can't, well, we at least tried! */
if (cdo->drive_status != NULL) {
@@ -1048,7 +1048,7 @@ int open_for_data(struct cdrom_device_info *cdi)
if (CDROM_CAN(CDC_CLOSE_TRAY) &&
cdi->options & CDO_AUTO_CLOSE) {
cd_dbg(CD_OPEN, "trying to close the tray\n");
-   ret=cdo->tray_move(cdi,0);
+   ret = cdo->tray_move(cdi, 0);
if (ret) {
cd_dbg(CD_OPEN, "bummer. tried to close 
the tray but failed.\n");
/* Ignore the error from the low
@@ -1056,37 +1056,45 @@ int open_for_data(struct cdrom_device_info *cdi)
couldn't close the tray.  We only care 
that there is no disc in the drive, 
since that is the _REAL_ problem here.*/
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
} else {
cd_dbg(CD_OPEN, "bummer. this drive can't close 
the tray.\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
/* Ok, the door should be closed now.. Check again */
ret = cdo->drive_status(cdi, CDSL_CURRENT);
-   if ((ret == CDS_NO_DISC) || (ret==CDS_TRAY_OPEN)) {
+   if ((ret == CDS_NO_DISC) || (ret == CDS_TRAY_OPEN)) {
cd_dbg(CD_OPEN, "bummer. the tray is still not 
closed.\n");
cd_dbg(CD_OPEN, "tray might not contain a 
medium\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
cd_dbg(CD_OPEN, "the tray is now closed\n");
}
-   /* the door should be closed now, check for the disc */
-   ret = cdo->drive_status(cdi, CDSL_CURRENT);
-   if (ret!=CDS_DISC_OK) {
-   ret = -ENOMEDIUM;
-   goto clean_up_and_return;
-   }
+   if (ret != CDS_DISC_OK)
+   return -ENOMEDIUM;
}
-   cdrom_count_tracks(cdi, &tracks);
-   if (tracks.error == CDS_NO_DISC) {
+   cdrom_count_tracks(cdi, tracks);
+   if (tracks->error == CDS_NO_DISC) {
cd_dbg(CD_OPEN, "bummer. no disc.\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
+
+   return 0;
+}
+
+static
+int open_for_data(struct cdrom_device_info *cdi)
+{
+   int ret;
+   const struct cdrom_device_ops *cdo = cdi->ops;
+   tracktype tracks;
+
+   cd_dbg(CD_OPEN, "entering %s\n", __func__);
+   ret = open_for_common(cdi, &tracks);
+   if (ret)
+   goto clean_up_and_return;
+
/* CD-Players which don't use O_NONBLOCK, workman
 * for example, need bit CDO_CHECK_TYPE cleared! */
if (tracks.data==0) {
@@ -1196,53 +1204,17 @@ int cdrom_open(struct cdrom_device_info *cdi, struct 
block_device *bdev,
 /* This code is similar to that in open_for_data. The routine is called
whenever an audio play operation is requested.
 */
-static int check_for_audio_disc(struct cdrom_device_info *cdi,
-   const struct cdrom_de

[PATCH] init/main.c: simplify repair_env_string

2017-12-15 Thread Michal Suchanek
Quoting characters are now removed from the parameter, so the value always
follows directly after the NUL that terminates the parameter name.

Signed-off-by: Michal Suchanek 
---
 init/main.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

Since the previous "[PATCH v9 3/8] lib/cmdline.c: add backslash support to
kernel commandline parsing" adds the memmove in lib/cmdline.c it is now
superfluous in init/main.c

diff --git a/init/main.c b/init/main.c
index 1f5fdedbb293..1e5b1dc940d9 100644
--- a/init/main.c
+++ b/init/main.c
@@ -244,15 +244,10 @@ static int __init repair_env_string(char *param, char 
*val,
const char *unused, void *arg)
 {
if (val) {
-   /* param=val or param="val"? */
-   if (val == param+strlen(param)+1)
-   val[-1] = '=';
-   else if (val == param+strlen(param)+2) {
-   val[-2] = '=';
-   memmove(val-1, val, strlen(val)+1);
-   val--;
-   } else
-   BUG();
+   int parm_len = strlen(param);
+
+   param[parm_len] = '=';
+   BUG_ON(val != param + parm_len + 1);
}
return 0;
 }
-- 
2.13.6
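
The simplified logic can be modelled in userspace as follows. This is only a sketch: rejoin_param() is a hypothetical name, and it returns -1 where the kernel code above would hit BUG_ON(). It assumes, as the commit message states, that the parser has already squeezed out any quotes so that the value starts exactly one byte after the name's terminating NUL.

```c
#include <string.h>

/*
 * Hypothetical model of the simplified repair_env_string(): parsing
 * split "name=value" into "name\0value"; put the '=' back so that
 * param is the whole string again.
 */
static int rejoin_param(char *param, const char *val)
{
	size_t len;

	if (!val)
		return 0;		/* bare flag, nothing to repair */

	len = strlen(param);
	if (val != param + len + 1)
		return -1;		/* layout the kernel treats as a bug */

	param[len] = '=';
	return 0;
}
```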



[PATCH] Fix parse_args cycle limit check.

2017-12-15 Thread Michal Suchanek
Actually args is supposed to be renamed to next, so that args holds the
previous argument and both can be passed to the callback. This additional
patch should fix up the rename.

---
 kernel/params.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/params.c b/kernel/params.c
index 69ff58e69887..efb4dfaa6bc5 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -182,17 +182,18 @@ char *parse_args(const char *doing,
 
if (*args)
pr_debug("doing %s, parsing ARGS: '%s'\n", doing, args);
+   else
+   return err;
 
-   next = next_arg(args, &param, &val);
-   while (*next) {
+   do {
int ret;
int irq_was_disabled;
 
-   args = next;
next = next_arg(args, &param, &val);
+
/* Stop at -- */
if (!val && strcmp(param, "--") == 0)
-   return err ?: args;
+   return err ?: next;
irq_was_disabled = irqs_disabled();
ret = parse_one(param, val, args, next, doing, params, num,
min_level, max_level, arg, unknown);
@@ -215,9 +216,10 @@ char *parse_args(const char *doing,
   doing, val ?: "", param);
break;
}
-
err = ERR_PTR(ret);
-   }
+
+   args = next;
+   } while (*args);
 
return err;
 }
-- 
2.13.6



[PATCH] Optimize final quote removal.

2017-12-15 Thread Michal Suchanek
This is an additional patch that avoids the memmove when processing the
quote at the end of the parameter.

---
 lib/cmdline.c   | 9 +++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/cmdline.c b/lib/cmdline.c
index c5335a79a177..b1d8a0dc60fc 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -191,7 +191,13 @@ bool parse_option_str(const char *str, const char *option)
return false;
 }
 
+#define break_arg_end(i) { \
+   if (isspace(args[i]) && !in_quote && !backslash && !in_single) \
+   break; \
+   }
+
 #define squash_char { \
+   break_arg_end(i + 1); \
memmove(args + 1, args, i); \
args++; \
i--; \
@@ -209,8 +215,7 @@ char *next_arg(char *args, char **param, char **val)
char *next;
 
for (i = 0; args[i]; i++) {
-   if (isspace(args[i]) && !in_quote && !backslash && !in_single)
-   break;
+   break_arg_end(i);
 
if ((equals == 0) && (args[i] == '='))
equals = i;
-- 
2.13.6



[PATCH v2] Do not disable driver and bus shutdown hook when class shutdown hook is set.

2017-08-11 Thread Michal Suchanek
As seen from the implementation of the single class shutdown hook, this
is not a very sound design.

Rename the class shutdown hook to shutdown_pre to make it clear it runs
before the driver shutdown hook.

Signed-off-by: Michal Suchanek 
---
v2: rename class shutdown member to shutdown_pre
---
 drivers/base/core.c |  9 +
 drivers/char/tpm/tpm-chip.c | 11 ++-
 include/linux/device.h  |  4 ++--
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 755451f684bc..13e7c41fd417 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2664,11 +2664,12 @@ void device_shutdown(void)
pm_runtime_get_noresume(dev);
pm_runtime_barrier(dev);
 
-   if (dev->class && dev->class->shutdown) {
+   if (dev->class && dev->class->shutdown_pre) {
if (initcall_debug)
-   dev_info(dev, "shutdown\n");
-   dev->class->shutdown(dev);
-   } else if (dev->bus && dev->bus->shutdown) {
+   dev_info(dev, "shutdown_pre\n");
+   dev->class->shutdown_pre(dev);
+   }
+   if (dev->bus && dev->bus->shutdown) {
if (initcall_debug)
dev_info(dev, "shutdown\n");
dev->bus->shutdown(dev);
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 67ec9d3d04f5..0eca20c5a80c 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -164,14 +164,7 @@ static int tpm_class_shutdown(struct device *dev)
chip->ops = NULL;
up_write(&chip->ops_sem);
}
-   /* Allow bus- and device-specific code to run. Note: since chip->ops
-* is NULL, more-specific shutdown code will not be able to issue TPM
-* commands.
-*/
-   if (dev->bus && dev->bus->shutdown)
-   dev->bus->shutdown(dev);
-   else if (dev->driver && dev->driver->shutdown)
-   dev->driver->shutdown(dev);
+
return 0;
 }
 
@@ -214,7 +207,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
device_initialize(&chip->devs);
 
chip->dev.class = tpm_class;
-   chip->dev.class->shutdown = tpm_class_shutdown;
+   chip->dev.class->shutdown_pre = tpm_class_shutdown;
chip->dev.release = tpm_dev_release;
chip->dev.parent = pdev;
chip->dev.groups = chip->groups;
diff --git a/include/linux/device.h b/include/linux/device.h
index beabdbc08420..649b1b72c76a 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -375,7 +375,7 @@ int subsys_virtual_register(struct bus_type *subsys,
  * @suspend:   Used to put the device to sleep mode, usually to a low power
  * state.
  * @resume:Used to bring the device from the sleep mode.
- * @shutdown:  Called at shut-down time to quiesce the device.
+ * @shutdown_pre: Called at shut-down time before driver shutdown.
  * @ns_type:   Callbacks so sysfs can detemine namespaces.
  * @namespace: Namespace of the device belongs to this class.
  * @pm:The default device power management operations of this 
class.
@@ -404,7 +404,7 @@ struct class {
 
int (*suspend)(struct device *dev, pm_message_t state);
int (*resume)(struct device *dev);
-   int (*shutdown)(struct device *dev);
+   int (*shutdown_pre)(struct device *dev);
 
const struct kobj_ns_type_operations *ns_type;
const void *(*namespace)(struct device *dev);
-- 
2.10.2
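
The hook ordering that results from this change can be modelled with a small userspace sketch. The demo_ names and the log buffer are hypothetical and exist only to make the order observable; the structure follows the device_shutdown() hunk above, where the class hook no longer suppresses the bus hook, and the bus hook still takes precedence over the driver hook.

```c
#include <stddef.h>

struct demo_dev;
typedef void (*demo_hook)(struct demo_dev *);

/* Hypothetical stand-in for a device with its three shutdown hooks. */
struct demo_dev {
	demo_hook class_shutdown_pre;
	demo_hook bus_shutdown;
	demo_hook driver_shutdown;
	char log[8];		/* one letter per hook, in call order */
	size_t n;
};

static void demo_note(struct demo_dev *d, char c)
{
	if (d->n < sizeof(d->log) - 1)
		d->log[d->n++] = c;
}

static void demo_class_pre(struct demo_dev *d) { demo_note(d, 'c'); }
static void demo_bus(struct demo_dev *d)       { demo_note(d, 'b'); }
static void demo_driver(struct demo_dev *d)    { demo_note(d, 'd'); }

/* Class hook first, then either the bus hook or the driver hook. */
static void demo_shutdown(struct demo_dev *d)
{
	if (d->class_shutdown_pre)
		d->class_shutdown_pre(d);

	if (d->bus_shutdown)
		d->bus_shutdown(d);
	else if (d->driver_shutdown)
		d->driver_shutdown(d);
}
```

A device with all three hooks set records "cb"; with no bus hook it records "cd".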



[PATCH] ibmvnic: Fix unused variable warning

2017-08-09 Thread Michal Suchanek
Fixes: a248878d7a1d ("ibmvnic: Check for transport event on driver resume")

Signed-off-by: Michal Suchanek 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 99576ba4187f..09c20d3b1b79 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3948,7 +3948,6 @@ static int ibmvnic_resume(struct device *dev)
 {
struct net_device *netdev = dev_get_drvdata(dev);
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-   int i;
 
if (adapter->state != VNIC_OPEN)
return 0;
-- 
2.10.2



[PATCH] Do not disable driver and bus shutdown hook when class shutdown hook is set.

2017-08-09 Thread Michal Suchanek
Disabling the driver hook by setting class hook is totally sound design
not prone to error as evidenced by the single implementation of the
class hook.

Fixes: d1bd4a792d39 ("tpm: Issue a TPM2_Shutdown for TPM2 devices.")
Fixes: f77af1516584 ("Add "shutdown" to "struct class".")

Signed-off-by: Michal Suchanek 
---
 drivers/base/core.c | 3 ++-
 drivers/char/tpm/tpm-chip.c | 9 +
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 755451f684bc..2cf752dc1421 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2668,7 +2668,8 @@ void device_shutdown(void)
if (initcall_debug)
dev_info(dev, "shutdown\n");
dev->class->shutdown(dev);
-   } else if (dev->bus && dev->bus->shutdown) {
+   }
+   if (dev->bus && dev->bus->shutdown) {
if (initcall_debug)
dev_info(dev, "shutdown\n");
dev->bus->shutdown(dev);
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 67ec9d3d04f5..edf8fa553f5f 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -164,14 +164,7 @@ static int tpm_class_shutdown(struct device *dev)
chip->ops = NULL;
up_write(&chip->ops_sem);
}
-   /* Allow bus- and device-specific code to run. Note: since chip->ops
-* is NULL, more-specific shutdown code will not be able to issue TPM
-* commands.
-*/
-   if (dev->bus && dev->bus->shutdown)
-   dev->bus->shutdown(dev);
-   else if (dev->driver && dev->driver->shutdown)
-   dev->driver->shutdown(dev);
+
return 0;
 }
 
-- 
2.10.2
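
For readers following the thread, here is a small userspace sketch of the
dispatch order before and after this patch. The struct layout, the hook
names and the log buffer are made up for illustration; only the if/else
shape is taken from the hunk in drivers/base/core.c, and the trailing
driver branch is assumed from the code removed from tpm_class_shutdown().

```c
#include <stddef.h>
#include <string.h>

struct device;

/* Hypothetical, cut-down stand-in for the class/bus/driver structures. */
struct hooks {
	void (*shutdown)(struct device *dev);
};

struct device {
	const struct hooks *cls;	/* "class" in the kernel */
	const struct hooks *bus;
	const struct hooks *driver;
	char log[4];			/* which hooks ran, in order */
};

/* Append one character to the per-device log (illustration only). */
static void note(struct device *dev, char c)
{
	size_t n = strlen(dev->log);

	if (n + 1 < sizeof(dev->log))
		dev->log[n] = c;
}

static void class_shutdown(struct device *dev)  { note(dev, 'c'); }
static void bus_shutdown(struct device *dev)    { note(dev, 'b'); }
static void driver_shutdown(struct device *dev) { note(dev, 'd'); }

static const struct hooks class_hooks  = { class_shutdown };
static const struct hooks bus_hooks    = { bus_shutdown };
static const struct hooks driver_hooks = { driver_shutdown };

/* Before the patch: a class hook silently replaces bus and driver hooks. */
static void shutdown_old(struct device *dev)
{
	if (dev->cls && dev->cls->shutdown)
		dev->cls->shutdown(dev);
	else if (dev->bus && dev->bus->shutdown)
		dev->bus->shutdown(dev);
	else if (dev->driver && dev->driver->shutdown)
		dev->driver->shutdown(dev);
}

/* After the patch: the class hook runs first, then bus or driver as usual. */
static void shutdown_new(struct device *dev)
{
	if (dev->cls && dev->cls->shutdown)
		dev->cls->shutdown(dev);
	if (dev->bus && dev->bus->shutdown)
		dev->bus->shutdown(dev);
	else if (dev->driver && dev->driver->shutdown)
		dev->driver->shutdown(dev);
}
```

With all three hooks set, shutdown_old() runs only the class hook while
shutdown_new() runs the class hook followed by the bus hook, which is why
the TPM class hook no longer needs to call the bus/driver hooks itself.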



[PATCH 5/6] lib/cmdline.c: Implement single quotes in commandline argument parsing

2017-09-15 Thread Michal Suchanek
This brings the kernel parser about on par with the Bourne shell, grub, and
other tools that chew the arguments before the kernel does.

This should make it easier to deal with multiple levels of
nesting/quoting. With the same quoting grammar on each level there is less
room for confusion.

Signed-off-by: Michal Suchanek 
---
 lib/cmdline.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/lib/cmdline.c b/lib/cmdline.c
index d98bdc017545..c5335a79a177 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -191,34 +191,45 @@ bool parse_option_str(const char *str, const char *option)
return false;
 }
 
+#define squash_char { \
+   memmove(args + 1, args, i); \
+   args++; \
+   i--; \
+}
+
 /*
  * Parse a string to get a param value pair.
- * You can use " around spaces, and you can escape with \
+ * You can use " or ' around spaces, and you can escape with \
  * Hyphens and underscores equivalent in parameter names.
  */
 char *next_arg(char *args, char **param, char **val)
 {
unsigned int i, equals = 0;
-   int in_quote = 0, backslash = 0;
+   int in_quote = 0, backslash = 0, in_single = 0;
char *next;
 
for (i = 0; args[i]; i++) {
-   if (isspace(args[i]) && !in_quote && !backslash)
+   if (isspace(args[i]) && !in_quote && !backslash && !in_single)
break;
 
if ((equals == 0) && (args[i] == '='))
equals = i;
 
-   if (!backslash) {
-   if ((args[i] == '"') || (args[i] == '\\')) {
+   if (in_single) {
+   if (args[i] == '\'') {
+   in_single = 0;
+   squash_char;
+   }
+   } else if (!backslash) {
+   if ((args[i] == '"') || (args[i] == '\\') ||
+   (args[i] == '\'')) {
if (args[i] == '"')
in_quote = !in_quote;
if (args[i] == '\\')
backslash = 1;
-
-   memmove(args + 1, args, i);
-   args++;
-   i--;
+   if (args[i] == '\'')
+   in_single = 1;
+   squash_char;
}
} else {
backslash = 0;
-- 
2.10.2
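
To make the resulting grammar easier to review, here is a userspace sketch
of the same state machine. It is not the kernel function: it copies into a
separate output buffer instead of squashing characters in place, and the
name dequote_first_arg() is invented for this example. The three states
(double quote, single quote, backslash) and their precedence follow the
hunk above.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Dequote the first whitespace-delimited argument of `in` into `out`
 * (which must hold at least strlen(in) + 1 bytes). Returns the number
 * of input characters consumed, not counting the delimiter.
 */
static size_t dequote_first_arg(const char *in, char *out)
{
	int in_double = 0, in_single = 0, backslash = 0;
	size_t i, o = 0;

	for (i = 0; in[i]; i++) {
		char c = in[i];

		/* Unprotected whitespace ends the argument. */
		if (isspace((unsigned char)c) &&
		    !in_double && !in_single && !backslash)
			break;

		if (in_single) {
			if (c == '\'')
				in_single = 0;	/* closing quote: drop it */
			else
				out[o++] = c;	/* everything else is literal */
		} else if (backslash) {
			backslash = 0;
			out[o++] = c;		/* escaped character is kept */
		} else if (c == '\\') {
			backslash = 1;		/* drop the backslash itself */
		} else if (c == '"') {
			in_double = !in_double;	/* drop the quote */
		} else if (c == '\'') {
			in_single = 1;		/* drop the quote */
		} else {
			out[o++] = c;
		}
	}
	out[o] = '\0';
	return i;
}
```

Note that, as in the patch, a backslash is still an escape inside double
quotes but is an ordinary character inside single quotes, so
param3='@%# !\' yields a value ending in a literal backslash.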



[PATCH 3/6] powerpc/fadump: stop removing quotes in argument parsing.

2017-09-15 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 1678d99ea835..275ea42a27d5 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -494,13 +494,6 @@ static void __init fadump_update_params(struct param_info *param_info,
if (!val)
return;
 
-   /* remove one leading and one trailing quote if both are present */
-   if ((val[0] == '"') && (val[vallen - 1] == '"')) {
-   shortening += 2;
-   vallen -= 2;
-   val++;
-   }
-
strncpy(tgt, FADUMP_EXTRA_ARGS_PARAM, FADUMP_EXTRA_ARGS_LEN);
tgt += FADUMP_EXTRA_ARGS_LEN;
*tgt++ = ' ';
-- 
2.10.2



[PATCH 1/6] lib/cmdline.c: Add backslash support to kernel commandline parsing.

2017-09-15 Thread Michal Suchanek
This allows passing quotes in kernel arguments. It is useful for passing
fadump nested arguments in fadump_extra_args and might be useful if
somebody wanted to pass a double quote directly as part of an argument.

It is also useful to have quoting grammar more similar to shells and
bootloaders.

Signed-off-by: Michal Suchanek 
---
 lib/cmdline.c | 41 -
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/lib/cmdline.c b/lib/cmdline.c
index 6d398a8b63fc..d98bdc017545 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -193,30 +193,36 @@ bool parse_option_str(const char *str, const char *option)
 
 /*
  * Parse a string to get a param value pair.
- * You can use " around spaces, but can't escape ".
+ * You can use " around spaces, and you can escape with \
  * Hyphens and underscores equivalent in parameter names.
  */
 char *next_arg(char *args, char **param, char **val)
 {
unsigned int i, equals = 0;
-   int in_quote = 0, quoted = 0;
+   int in_quote = 0, backslash = 0;
char *next;
 
-   if (*args == '"') {
-   args++;
-   in_quote = 1;
-   quoted = 1;
-   }
-
for (i = 0; args[i]; i++) {
-   if (isspace(args[i]) && !in_quote)
+   if (isspace(args[i]) && !in_quote && !backslash)
break;
-   if (equals == 0) {
-   if (args[i] == '=')
-   equals = i;
+
+   if ((equals == 0) && (args[i] == '='))
+   equals = i;
+
+   if (!backslash) {
+   if ((args[i] == '"') || (args[i] == '\\')) {
+   if (args[i] == '"')
+   in_quote = !in_quote;
+   if (args[i] == '\\')
+   backslash = 1;
+
+   memmove(args + 1, args, i);
+   args++;
+   i--;
+   }
+   } else {
+   backslash = 0;
}
-   if (args[i] == '"')
-   in_quote = !in_quote;
}
 
*param = args;
@@ -225,13 +231,6 @@ char *next_arg(char *args, char **param, char **val)
else {
args[equals] = '\0';
*val = args + equals + 1;
-
-   /* Don't include quotes in value. */
-   if ((args[i-1] == '"') && ((quoted) || (**val == '"'))) {
-   args[i-1] = '\0';
-   if (!quoted)
-   (*val)++;
-   }
}
 
if (args[i]) {
-- 
2.10.2
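
The memmove() in the loop may look odd at first sight, so here is a
standalone sketch of the trick. The helper names squash_char() and
strip_double_quotes() are made up for this example; the three statements
inside squash_char() are the ones from the hunk above. A character is
dropped by shifting the prefix in front of it one byte to the right and
advancing the start pointer, so the tail of the string never moves.

```c
#include <string.h>

/*
 * Drop the character at index *i of the string starting at *args.
 * The *i characters before it are shifted right by one byte and the
 * start pointer is advanced. *i is decremented so that the caller's
 * i++ lands on the character that followed the dropped one. With an
 * unsigned index, dropping at index 0 wraps and the i++ brings it
 * back to 0.
 */
static void squash_char(char **args, unsigned int *i)
{
	memmove(*args + 1, *args, *i);
	(*args)++;
	(*i)--;
}

/* Example user: remove every double quote, return the new start. */
static char *strip_double_quotes(char *args)
{
	unsigned int i;

	for (i = 0; args[i]; i++)
		if (args[i] == '"')
			squash_char(&args, &i);

	return args;
}
```

The cost of each removal is proportional to the length of the prefix,
which is fine for command line arguments, and callers must use the
returned (advanced) pointer rather than the original one.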



[PATCH 6/6] Documentation/admin-guide: single quotes in kernel arguments.

2017-09-15 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 Documentation/admin-guide/kernel-parameters.rst | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index 722d3f771924..1f9837266417 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -35,9 +35,10 @@ can also be entered as::
 
log-buf-len=1M print_fatal_signals=1
 
-Double-quotes and backslashes can be used to protect spaces in values, e.g.::
+Double-quotes, single-quotes and backslashes can be used to protect spaces
+in values, e.g.::
 
-   param="spaces in here" param2=spaces\ in\ here
+   param="spaces in here" param2=spaces\ in\ here param3='@%# !\'
 
 cpu lists:
 --
-- 
2.10.2



[PATCH 2/6] Documentation/admin-guide: backslash support in commandline.

2017-09-15 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 Documentation/admin-guide/kernel-parameters.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index b2598cc9834c..722d3f771924 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -35,9 +35,9 @@ can also be entered as::
 
log-buf-len=1M print_fatal_signals=1
 
-Double-quotes can be used to protect spaces in values, e.g.::
+Double-quotes and backslashes can be used to protect spaces in values, e.g.::
 
-   param="spaces in here"
+   param="spaces in here" param2=spaces\ in\ here
 
 cpu lists:
 --
-- 
2.10.2



[PATCH 4/6] powerpc/fadump: Update fadump documentation on quoting arguments.

2017-09-15 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 Documentation/powerpc/firmware-assisted-dump.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt
index 2df88524d2c7..5705f55ffae4 100644
--- a/Documentation/powerpc/firmware-assisted-dump.txt
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -173,7 +173,7 @@ How to enable firmware-assisted dump (fadump):
can be used to reduce memory consumption during dump capture by
disabling unwarranted resources/subsystems like CPUs, NUMA
and such. Value with spaces can be passed as
-   'fadump_extra_args=""parameter="value with spaces"""'
+   'fadump_extra_args="parameter=\"value with spaces\""'
 4. Optionally, user can also set 'crashkernel=' kernel cmdline
to specify size of the memory to reserve for boot memory dump
preservation.
-- 
2.10.2



[PATCH] powerpc/pseries: include linux/types.h in asm/hvcall.h

2018-01-15 Thread Michal Suchanek
Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush
settings") uses u64 in asm/hvcall.h without including linux/types.h

This breaks hvcall.h users that do not include the header themselves.

Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush
settings")

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/include/asm/hvcall.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index f0461618bf7b..eca3f9c68907 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -353,6 +353,7 @@
 #define PROC_TABLE_GTSE0x01
 
 #ifndef __ASSEMBLY__
+#include <linux/types.h>
 
 /**
  * plpar_hcall_norets: - Make a pseries hypervisor call with no return arguments
-- 
2.13.6



[PATCH 1/2] powerpc/fadump: return 0 on re-registration

2017-06-26 Thread Michal Suchanek
When fadump is already registered return success.

Currently EEXIST is returned which is difficult to handle race-free in
userspace when shell scripts are used. If multiple writers are trying to
write '1' there is no difference in whichever succeeds so just return 0
to all.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 436aedf195ab..5a7355381dac 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1214,7 +1214,6 @@ static ssize_t fadump_register_store(struct kobject *kobj,
break;
case '1':
if (fw_dump.dump_registered == 1) {
-   ret = -EEXIST;
goto unlock_out;
}
/* Register Firmware-assisted dump */
-- 
2.10.2
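
A tiny sketch of the behaviour change, for illustration only. The
function names and the `registered` flag are invented here; the real
handler also takes a mutex and talks to firmware.

```c
#include <errno.h>

/* Old behaviour: the second writer of '1' sees an error. */
static int register_old(int *registered)
{
	if (*registered)
		return -EEXIST;
	*registered = 1;
	return 0;
}

/* New behaviour: registering twice is a no-op that reports success. */
static int register_new(int *registered)
{
	if (*registered)
		return 0;
	*registered = 1;
	return 0;
}
```

With the idempotent variant a script does not need to check the current
state before writing, so there is no check-then-write window to race in.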



[PATCH 2/2] powerpc/fadump: use kstrtoint to handle sysfs store

2017-06-26 Thread Michal Suchanek
Currently sysfs store handlers in fadump use if buf[0] == 'char'.

This means input "100foo" is interpreted as '1' and "01" as '0'.

Change to kstrtoint so leading zeroes and the like are handled in the
expected way.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 5a7355381dac..241eff0b5f76 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1161,10 +1161,15 @@ static ssize_t fadump_release_memory_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
 {
+   int input = -1;
+
if (!fw_dump.dump_active)
return -EPERM;
 
-   if (buf[0] == '1') {
+   if (kstrtoint(buf, 0, &input))
+   return -EINVAL;
+
+   if (input == 1) {
/*
 * Take away the '/proc/vmcore'. We are releasing the dump
 * memory, hence it will not be valid anymore.
@@ -1198,21 +1203,25 @@ static ssize_t fadump_register_store(struct kobject *kobj,
const char *buf, size_t count)
 {
int ret = 0;
+   int input = -1;
 
if (!fw_dump.fadump_enabled || fdm_active)
return -EPERM;
 
+   if (kstrtoint(buf, 0, &input))
+   return -EINVAL;
+
mutex_lock(&fadump_mutex);
 
-   switch (buf[0]) {
-   case '0':
+   switch (input) {
+   case 0:
if (fw_dump.dump_registered == 0) {
goto unlock_out;
}
/* Un-register Firmware-assisted dump */
fadump_unregister_dump();
break;
-   case '1':
+   case 1:
if (fw_dump.dump_registered == 1) {
goto unlock_out;
}
-- 
2.10.2
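
Since kstrtoint() is not available outside the kernel, here is a rough
userspace approximation built on strtol() to show which inputs change
meaning. parse_int_strict() is a made-up name, and it differs from the
kernel helper in details (for example strtol() skips leading whitespace),
so treat it as a sketch rather than a reimplementation.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical userspace approximation of the kernel's kstrtoint(). */
static int parse_int_strict(const char *s, unsigned int base, int *res)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, (int)base);
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (*end == '\n')
		end++;			/* sysfs writes usually end in a newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage such as "100foo" */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;

	*res = (int)v;
	return 0;
}
```

With base 0, "01" is parsed as octal and gives 1 where the old buf[0]
check saw '0', and "100foo" is rejected instead of being treated as '1'.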



[PATCH] powerpc/mm/hash: Remove stale comment.

2017-07-11 Thread Michal Suchanek
In commit e6f81a92015b ("powerpc/mm/hash: Support 68 bit VA") the
masking is folded into ASM_VSID_SCRAMBLE but the comment about masking
is removed only from the first use of ASM_VSID_SCRAMBLE.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/mm/slb_low.S | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index bde378559d01..8e95e01b9e8e 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -296,10 +296,6 @@ slb_compare_rr_to_size:
srdir10,r10,(SID_SHIFT_1T - SID_SHIFT)  /* get 1T ESID */
rldimi  r10,r9,ESID_BITS_1T,0
ASM_VSID_SCRAMBLE(r10,r9,r11,1T)
-   /*
-* bits above VSID_BITS_1T need to be ignored from r10
-* also combine VSID and flags
-*/
 
li  r10,MMU_SEGSIZE_1T
rldimi  r11,r10,SLB_VSID_SSIZE_SHIFT,0  /* insert segment size */
-- 
2.10.2



Re: [PATCH 2/3] mtd: spi-nor: core code for the Altera Quadspi Flash Controller v2

2017-07-04 Thread Michal Suchanek
On 4 July 2017 at 02:00, Cyrille Pitchen  wrote:
> Hi Matthew,
>
>
> Le 26/06/2017 à 18:13, matthew.gerl...@linux.intel.com a écrit :
>> From: Matthew Gerlach 

>> +static int altera_quadspi_setup_banks(struct device *dev,
>> +   u32 bank, struct device_node *np)
>> +{
>> + struct altera_quadspi *q = dev_get_drvdata(dev);
>> + struct altera_quadspi_flash *flash;
>> + struct spi_nor *nor;
>> + int ret = 0;
>> + char modalias[40] = {0};
>> + struct spi_nor_hwcaps hwcaps = {
>> + .mask = SNOR_HWCAPS_READ |
>> + SNOR_HWCAPS_READ_FAST |
>> + SNOR_HWCAPS_READ_1_1_2 |
>> + SNOR_HWCAPS_READ_1_1_4 |
>> + SNOR_HWCAPS_PP,
>> + };
>
> since aletera_quadspi_{read|erase} just don't care about
> nor->read_opcode, nor->program_opcode and so on and anyway override all
> settings chosen by spi-nor.c, it means they will use Dual or Quad SPI
> controllers as they want, whether SNOR_HWCAPS_READ_1_1_{2|4} are set or not.
> Then I think it's risky to declare the READ_1_1_2 and READ_1_1_4 hwcaps
> because it may trigger additionnal calls of nor->read_reg() /
> nor->write_reg() from spi_nor_scan() with op codes not supported by
> altera_quadspi_{read|write}_reg().
>
>> +
>> + if (bank > q->num_flashes - 1)
>> + return -EINVAL;
>> +
>> + altera_quadspi_chip_select(q, bank);
>> +
>> + flash = devm_kzalloc(q->dev, sizeof(*flash), GFP_KERNEL);
>> + if (!flash)
>> + return -ENOMEM;
>> +
>> + q->flash[bank] = flash;
>> + nor = &flash->nor;
>> + nor->dev = dev;
>> + nor->priv = flash;
>> + nor->mtd.priv = nor;
>> + flash->q = q;
>> + flash->bank = bank;
>> + spi_nor_set_flash_node(nor, np);
>> +
>> + /* spi nor framework*/
>> + nor->read_reg = altera_quadspi_read_reg;
>> + nor->write_reg = altera_quadspi_write_reg;
>> + nor->read = altera_quadspi_read;
>> + nor->write = altera_quadspi_write;
>> + nor->erase = altera_quadspi_erase;
>> + nor->flash_lock = altera_quadspi_lock;
>> + nor->flash_unlock = altera_quadspi_unlock;
>
> nor->flash_lock and nor->flash_unlock are described as "FLASH SPECIFIC"
> in include/linux/mtd/spi-nor.h as opposed to "DRIVER SPECIFIC" functions
> like nor->read, nor->read_reg, ...
>
> It means the actual implementations should be provided by the spi-nor
> sub-system but not by each SPI controller driver.
>
>
>
> For me, it really sounds like a bad idea that this driver tries so much
> to mystify the spi-nor sub-system.
>
> I can understand that you have to cope with the hardware design and its
> limitations but clearly it looks the spi-nor API is not suited to this
> hardware. This driver ignores and by-passes any settings selected by
> spi_nor_scan().
> Duplicating code is generally a bad idea but in this case, I don't know
> if trying to reuse spi_nor_read() / spi_nor_write() and spi_nor_erase()
> from spi-nor.c is that helpful.
>
> Why not directly plug your driver into the above mtd layer implementing
> you own version of mtd->_read(), mtd->_write() and mtd->_erase() then
> registering the mtd device? It may be not the way to go but at least we
> should study this alternative.

AFAICT fsl-quadspi does just that, preventing the use of the SPI
controller for non-flash devices.

There is at least one accelerated driver that is passed the opcodes to
program in the controller for read acceleration in spi_flash_read, so
reusing that should be viable. If the opcodes can be programmed or
match what is hardcoded in the controller, use the acceleration, and
fall back to plain SPI transfers if there is a mismatch between what
m25p80_read requests and what the controller can do.

If this works and you can still use plain SPI transfers, the
controller will be much more useful than fsl-quadspi.

Thanks

Michal


[PATCH] s390/decompressor: add fortify_panic as x86 has.

2017-12-07 Thread Michal Suchanek
Fix following error:

  LD  arch/s390/boot/compressed/vmlinux
drivers/s390/char/sclp_early_core.o: In function `memcpy':
../include/linux/string.h:340: undefined reference to `fortify_panic'
make[4]: *** [../arch/s390/boot/compressed/Makefile:29: arch/s390/boot/compressed/vmlinux] Error 1

Fixes: 79962038dffa ("s390: add support for FORTIFY_SOURCE")
Signed-off-by: Michal Suchanek 
---
 arch/s390/boot/compressed/misc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/s390/boot/compressed/misc.c b/arch/s390/boot/compressed/misc.c
index cecf38b9ec82..e79c4499c548 100644
--- a/arch/s390/boot/compressed/misc.c
+++ b/arch/s390/boot/compressed/misc.c
@@ -174,3 +174,7 @@ unsigned long decompress_kernel(void)
return (unsigned long) output;
 }
 
+void fortify_panic(const char *name)
+{
+   error("detected buffer overflow");
+}
-- 
2.13.6



[PATCH v8 6/6] powerpc/fadump: use the new parse_args callback arguments

2017-09-12 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 47 
 1 file changed, 13 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 8778e1cc0380..1678d99ea835 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -481,33 +481,19 @@ struct param_info {
 };
 
 static void __init fadump_update_params(struct param_info *param_info,
-   char *param, char *val)
+   char *param, char *val,
+   char *currant, char *next)
 {
-   ptrdiff_t param_offset = param - param_info->tmp_cmdline;
+   ptrdiff_t param_offset = currant - param_info->tmp_cmdline;
size_t vallen = val ? strlen(val) : 0;
char *tgt = param_info->cmdline + param_offset
- param_info->shortening;
-   int shortening = 0;
-   int quoted = 0;
+   int shortening = ((next - 1) - (currant))
+   - (FADUMP_EXTRA_ARGS_LEN + 1 + vallen);
 
if (!val)
return;
 
-   /* leading '"' removed from parameter */
-   if ((param > param_info->tmp_cmdline) && *(param - 1) == '"') {
-   quoted = 1;
-   shortening += 1;
-   tgt--;
-   }
-
-   /* next_arg removes one leading and one trailing '"' */
-   if ((*(tgt + FADUMP_EXTRA_ARGS_LEN + 1 + vallen + shortening) == '"') &&
-   (quoted || (*(tgt + FADUMP_EXTRA_ARGS_LEN + 1) == '"'))) {
-   shortening += 1;
-   if (!quoted)
-   shortening += 1;
-   }
-
/* remove one leading and one trailing quote if both are present */
if ((val[0] == '"') && (val[vallen - 1] == '"')) {
shortening += 2;
@@ -515,22 +501,15 @@ static void __init fadump_update_params(struct param_info *param_info,
val++;
}
 
-   /* some characters were removed - move the trailing part of cmdline */
-   if (shortening) {
-   char *src;
+   strncpy(tgt, FADUMP_EXTRA_ARGS_PARAM, FADUMP_EXTRA_ARGS_LEN);
+   tgt += FADUMP_EXTRA_ARGS_LEN;
+   *tgt++ = ' ';
+   strncpy(tgt, val, vallen);
+   tgt += vallen;
 
-   strncpy(tgt, FADUMP_EXTRA_ARGS_PARAM, FADUMP_EXTRA_ARGS_LEN);
-   tgt += FADUMP_EXTRA_ARGS_LEN;
-   *tgt++ = ' ';
-
-   strncpy(tgt, val, vallen);
-   tgt += vallen;
-
-   src = tgt + shortening;
+   if (shortening) {
+   char *src = tgt + shortening;
memmove(tgt, src, strlen(src) + 1);
-   } else {
-   /* remove the '=' */
-   *(tgt + FADUMP_EXTRA_ARGS_LEN) = ' ';
}
 
param_info->shortening += shortening;
@@ -550,7 +529,7 @@ static int __init fadump_rework_cmdline_params(char *param, char *val,
 strlen(FADUMP_EXTRA_ARGS_PARAM) - 1))
return 0;
 
-   fadump_update_params(param_info, param, val);
+   fadump_update_params(param_info, param, val, currant, next);
 
return 0;
 }
-- 
2.10.2
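
As a sanity check of the new shortening formula, here is a standalone
sketch with the arithmetic pulled out into a helper. arg_shortening() and
the EXTRA_ARGS_* macros are names invented for this example, and it
assumes that exactly one separator character sits between the end of the
raw argument and `next`.

```c
#include <stddef.h>
#include <string.h>

#define EXTRA_ARGS_NAME	"fadump_extra_args"
#define EXTRA_ARGS_LEN	(sizeof(EXTRA_ARGS_NAME) - 1)

/*
 * current: start of the raw (still quoted) argument in the command line
 * next:    start of the following argument, one separator after the end
 * val:     the dequoted value as handed to the callback
 *
 * Returns how many characters the rewritten "name value" form is
 * shorter than the raw "name=quoted-value" form.
 */
static ptrdiff_t arg_shortening(const char *current, const char *next,
				const char *val)
{
	ptrdiff_t raw_len = (next - 1) - current;
	ptrdiff_t new_len = (ptrdiff_t)(EXTRA_ARGS_LEN + 1 + strlen(val));

	return raw_len - new_len;
}
```

For fadump_extra_args="nr_cpus=1 numa=off" followed by another argument
this gives 2 (the two quotes), and for an unquoted value it gives 0 since
the '=' is merely replaced by a space.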



[PATCH v8 1/6] powerpc/fadump: reduce memory consumption for capture kernel

2017-09-12 Thread Michal Suchanek
With fadump (dump capture) kernel booting like a regular kernel, it needs
almost the same amount of memory to boot as the production kernel, which is
unwarranted for a dump capture kernel. But with no option to disable some
of the unnecessary subsystems in fadump kernel, that much memory is wasted
on fadump, depriving the production kernel of that memory.

Introduce kernel parameter 'fadump_extra_args=' that would take regular
parameters as a space separated quoted string, to be enforced when fadump
is active. This 'fadump_extra_args=' parameter can be leveraged to pass
parameters like nr_cpus=1, cgroup_disable=memory and numa=off, to disable
unwarranted resources/subsystems.

Also, ensure the log "Firmware-assisted dump is active" is printed early
in the boot process to put the subsequent fadump messages in context.

Suggested-by: Michael Ellerman 
Signed-off-by: Hari Bathini 
Signed-off-by: Michal Suchanek 
---
Changes from v6:
Correct and simplify quote handling. Ideally I would like to extend
parse_args to give the length of the original quoted value to callback.
However, parse_args removes at most one double-quote from the start and
one from the end so that is easy to detect. Otherwise all other users
will have to be updated to trash the new argument.
Changes from v7:
Handle leading quote in parameter name.
---
 arch/powerpc/include/asm/fadump.h |   2 +
 arch/powerpc/kernel/fadump.c  | 122 +-
 arch/powerpc/kernel/prom.c|   7 +++
 3 files changed, 128 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index 5a23010af600..41b50b317a67 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -208,12 +208,14 @@ extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
 extern int fadump_reserve_mem(void);
 extern int setup_fadump(void);
+extern void enforce_fadump_extra_args(char *cmdline);
 extern int is_fadump_active(void);
 extern int should_fadump_crash(void);
 extern void crash_fadump(struct pt_regs *, const char *);
 extern void fadump_cleanup(void);
 
 #else  /* CONFIG_FA_DUMP */
+static inline void enforce_fadump_extra_args(char *cmdline) { }
 static inline int is_fadump_active(void) { return 0; }
 static inline int should_fadump_crash(void) { return 0; }
 static inline void crash_fadump(struct pt_regs *regs, const char *str) { }
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index e1431800bfb9..0e08f1a80af2 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -78,8 +78,10 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 * dump data waiting for us.
 */
fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump", NULL);
-   if (fdm_active)
+   if (fdm_active) {
+   pr_info("Firmware-assisted dump is active.\n");
fw_dump.dump_active = 1;
+   }
 
/* Get the sizes required to store dump data for the firmware provided
 * dump sections.
@@ -339,8 +341,11 @@ int __init fadump_reserve_mem(void)
 {
unsigned long base, size, memory_boundary;
 
-   if (!fw_dump.fadump_enabled)
+   if (!fw_dump.fadump_enabled) {
+   if (fw_dump.dump_active)
+   pr_warn("Firmware-assisted dump was active but kernel booted with fadump disabled!\n");
return 0;
+   }
 
if (!fw_dump.fadump_supported) {
printk(KERN_INFO "Firmware-assisted dump is not supported on"
@@ -380,7 +385,6 @@ int __init fadump_reserve_mem(void)
memory_boundary = memblock_end_of_DRAM();
 
if (fw_dump.dump_active) {
-   printk(KERN_INFO "Firmware-assisted dump is active.\n");
/*
 * If last boot has crashed then reserve all the memory
 * above boot_memory_size so that we don't touch it until
@@ -467,6 +471,118 @@ static int __init early_fadump_reserve_mem(char *p)
 }
 early_param("fadump_reserve_mem", early_fadump_reserve_mem);
 
+#define FADUMP_EXTRA_ARGS_PARAM"fadump_extra_args="
+#define FADUMP_EXTRA_ARGS_LEN  (strlen(FADUMP_EXTRA_ARGS_PARAM) - 1)
+
+struct param_info {
+   char*cmdline;
+   char*tmp_cmdline;
+   int  shortening;
+};
+
+static void __init fadump_update_params(struct param_info *param_info,
+   char *param, char *val)
+{
+   ptrdiff_t param_offset = param - param_info->tmp_cmdline;
+   size_t vallen = val ? strlen(val) : 0;
+   char *tgt = param_info->cmdline + param_offset
+   - param_info->shortening;
+   int shortening = 0;
+   int quoted = 0;
+
+   if (!val)
+   r

[PATCH v8 4/6] powerpc/fadump: update the dequoting logic to match lib/cmdline.c

2017-09-12 Thread Michal Suchanek
Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 0e08f1a80af2..b214c1e333dd 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -501,10 +501,12 @@ static void __init fadump_update_params(struct param_info *param_info,
}
 
/* next_arg removes one leading and one trailing '"' */
-   if (*(tgt + FADUMP_EXTRA_ARGS_LEN + 1) == '"')
-   shortening += 1;
-   if (*(tgt + FADUMP_EXTRA_ARGS_LEN + 1 + vallen + shortening) == '"')
+   if ((*(tgt + FADUMP_EXTRA_ARGS_LEN + 1 + vallen + shortening) == '"') &&
+   (quoted || (*(tgt + FADUMP_EXTRA_ARGS_LEN + 1) == '"'))) {
shortening += 1;
+   if (!quoted)
+   shortening += 1;
+   }
 
/* remove one leading and one trailing quote if both are present */
if ((val[0] == '"') && (val[vallen - 1] == '"')) {
-- 
2.10.2



[PATCH v8 5/6] boot/param: add pointer to current and next argument to unknown parameter callback

2017-09-12 Thread Michal Suchanek
The fadump parameter processing re-does the logic of next_arg quote
stripping to determine where the argument ends. Pass pointer to the
current and next argument instead to make this more robust.

Signed-off-by: Michal Suchanek 
---
rebase on master
split off changes to fadump.c
add pointer to current argument to detect shortening of the parameter name
---
 arch/powerpc/kernel/fadump.c |  1 +
 include/linux/moduleparam.h  |  1 +
 init/main.c  |  8 ++--
 kernel/module.c  |  5 +++--
 kernel/params.c  | 20 +---
 lib/dynamic_debug.c  |  1 +
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index b214c1e333dd..8778e1cc0380 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -541,6 +541,7 @@ static void __init fadump_update_params(struct param_info *param_info,
  * to enforce the parameters passed through it
  */
 static int __init fadump_rework_cmdline_params(char *param, char *val,
+  char *currant, char *next,
   const char *unused, void *arg)
 {
struct param_info *param_info = (struct param_info *)arg;
diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 1ee7b30dafec..e86f3f830a7f 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -327,6 +327,7 @@ extern char *parse_args(const char *name,
  s16 level_max,
  void *arg,
  int (*unknown)(char *param, char *val,
+char *currant, char *next,
 const char *doing, void *arg));
 
 /* Called by module remove. */
diff --git a/init/main.c b/init/main.c
index 0ee9c6866ada..9381aa24bca7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -240,6 +240,7 @@ early_param("loglevel", loglevel);
 
 /* Change NUL term back to "=", to make "param" the whole string. */
 static int __init repair_env_string(char *param, char *val,
+   char *unused3, char *unused2,
const char *unused, void *arg)
 {
if (val) {
@@ -258,6 +259,7 @@ static int __init repair_env_string(char *param, char *val,
 
 /* Anything after -- gets handed straight to init. */
 static int __init set_init_arg(char *param, char *val,
+  char *unused3, char *unused2,
   const char *unused, void *arg)
 {
unsigned int i;
@@ -265,7 +267,7 @@ static int __init set_init_arg(char *param, char *val,
if (panic_later)
return 0;
 
-   repair_env_string(param, val, unused, NULL);
+   repair_env_string(param, val, unused3, unused2, unused, NULL);
 
for (i = 0; argv_init[i]; i++) {
if (i == MAX_INIT_ARGS) {
@@ -283,9 +285,10 @@ static int __init set_init_arg(char *param, char *val,
  * unused parameters (modprobe will find them in /proc/cmdline).
  */
 static int __init unknown_bootoption(char *param, char *val,
+char *unused3, char *unused2,
 const char *unused, void *arg)
 {
-   repair_env_string(param, val, unused, NULL);
+   repair_env_string(param, val, unused3, unused2, unused, NULL);
 
/* Handle obsolete-style parameters */
if (obsolete_checksetup(param))
@@ -437,6 +440,7 @@ static noinline void __ref rest_init(void)
 
 /* Check for early params. */
 static int __init do_early_param(char *param, char *val,
+char *unused3, char *unused2,
 const char *unused, void *arg)
 {
const struct obs_kernel_param *p;
diff --git a/kernel/module.c b/kernel/module.c
index 40f983cbea81..0f74718f8934 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3609,8 +3609,9 @@ static int prepare_coming_module(struct module *mod)
return 0;
 }
 
-static int unknown_module_param_cb(char *param, char *val, const char *modname,
-  void *arg)
+static int unknown_module_param_cb(char *param, char *val,
+  char *unused, char *unused2,
+  const char *modname, void *arg)
 {
struct module *mod = arg;
int ret;
diff --git a/kernel/params.c b/kernel/params.c
index 60b2d8101355..c0e0c65f460b 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -119,6 +119,8 @@ static void param_check_unsafe(const struct kernel_param *kp)
 
 static int parse_one(char *param,
 char *val,
+char *currant,
+char *next,
 const char *doing,
 const struct kernel_param *params,
 unsigned num_params,
@@ -126,7 +128,8 @@ st

[PATCH v8 3/6] lib/cmdline.c: Remove quotes symmetrically.

2017-09-12 Thread Michal Suchanek
Remove quotes from argument value only if there is a quote on both sides.

Signed-off-by: Michal Suchanek 
---
 lib/cmdline.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/cmdline.c b/lib/cmdline.c
index 171c19b6888e..6d398a8b63fc 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -227,14 +227,12 @@ char *next_arg(char *args, char **param, char **val)
*val = args + equals + 1;
 
/* Don't include quotes in value. */
-   if (**val == '"') {
-   (*val)++;
-   if (args[i-1] == '"')
-   args[i-1] = '\0';
+   if ((args[i-1] == '"') && ((quoted) || (**val == '"'))) {
+   args[i-1] = '\0';
+   if (!quoted)
+   (*val)++;
}
}
-   if (quoted && args[i-1] == '"')
-   args[i-1] = '\0';
 
if (args[i]) {
args[i] = '\0';
-- 
2.10.2
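
The rule can be shown on a bare value string. This sketch uses an
invented helper, strip_symmetric_quotes(), and only covers the case where
both quotes belong to the value; the `quoted` case in the patch, where the
leading quote sits in front of the parameter name, is not modelled.

```c
#include <string.h>

/*
 * Strip one leading and one trailing double quote from val, but only
 * when both are present. Returns the (possibly advanced) start of the
 * value; the buffer is modified in place.
 */
static char *strip_symmetric_quotes(char *val)
{
	size_t len = strlen(val);

	if (len >= 2 && val[0] == '"' && val[len - 1] == '"') {
		val[len - 1] = '\0';
		return val + 1;
	}
	return val;
}
```

A value with a quote on one side only is left untouched, so the caller
can tell from the length difference whether zero or two characters were
removed.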



[PATCH v8 2/6] powerpc/fadump: update documentation about 'fadump_extra_args=' parameter

2017-09-12 Thread Michal Suchanek
With the introduction of 'fadump_extra_args=' parameter to pass additional
parameters to fadump (capture) kernel, update documentation about it.

Signed-off-by: Hari Bathini 
Signed-off-by: Michal Suchanek 
---
 Documentation/powerpc/firmware-assisted-dump.txt | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt
index bdd344aa18d9..2df88524d2c7 100644
--- a/Documentation/powerpc/firmware-assisted-dump.txt
+++ b/Documentation/powerpc/firmware-assisted-dump.txt
@@ -162,7 +162,19 @@ How to enable firmware-assisted dump (fadump):
 
 1. Set config option CONFIG_FA_DUMP=y and build kernel.
 2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
-3. Optionally, user can also set 'crashkernel=' kernel cmdline
+3. A user can pass additional command line parameters as a space
+   separated quoted list through 'fadump_extra_args=' parameter,
+   to be enforced when fadump is active. For example, parameter
+   'fadump_extra_args="nr_cpus=1 numa=off udev.children-max=2"'
+   will be changed to 'fadump_extra_args nr_cpus=1  numa=off
+   udev.children-max=2' in-place when fadump is active. This
+   parameter has no effect when fadump is not active. Multiple
+   instances of 'fadump_extra_args=' can be passed. This provision
+   can be used to reduce memory consumption during dump capture by
+   disabling unwarranted resources/subsystems like CPUs, NUMA
+   and such. Value with spaces can be passed as
+   'fadump_extra_args=""parameter="value with spaces"""'
+4. Optionally, user can also set 'crashkernel=' kernel cmdline
to specify size of the memory to reserve for boot memory dump
preservation.
 
@@ -172,6 +184,12 @@ NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
   2. If firmware-assisted dump fails to reserve memory then it
  will fallback to existing kdump mechanism if 'crashkernel='
  option is set at kernel cmdline.
+  3. Special parameters like '--' passed inside fadump_extra_args are also
+ just left in-place. So, the user is advised to consider this while
+ specifying such parameters. It may be required to quote the argument
+ to fadump_extra_args when the bootloader uses double-quotes as
+ argument delimiter as well. eg
+append = " fadump_extra_args=\"nr_cpus=1 numa=off udev.children-max=2\""
 
 Sysfs/debugfs files:
 
-- 
2.10.2



[PATCH 2/2] dt-bindings: arm: sunxi: Fix Orange Pi Zero bindings

2020-10-08 Thread Michal Suchanek
There are two models of Orange Pi Zero which are confusingly marketed
under the same name. The old model comes without flash memory and the
current model has flash memory. Add bindings for each model.

Signed-off-by: Michal Suchanek 
---
 Documentation/devicetree/bindings/arm/sunxi.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/sunxi.yaml b/Documentation/devicetree/bindings/arm/sunxi.yaml
index efc9118233b4..7e76ea544bf7 100644
--- a/Documentation/devicetree/bindings/arm/sunxi.yaml
+++ b/Documentation/devicetree/bindings/arm/sunxi.yaml
@@ -864,8 +864,15 @@ properties:
   - const: xunlong,orangepi-win
   - const: allwinner,sun50i-a64
 
+  - description: Xunlong OrangePi Zero  (old model without flash memory)
+items:
+  - const: xunlong,orangepi-zero-no-flash
+  - const: xunlong,orangepi-zero
+  - const: allwinner,sun8i-h2-plus
+
   - description: Xunlong OrangePi Zero
 items:
+  - const: xunlong,orangepi-zero-with-flash
   - const: xunlong,orangepi-zero
   - const: allwinner,sun8i-h2-plus
 
-- 
2.28.0



[PATCH 1/2] ARM: dts: sun8i: h2+: Fix Orange Pi Zero device description.

2020-10-08 Thread Michal Suchanek
There are two models of Orange Pi Zero which are confusingly marketed
under the same name. The old model comes without flash memory and the
current model has flash memory. Build a device tree for each model.

Signed-off-by: Michal Suchanek 
---
 arch/arm/boot/dts/Makefile|   1 +
 .../sun8i-h2-plus-orangepi-zero-no-flash.dts  | 210 ++
 .../boot/dts/sun8i-h2-plus-orangepi-zero.dts  | 201 +
 3 files changed, 215 insertions(+), 197 deletions(-)
 create mode 100644 arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero-no-flash.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 4572db3fa5ae..f2853cea0c9c 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -1168,6 +1168,7 @@ dtb-$(CONFIG_MACH_SUN8I) += \
sun8i-h2-plus-libretech-all-h3-cc.dtb \
sun8i-h2-plus-orangepi-r1.dtb \
sun8i-h2-plus-orangepi-zero.dtb \
+   sun8i-h2-plus-orangepi-zero-no-flash.dtb \
sun8i-h3-bananapi-m2-plus.dtb \
sun8i-h3-bananapi-m2-plus-v1.2.dtb \
sun8i-h3-beelink-x2.dtb \
diff --git a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero-no-flash.dts b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero-no-flash.dts
new file mode 100644
index ..3859b663e3f0
--- /dev/null
+++ b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero-no-flash.dts
@@ -0,0 +1,210 @@
+/*
+ * Copyright (C) 2016 Icenowy Zheng 
+ *
+ * Based on sun8i-h3-orangepi-one.dts, which is:
+ *   Copyright (C) 2016 Hans de Goede 
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This file is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+#include "sun8i-h3.dtsi"
+#include "sunxi-common-regulators.dtsi"
+
+#include 
+#include 
+
+/ {
+   model = "Xunlong Orange Pi Zero (old model without flash memory)";
+   compatible = "xunlong,orangepi-zero-no-flash",
+  "xunlong,orangepi-zero", "allwinner,sun8i-h2-plus";
+
+   aliases {
+   serial0 = 
+   /* ethernet0 is the H3 emac, defined in sun8i-h3.dtsi */
+   ethernet0 = 
+   ethernet1 = 
+   };
+
+   chosen {
+   stdout-path = "serial0:115200n8";
+   };
+
+   leds {
+   compatible = "gpio-leds";
+
+   pwr_led {
+   label = "orangepi:green:pwr";
+   gpios = <_pio 0 10 GPIO_ACTIVE_HIGH>;
+   default-state = "on";
+   };
+
+   status_led {
+   label = "orangepi:red:status";
+   gpios = < 0 17 GPIO_ACTIVE_HIGH>;
+   };
+   };
+
+   reg_vcc_wifi: reg_vcc_wifi {
+   compatible = "regulator-fixed";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-name = "vcc-wifi";
+   enable-active-high;
+   gpio

[PATCH] char: virtio: Select VIRTIO from VIRTIO_CONSOLE.

2020-08-31 Thread Michal Suchanek
Make it possible to have virtio console built-in when
other virtio drivers are modular.

Signed-off-by: Michal Suchanek 
---
 drivers/char/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 3a144c000a38..9bd9917ca9af 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -93,8 +93,9 @@ config PPDEV
 
 config VIRTIO_CONSOLE
tristate "Virtio console"
-   depends on VIRTIO && TTY
+   depends on TTY
select HVC_DRIVER
+   select VIRTIO
help
  Virtio console for use with hypervisors.
 
-- 
2.28.0



[PATCH] Revert "powerpc/64s: machine check interrupt update NMI accounting"

2020-09-15 Thread Michal Suchanek
This reverts commit 116ac378bb3ff844df333e7609e7604651a0db9d.

This commit causes the kernel to oops and reboot when injecting a SLB
multihit which causes a MCE.

Before this commit a SLB multihit was corrected by the kernel and the
system continued to operate normally.

cc: sta...@vger.kernel.org
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting")
Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/mce.c   |  7 ---
 arch/powerpc/kernel/traps.c | 18 +++---
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index ada59f6c4298..2e13528dcc92 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -591,14 +591,10 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
 long notrace machine_check_early(struct pt_regs *regs)
 {
long handled = 0;
-   bool nested = in_nmi();
u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
 
this_cpu_set_ftrace_enabled(0);
 
-   if (!nested)
-   nmi_enter();
-
hv_nmi_check_nonrecoverable(regs);
 
/*
@@ -607,9 +603,6 @@ long notrace machine_check_early(struct pt_regs *regs)
if (ppc_md.machine_check_early)
handled = ppc_md.machine_check_early(regs);
 
-   if (!nested)
-   nmi_exit();
-
this_cpu_set_ftrace_enabled(ftrace_enabled);
 
return handled;
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index d1ebe152f210..7853b770918d 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -827,19 +827,7 @@ void machine_check_exception(struct pt_regs *regs)
 {
int recover = 0;
 
-   /*
-* BOOK3S_64 does not call this handler as a non-maskable interrupt
-* (it uses its own early real-mode handler to handle the MCE proper
-* and then raises irq_work to call this handler when interrupts are
-* enabled).
-*
-* This is silly. The BOOK3S_64 should just call a different function
-* rather than expecting semantics to magically change. Something
-* like 'non_nmi_machine_check_exception()', perhaps?
-*/
-   const bool nmi = !IS_ENABLED(CONFIG_PPC_BOOK3S_64);
-
-   if (nmi) nmi_enter();
+   nmi_enter();
 
__this_cpu_inc(irq_stat.mce_exceptions);
 
@@ -865,7 +853,7 @@ void machine_check_exception(struct pt_regs *regs)
if (check_io_access(regs))
goto bail;
 
-   if (nmi) nmi_exit();
+   nmi_exit();
 
die("Machine check", regs, SIGBUS);
 
@@ -876,7 +864,7 @@ void machine_check_exception(struct pt_regs *regs)
return;
 
 bail:
-   if (nmi) nmi_exit();
+   nmi_exit();
 }
 
 void SMIException(struct pt_regs *regs)
-- 
2.28.0



[PATCH] ibmveth: Fix use of ibmveth in a bridge.

2020-10-26 Thread Michal Suchanek
From: Thomas Bogendoerfer 

The check for the source MAC address in ibmveth_is_packet_unsupported is
wrong. Commit 6f2275433a2f wanted to silence messages for loopback packets,
but it now also suppresses bridged frames, which the hypervisor accepts;
otherwise bridging would not work at all.

Fixes: 6f2275433a2f ("ibmveth: Detect unsupported packets before sending to the hypervisor")
Signed-off-by: Michal Suchanek 
---
ms: added commit message
---
 drivers/net/ethernet/ibm/ibmveth.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index 7ef3369953b6..c3ec9ceed833 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1031,12 +1031,6 @@ static int ibmveth_is_packet_unsupported(struct sk_buff *skb,
ret = -EOPNOTSUPP;
}
 
-   if (!ether_addr_equal(ether_header->h_source, netdev->dev_addr)) {
-   netdev_dbg(netdev, "source packet MAC address does not match veth device's, dropping packet.\n");
-   netdev->stats.tx_dropped++;
-   ret = -EOPNOTSUPP;
-   }
-
return ret;
 }
 
-- 
2.28.0



[PATCH] powerpc: Stop exporting __clear_user which is now inlined.

2020-12-04 Thread Michal Suchanek
Stable commit 452e2a83ea23 ("powerpc: Fix __clear_user() with KUAP
enabled") redefines __clear_user as an inline function but does not
remove the export.

Fixes: 452e2a83ea23 ("powerpc: Fix __clear_user() with KUAP enabled")

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/lib/ppc_ksyms.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/lib/ppc_ksyms.c b/arch/powerpc/lib/ppc_ksyms.c
index c7f8e9586316..4b81fd96aa3e 100644
--- a/arch/powerpc/lib/ppc_ksyms.c
+++ b/arch/powerpc/lib/ppc_ksyms.c
@@ -24,7 +24,6 @@ EXPORT_SYMBOL(csum_tcpudp_magic);
 #endif
 
 EXPORT_SYMBOL(__copy_tofrom_user);
-EXPORT_SYMBOL(__clear_user);
 EXPORT_SYMBOL(copy_page);
 
 #ifdef CONFIG_PPC64
-- 
2.26.2



[PATCH 0/2] Tristate mount option compatibility fixup

2020-11-09 Thread Michal Suchanek
Hello,

after the tristate dax option change some applications fail to detect
pmem devices because the dax option no longer shows in mtab when device
is mounted with -o dax.

At first it might seem stupid to detect pmem by looking at the mount
options.

However, if the application actually wants a mount point properly
configured for dax rather than just backed by pmem I do not see any
other easy way.

Also this happens during early installation steps when the mounted
filesystem is typically empty and you want to perform non-destructive
detection.

If there are better ways to detect dax enabled mount points I want to
hear all about them. In the meantime we have legacy applications to
support.

It also makes sense that when you mount a device with -o dax it actually
shows dax in the mount options. Not doing so is confusing for humans as
well.

Thanks

Michal

Michal Suchanek (2):
  xfs: show the dax option in mount options.
  ext4: show the dax option in mount options

References: bsc#1178366

 fs/ext4/super.c| 2 +-
 fs/xfs/xfs_super.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.26.2
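
To make the compatibility problem concrete, here is a minimal sketch of the kind of check a legacy application might perform on a mount option string taken from mtab. It assumes the application looks for the bare token "dax" in a comma-separated list; the helper name has_mount_option is hypothetical and does not come from any of the affected applications.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the comma-separated mount option
 * string `opts` contains the exact token `name`.
 */
static bool has_mount_option(const char *opts, const char *name)
{
	size_t name_len = strlen(name);

	while (*opts) {
		/* Length of the current token, up to the next comma or end. */
		size_t tok_len = strcspn(opts, ",");

		if (tok_len == name_len && strncmp(opts, name, name_len) == 0)
			return true;
		opts += tok_len;
		if (*opts == ',')
			opts++;		/* step over the separator */
	}
	return false;
}
```

With such a check, an option string ending in "dax=always" does not match "dax", while one ending in "dax,dax=always" does, which is the behaviour the two patches in this series restore.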



[PATCH 2/2] ext4: show the dax option in mount options

2020-11-09 Thread Michal Suchanek
ext4 accepts both dax and dax_always option but shows only dax_always.
Show both options.

Fixes: 9cb20f94afcd ("fs/ext4: Make DAX mount option a tri-state")
Signed-off-by: Michal Suchanek 
---
 fs/ext4/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ef4734b40e2a..7656c519cbe6 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2647,7 +2647,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
if (IS_EXT2_SB(sb))
SEQ_OPTS_PUTS("dax");
else
-   SEQ_OPTS_PUTS("dax=always");
+   SEQ_OPTS_PUTS("dax,dax=always");
} else if (test_opt2(sb, DAX_NEVER)) {
SEQ_OPTS_PUTS("dax=never");
} else if (test_opt2(sb, DAX_INODE)) {
-- 
2.26.2



[PATCH 1/2] xfs: show the dax option in mount options.

2020-11-09 Thread Michal Suchanek
xfs accepts both dax and dax_enum but shows only dax_enum. Show both
options.

Fixes: 8d6c3446ec23 ("fs/xfs: Make DAX mount option a tri-state")
Signed-off-by: Michal Suchanek 
---
 fs/xfs/xfs_super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index e3e229e52512..a3b3840d 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -163,7 +163,7 @@ xfs_fs_show_options(
{ XFS_MOUNT_GRPID,  ",grpid" },
{ XFS_MOUNT_DISCARD,",discard" },
{ XFS_MOUNT_LARGEIO,",largeio" },
-   { XFS_MOUNT_DAX_ALWAYS, ",dax=always" },
+   { XFS_MOUNT_DAX_ALWAYS, ",dax,dax=always" },
{ XFS_MOUNT_DAX_NEVER,  ",dax=never" },
{ 0, NULL }
};
-- 
2.26.2



[PATCH] powerpc/fadump: when fadump is supported register the fadump sysfs files.

2019-08-20 Thread Michal Suchanek
Currently it is not possible to distinguish, using the kernel sysfs
interface, the case when fadump is supported by firmware but disabled in
the kernel from the case when it is completely unsupported. The user can
inspect the devicetree, but it is more reasonable to provide sysfs files
in case we get some fadumpv2 in the future.

With this patch sysfs files are available whenever fadump is supported
by firmware.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/fadump.c | 32 ++--
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 4eab97292cc2..f35ab2433a9b 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1671,13 +1671,9 @@ static void fadump_init_files(void)
  */
 int __init setup_fadump(void)
 {
-   if (!fw_dump.fadump_enabled)
-   return 0;
-
-   if (!fw_dump.fadump_supported) {
+   if (!fw_dump.fadump_supported && fw_dump.fadump_enabled) {
printk(KERN_ERR "Firmware-assisted dump is not supported on"
" this hardware\n");
-   return 0;
}
 
fadump_show_config();
@@ -1685,18 +1681,26 @@ int __init setup_fadump(void)
 * If dump data is available then see if it is valid and prepare for
 * saving it to the disk.
 */
-   if (fw_dump.dump_active) {
+   if (fw_dump.fadump_enabled) {
+   if (fw_dump.dump_active) {
+   /*
+* if dump process fails then invalidate the
+* registration and release memory before proceeding
+* for re-registration.
+*/
+   if (process_fadump(fdm_active) < 0)
+   fadump_invalidate_release_mem();
+   }
/*
-* if dump process fails then invalidate the registration
-* and release memory before proceeding for re-registration.
+* Initialize the kernel dump memory structure for FAD
+* registration.
 */
-   if (process_fadump(fdm_active) < 0)
-   fadump_invalidate_release_mem();
+   else if (fw_dump.reserve_dump_area_size)
+   init_fadump_mem_struct(,
+   fw_dump.reserve_dump_area_start);
}
-   /* Initialize the kernel dump memory structure for FAD registration. */
-   else if (fw_dump.reserve_dump_area_size)
-   init_fadump_mem_struct(, fw_dump.reserve_dump_area_start);
-   fadump_init_files();
+   if (fw_dump.fadump_supported)
+   fadump_init_files();
 
return 1;
 }
-- 
2.22.0
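
The change in when the sysfs files appear can be summarised as a small truth table. The sketch below is only a model of the control flow in the hunk above: the struct and function names are hypothetical, and the two booleans stand in for fw_dump.fadump_supported and fw_dump.fadump_enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two fw_dump flags used by setup_fadump(). */
struct fadump_flags {
	bool supported;	/* firmware offers fadump */
	bool enabled;	/* fadump=on was given on the kernel command line */
};

/* Before the patch: both early returns had to be passed. */
static bool sysfs_registered_before(const struct fadump_flags *f)
{
	return f->enabled && f->supported;
}

/* After the patch: firmware support alone is sufficient. */
static bool sysfs_registered_after(const struct fadump_flags *f)
{
	return f->supported;
}
```

The only combination whose result differs is "supported but not enabled", which is exactly the case the commit message says could not previously be told apart from "unsupported".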



[PATCH rebased] powerpc/fadump: when fadump is supported register the fadump sysfs files.

2019-08-20 Thread Michal Suchanek
Currently it is not possible to distinguish, using the kernel sysfs
interface, the case when fadump is supported by firmware but disabled in
the kernel from the case when it is completely unsupported. The user can
inspect the devicetree, but it is more reasonable to provide sysfs files
in case we get some fadumpv2 in the future.

With this patch sysfs files are available whenever fadump is supported
by firmware.

Signed-off-by: Michal Suchanek 
---
Rebase on top of http://patchwork.ozlabs.org/patch/1150160/
[v5,31/31] powernv/fadump: support holes in kernel boot memory area
---
 arch/powerpc/kernel/fadump.c | 33 ++---
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 4b1bb3c55cf9..7ad424729e9c 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1319,13 +1319,9 @@ static void fadump_init_files(void)
  */
 int __init setup_fadump(void)
 {
-   if (!fw_dump.fadump_enabled)
-   return 0;
-
-   if (!fw_dump.fadump_supported) {
+   if (!fw_dump.fadump_supported && fw_dump.fadump_enabled) {
printk(KERN_ERR "Firmware-assisted dump is not supported on"
" this hardware\n");
-   return 0;
}
 
fadump_show_config();
@@ -1333,19 +1329,26 @@ int __init setup_fadump(void)
 * If dump data is available then see if it is valid and prepare for
 * saving it to the disk.
 */
-   if (fw_dump.dump_active) {
+   if (fw_dump.fadump_enabled) {
+   if (fw_dump.dump_active) {
+   /*
+* if dump process fails then invalidate the
+* registration and release memory before proceeding
+* for re-registration.
+*/
+   if (fw_dump.ops->fadump_process(_dump) < 0)
+   fadump_invalidate_release_mem();
+   }
/*
-* if dump process fails then invalidate the registration
-* and release memory before proceeding for re-registration.
+* Initialize the kernel dump memory structure for FAD
+* registration.
 */
-   if (fw_dump.ops->fadump_process(_dump) < 0)
-   fadump_invalidate_release_mem();
-   }
-   /* Initialize the kernel dump memory structure for FAD registration. */
-   else if (fw_dump.reserve_dump_area_size)
-   fw_dump.ops->fadump_init_mem_struct(_dump);
+   else if (fw_dump.reserve_dump_area_size)
+   fw_dump.ops->fadump_init_mem_struct(_dump);
 
-   fadump_init_files();
+   }
+   if (fw_dump.fadump_supported)
+   fadump_init_files();
 
return 1;
 }
-- 
2.22.0



[PATCH] ARM: dts: sun8i: h2+: Enable optional SPI flash on Orange Pi Zero board

2020-09-29 Thread Michal Suchanek
The flash is present on all new boards and users went out of their way
to add it on the old ones.

Enabling it makes a more reasonable default.

Signed-off-by: Michal Suchanek 
---
 arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
index f19ed981da9d..061d295bbba7 100644
--- a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
+++ b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
@@ -163,8 +163,8 @@  {
 };
 
  {
-   /* Disable SPI NOR by default: it optional on Orange Pi Zero boards */
-   status = "disabled";
+   /* Enable optional SPI NOR by default */
+   status = "okay";
 
flash@0 {
#address-cells = <1>;
-- 
2.28.0



[PATCH] net/ibmvnic: Fix missing { in __ibmvnic_reset

2019-09-09 Thread Michal Suchanek
Commit 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue")
adds a } without a corresponding {, causing a build break.

Fixes: 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue")
Signed-off-by: Michal Suchanek 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 6644cabc8e75..5cb55ea671e3 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1984,7 +1984,7 @@ static void __ibmvnic_reset(struct work_struct *work)
rwi = get_next_rwi(adapter);
while (rwi) {
if (adapter->state == VNIC_REMOVING ||
-   adapter->state == VNIC_REMOVED)
+   adapter->state == VNIC_REMOVED) {
kfree(rwi);
rc = EBUSY;
break;
-- 
2.22.0



[PATCH resend 0/6] Fix cdrom autoclose

2018-01-26 Thread Michal Suchanek
Hello,

  there is a cdrom autoclose feature that is supposed to close the tray, wait
  for the disc to become ready, and then open the device.

  This used to work in ancient times. Then in old times there was a hack in
  util-linux which worked around the breakage which probably resulted from
  switching to scsi emulation.

  Currently the util-linux maintainer refuses to merge another hack on the
  basis that the kernel still has the feature, so it should be fixed there.
  Indeed, to implement this feature effectively from userspace one would need
  to know when the CD-ROM is in the "drive becoming ready" state, which is
  knowledge that never leaves the hardware-specific driver and is passed
  neither to userspace nor to the generic cdrom driver.

  So this patchset fixes the kernel autoclose implementation in cdrom.c and,
  to do so, reports the "drive becoming ready" state from the
  hardware-specific drivers.

The first time I sent the patches I did not get any feedback. I found a
defect in tray_close: it used the status function without checking that it
exists. So I am resending with the defect corrected.

Michal Suchanek (6):
  delay: add poll_event_interruptible
  cdrom: factor out common open_for_* code
  cdrom: wait for tray to close
  cdrom: introduce CDS_DRIVE_ERROR
  Documentation: cdrom: introduce CDS_DRIVE_ERROR
  cdrom: wait for drive to become ready

 Documentation/cdrom/cdrom-standard.tex |   8 ++-
 Documentation/cdrom/ide-cd |   6 ++
 Documentation/ioctl/cdrom.txt  |   1 +
 drivers/block/paride/pcd.c |   2 +-
 drivers/cdrom/cdrom.c  | 124 -
 drivers/cdrom/gdrom.c  |   2 +-
 drivers/ide/ide-cd_ioctl.c |  12 ++--
 drivers/scsi/sr_ioctl.c|   2 +-
 include/linux/delay.h  |  12 
 include/uapi/linux/cdrom.h |   1 +
 10 files changed, 99 insertions(+), 71 deletions(-)

-- 
2.13.6



[PATCH resend 2/6] cdrom: factor out common open_for_* code

2018-01-26 Thread Michal Suchanek
The open_for_audio and open_for_data copies are bitrotten in different
ways already, and the autoclose logic would need to be updated in both.
Factor the common code out into open_for_common.

Signed-off-by: Michal Suchanek 
---
 drivers/cdrom/cdrom.c | 100 ++
 1 file changed, 36 insertions(+), 64 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index e36d160c458f..89746b3d193f 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -1031,12 +1031,12 @@ static void cdrom_count_tracks(struct cdrom_device_info *cdi, tracktype *tracks)
 }
 
 static
-int open_for_data(struct cdrom_device_info *cdi)
+int open_for_common(struct cdrom_device_info *cdi, tracktype *tracks)
 {
int ret;
const struct cdrom_device_ops *cdo = cdi->ops;
-   tracktype tracks;
-   cd_dbg(CD_OPEN, "entering open_for_data\n");
+
+   cd_dbg(CD_OPEN, "entering open_for_common\n");
/* Check if the driver can report drive status.  If it can, we
   can do clever things.  If it can't, well, we at least tried! */
if (cdo->drive_status != NULL) {
@@ -1048,7 +1048,7 @@ int open_for_data(struct cdrom_device_info *cdi)
if (CDROM_CAN(CDC_CLOSE_TRAY) &&
cdi->options & CDO_AUTO_CLOSE) {
cd_dbg(CD_OPEN, "trying to close the tray\n");
-   ret=cdo->tray_move(cdi,0);
+   ret = cdo->tray_move(cdi, 0);
if (ret) {
cd_dbg(CD_OPEN, "bummer. tried to close 
the tray but failed.\n");
/* Ignore the error from the low
@@ -1056,37 +1056,45 @@ int open_for_data(struct cdrom_device_info *cdi)
couldn't close the tray.  We only care 
that there is no disc in the drive, 
since that is the _REAL_ problem here.*/
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
} else {
cd_dbg(CD_OPEN, "bummer. this drive can't close 
the tray.\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
/* Ok, the door should be closed now.. Check again */
ret = cdo->drive_status(cdi, CDSL_CURRENT);
-   if ((ret == CDS_NO_DISC) || (ret==CDS_TRAY_OPEN)) {
+   if ((ret == CDS_NO_DISC) || (ret == CDS_TRAY_OPEN)) {
cd_dbg(CD_OPEN, "bummer. the tray is still not 
closed.\n");
cd_dbg(CD_OPEN, "tray might not contain a 
medium\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
cd_dbg(CD_OPEN, "the tray is now closed\n");
}
-   /* the door should be closed now, check for the disc */
-   ret = cdo->drive_status(cdi, CDSL_CURRENT);
-   if (ret!=CDS_DISC_OK) {
-   ret = -ENOMEDIUM;
-   goto clean_up_and_return;
-   }
+   if (ret != CDS_DISC_OK)
+   return -ENOMEDIUM;
}
-   cdrom_count_tracks(cdi, );
-   if (tracks.error == CDS_NO_DISC) {
+   cdrom_count_tracks(cdi, tracks);
+   if (tracks->error == CDS_NO_DISC) {
cd_dbg(CD_OPEN, "bummer. no disc.\n");
-   ret=-ENOMEDIUM;
-   goto clean_up_and_return;
+   return -ENOMEDIUM;
}
+
+   return 0;
+}
+
+static
+int open_for_data(struct cdrom_device_info *cdi)
+{
+   int ret;
+   const struct cdrom_device_ops *cdo = cdi->ops;
+   tracktype tracks;
+
+   cd_dbg(CD_OPEN, "entering open_for_data\n");
+   ret = open_for_common(cdi, );
+   if (ret)
+   goto clean_up_and_return;
+
/* CD-Players which don't use O_NONBLOCK, workman
 * for example, need bit CDO_CHECK_TYPE cleared! */
if (tracks.data==0) {
@@ -1196,53 +1204,17 @@ int cdrom_open(struct cdrom_device_info *cdi, struct 
block_device *bdev,
 /* This code is similar to that in open_for_data. The routine is called
whenever an audio play operation is requested.
 */
-static int check_for_audio_disc(struct cdrom_device_info *cdi,
-   const struct cdrom_device_ops 

[PATCH resend 6/6] cdrom: wait for drive to become ready

2018-01-27 Thread Michal Suchanek
When the drive closes it can take tens of seconds until the disc is
analyzed. Wait for the drive to become ready or report an error.

Signed-off-by: Michal Suchanek 
---
 drivers/cdrom/cdrom.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 69e85c902373..9994441f5041 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -1087,6 +1087,15 @@ int open_for_common(struct cdrom_device_info *cdi, 
tracktype *tracks)
}
cd_dbg(CD_OPEN, "the tray is now closed\n");
}
+   /* the door should be closed now, check for the disc */
+   if (ret == CDS_DRIVE_NOT_READY) {
+   int poll_res = poll_event_interruptible(
+   CDS_DRIVE_NOT_READY !=
+   (ret = cdo->drive_status(cdi, CDSL_CURRENT)),
+   500);
+   if (poll_res == -ERESTARTSYS)
+   return poll_res;
+   }
if (ret != CDS_DISC_OK)
return -ENOMEDIUM;
}
-- 
2.13.6
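
The wait added above relies on poll_event_interruptible, which is introduced in patch 1/6 and is not shown in this thread excerpt. As a rough userspace analogue, the following sketch polls a condition at a fixed interval until it becomes true or a poll budget is exhausted. All names here (poll_until, ready_fn, fake_drive) are hypothetical, and the sketch does not model signal interruption (the -ERESTARTSYS path in the patch).

```c
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited state is reached. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Check cond() up to max_polls times, sleeping interval_ms between checks.
 * Returns 0 once cond() is true, or -1 if the budget runs out.
 */
static int poll_until(ready_fn cond, void *ctx, long interval_ms, int max_polls)
{
	struct timespec delay = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	for (int i = 0; i < max_polls; i++) {
		if (cond(ctx))
			return 0;
		nanosleep(&delay, NULL);
	}
	return -1;
}

/* Stand-in for a drive that reports "ready" after a few status queries. */
struct fake_drive {
	int polls_until_ready;
};

static bool fake_drive_ready(void *ctx)
{
	struct fake_drive *drive = ctx;

	if (drive->polls_until_ready > 0) {
		drive->polls_until_ready--;
		return false;
	}
	return true;
}
```

Unlike this sketch, the kernel code in the patch has no poll budget: it keeps polling every 500 units of its interval argument until the drive leaves the not-ready state or the wait is interrupted.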



[PATCH resend 4/6] cdrom: introduce CDS_DRIVE_ERROR

2018-01-27 Thread Michal Suchanek
CDS_DRIVE_NOT_READY is used for the state in which CDROM is 'becoming
ready' (typically analyzing the disc) but also as the fallback when
nothing else applies. Introduce CDS_DRIVE_ERROR for the fallback case.

Signed-off-by: Michal Suchanek 
---
 drivers/block/paride/pcd.c |  2 +-
 drivers/cdrom/gdrom.c  |  2 +-
 drivers/ide/ide-cd_ioctl.c | 12 
 drivers/scsi/sr_ioctl.c|  2 +-
 include/uapi/linux/cdrom.h |  1 +
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c
index 7b8c6368beb7..6e00093ff34e 100644
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@ -605,7 +605,7 @@ static int pcd_drive_status(struct cdrom_device_info *cdi, 
int slot_nr)
struct pcd_unit *cd = cdi->handle;
 
if (pcd_ready_wait(cd, PCD_READY_TMO))
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
if (pcd_atapi(cd, rc_cmd, 8, pcd_scratch, DBMSG("check media")))
return CDS_NO_DISC;
return CDS_DISC_OK;
diff --git a/drivers/cdrom/gdrom.c b/drivers/cdrom/gdrom.c
index 6495b03f576c..702f255bbe42 100644
--- a/drivers/cdrom/gdrom.c
+++ b/drivers/cdrom/gdrom.c
@@ -390,7 +390,7 @@ static int gdrom_drivestatus(struct cdrom_device_info 
*cd_info, int ignore)
if (sense == 0)
return CDS_DISC_OK;
if (sense == 0x20)
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
/* default */
return CDS_NO_INFO;
 }
diff --git a/drivers/ide/ide-cd_ioctl.c b/drivers/ide/ide-cd_ioctl.c
index 2acca12b9c94..9a26f50a2092 100644
--- a/drivers/ide/ide-cd_ioctl.c
+++ b/drivers/ide/ide-cd_ioctl.c
@@ -62,9 +62,13 @@ int ide_cdrom_drive_status(struct cdrom_device_info *cdi, 
int slot_nr)
return CDS_NO_DISC;
}
 
-   if (sense.sense_key == NOT_READY && sense.asc == 0x04
-   && sense.ascq == 0x04)
-   return CDS_DISC_OK;
+   if (sense.sense_key == NOT_READY && sense.asc == 0x04)
+   switch (sense.ascq) {
+   case 0x01:
+   return CDS_DRIVE_NOT_READY;
+   case 0x04:
+   return CDS_DISC_OK;
+   }
 
/*
 * If not using Mt Fuji extended media tray reports,
@@ -77,7 +81,7 @@ int ide_cdrom_drive_status(struct cdrom_device_info *cdi, int 
slot_nr)
else
return CDS_TRAY_OPEN;
}
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
 }
 
 /*
diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
index 2a21f2d48592..7c93f12a9cb8 100644
--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -333,7 +333,7 @@ int sr_drive_status(struct cdrom_device_info *cdi, int slot)
else
return CDS_TRAY_OPEN;
 
-   return CDS_DRIVE_NOT_READY;
+   return CDS_DRIVE_ERROR;
 }
 
 int sr_disk_status(struct cdrom_device_info *cdi)
diff --git a/include/uapi/linux/cdrom.h b/include/uapi/linux/cdrom.h
index 2817230148fd..339b1435f44e 100644
--- a/include/uapi/linux/cdrom.h
+++ b/include/uapi/linux/cdrom.h
@@ -398,6 +398,7 @@ struct cdrom_generic_command
 #define CDS_TRAY_OPEN          2
 #define CDS_DRIVE_NOT_READY    3
 #define CDS_DISC_OK            4
+#define CDS_DRIVE_ERROR        5
 
 /* return values for the CDROM_DISC_STATUS ioctl */
 /* can also return CDS_NO_[INFO|DISC], from above */
-- 
2.13.6
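
A minimal sketch of the sense classification the ide-cd hunk above introduces, written as a standalone function so it can be read outside the driver. The helper name classify_not_ready and the SENSE_KEY_NOT_READY macro are made up for illustration. The real ide_cdrom_drive_status() runs further tray and media checks before it reaches the CDS_DRIVE_ERROR fallback, and those are left out here.

```c
#include <stdint.h>

/* Status values as in include/uapi/linux/cdrom.h after this patch. */
enum {
	CDS_NO_INFO = 0,
	CDS_NO_DISC = 1,
	CDS_TRAY_OPEN = 2,
	CDS_DRIVE_NOT_READY = 3,
	CDS_DISC_OK = 4,
	CDS_DRIVE_ERROR = 5,	/* added by this series */
};

/* SCSI sense key "NOT READY"; the macro name is an assumption. */
#define SENSE_KEY_NOT_READY 0x02

/*
 * Hypothetical helper: map a sense triple to a drive status.
 * Only the NOT READY / ASC 0x04 case from the patch is modelled;
 * everything else falls through to the new error code.
 */
static int classify_not_ready(uint8_t key, uint8_t asc, uint8_t ascq)
{
	if (key == SENSE_KEY_NOT_READY && asc == 0x04) {
		switch (ascq) {
		case 0x01:	/* drive is becoming ready */
			return CDS_DRIVE_NOT_READY;
		case 0x04:	/* the driver treats this as a usable disc */
			return CDS_DISC_OK;
		}
	}
	return CDS_DRIVE_ERROR;
}
```

With this split, CDS_DRIVE_NOT_READY means only "try again shortly", which is what a caller that waits for the drive needs.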



[PATCH resend 3/6] cdrom: wait for tray to close

2018-01-27 Thread Michal Suchanek
The SCSI command to close the tray only starts the motor and does not wait
for the tray to close. Wait until the state changes from TRAY_OPEN so
users do not race with the tray closing.

This looks like an infinite wait, but unless the drive is broken it either
closes the tray within a few seconds or reports an error when it detects
that the tray is blocked. At worst the wait can be interrupted by the user.

Signed-off-by: Michal Suchanek 
---
v2:
 - check drive_status exists before using it
 - rename tray_close -> cdrom_tray_close
---
 drivers/cdrom/cdrom.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 89746b3d193f..69e85c902373 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -281,7 +281,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 
 /* used to tell the module to turn on full debugging messages */
@@ -1030,6 +1032,18 @@ static void cdrom_count_tracks(struct cdrom_device_info 
*cdi, tracktype *tracks)
   tracks->cdi, tracks->xa);
 }
 
+static int cdrom_tray_close(struct cdrom_device_info *cdi)
+{
+   int ret;
+
+   ret = cdi->ops->tray_move(cdi, 0);
+   if (ret || !cdi->ops->drive_status)
+   return ret;
+
+   return poll_event_interruptible(CDS_TRAY_OPEN !=
+   cdi->ops->drive_status(cdi, CDSL_CURRENT), 500);
+}
+
 static
 int open_for_common(struct cdrom_device_info *cdi, tracktype *tracks)
 {
@@ -1048,7 +1062,9 @@ int open_for_common(struct cdrom_device_info *cdi, 
tracktype *tracks)
if (CDROM_CAN(CDC_CLOSE_TRAY) &&
cdi->options & CDO_AUTO_CLOSE) {
cd_dbg(CD_OPEN, "trying to close the tray\n");
-   ret = cdo->tray_move(cdi, 0);
+   ret = cdrom_tray_close(cdi);
+   if (ret == -ERESTARTSYS)
+   return ret;
if (ret) {
cd_dbg(CD_OPEN, "bummer. tried to close 
the tray but failed.\n");
/* Ignore the error from the low
@@ -2312,7 +2328,8 @@ static int cdrom_ioctl_closetray(struct cdrom_device_info 
*cdi)
 
if (!CDROM_CAN(CDC_CLOSE_TRAY))
return -ENOSYS;
-   return cdi->ops->tray_move(cdi, 0);
+
+   return cdrom_tray_close(cdi);
 }
 
 static int cdrom_ioctl_eject_sw(struct cdrom_device_info *cdi,
-- 
2.13.6



[PATCH resend 1/6] delay: add poll_event_interruptible

2018-01-27 Thread Michal Suchanek
Add convenience macro for polling an event that does not have a
waitqueue.

Signed-off-by: Michal Suchanek 
---
 include/linux/delay.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index b78bab4395d8..3ae9fa395628 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -64,4 +64,16 @@ static inline void ssleep(unsigned int seconds)
msleep(seconds * 1000);
 }
 
+#define poll_event_interruptible(event, interval) ({ \
+   int ret = 0; \
+   while (!(event)) { \
+   if (signal_pending(current)) { \
+   ret = -ERESTARTSYS; \
+   break; \
+   } \
+   msleep_interruptible(interval); \
+   } \
+   ret; \
+})
+
 #endif /* defined(_LINUX_DELAY_H) */
-- 
2.13.6
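
For readers who want to try the polling pattern outside the kernel, here is a rough userspace analogue of poll_event_interruptible. It is a sketch under assumptions, not the kernel macro: poll_until, the interrupted flag, and third_time are illustrative names, the flag stands in for signal_pending(current), and -EINTR replaces -ERESTARTSYS, which is not meant to be seen in userspace.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Raised from a signal handler; stands in for signal_pending(current). */
static volatile sig_atomic_t interrupted;

/*
 * Hypothetical analogue of poll_event_interruptible(): call cond(arg)
 * until it returns nonzero, sleeping interval_ms between checks.
 * Returns 0 once the condition holds, or -EINTR if the interrupted
 * flag was seen first.
 */
static int poll_until(int (*cond)(void *), void *arg, long interval_ms)
{
	struct timespec ts = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	while (!cond(arg)) {
		if (interrupted)
			return -EINTR;
		nanosleep(&ts, NULL);	/* returns early when a signal arrives */
	}
	return 0;
}

/* Example condition: becomes true on the third check. */
static int third_time(void *arg)
{
	int *calls = arg;

	return ++*calls >= 3;
}
```

As in the macro, the condition is checked before the interrupt flag, so an event that is already true is reported as success even when a signal is pending.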



[PATCH resend 5/6] Documentation: cdrom: introduce CDS_DRIVE_ERROR

2018-01-27 Thread Michal Suchanek
CDS_DRIVE_NOT_READY is used for the state in which CDROM is 'becoming
ready' (typically analyzing the disc) but also as the fallback when
nothing else applies. Introduce CDS_DRIVE_ERROR for the fallback case.

Signed-off-by: Michal Suchanek 
---
 Documentation/cdrom/cdrom-standard.tex | 8 +++-
 Documentation/cdrom/ide-cd | 6 ++
 Documentation/ioctl/cdrom.txt  | 1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/cdrom/cdrom-standard.tex 
b/Documentation/cdrom/cdrom-standard.tex
index 8f85b0e41046..018284ba696a 100644
--- a/Documentation/cdrom/cdrom-standard.tex
+++ b/Documentation/cdrom/cdrom-standard.tex
@@ -371,11 +371,17 @@ $$
 CDS_NO_INFO& no information available\cr
 CDS_NO_DISC& no disc is inserted, tray is closed\cr
 CDS_TRAY_OPEN& tray is opened\cr
-CDS_DRIVE_NOT_READY& something is wrong, tray is moving?\cr
+CDS_DRIVE_NOT_READY& tray just closed?\cr
 CDS_DISC_OK& a disc is loaded and everything is fine\cr
+CDS_DRIVE_ERROR& something is wrong\cr
 }
 $$
 
+Note: The IDE and SCSI cdroms have a status code 'drive becoming ready' which
+is typically returned when the drive has just closed and is analyzing the disc.
+For other cdrom types this state is not reported by the hardware or not
+implemented by the driver.
+
 \subsection{$Int\ media_changed(struct\ cdrom_device_info * cdi, int\ 
disc_nr)$}
 
 This function is very similar to the original function in $struct\ 
diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd
index a5f2a7f1ff46..9324a8fd9a39 100644
--- a/Documentation/cdrom/ide-cd
+++ b/Documentation/cdrom/ide-cd
@@ -455,6 +455,9 @@ main (int argc, char **argv)
case CDS_DRIVE_NOT_READY:
printf ("Drive Not Ready.\n");
break;
+   case CDS_DRIVE_ERROR:
+   printf ("Drive problem.\n");
+   break;
default:
printf ("This Should not happen!\n");
break;
@@ -481,6 +484,9 @@ main (int argc, char **argv)
case CDS_NO_INFO:
printf ("No Information available.");
break;
+   case CDS_DRIVE_ERROR:
+   printf ("Drive problem.\n");
+   break;
default:
printf ("This Should not happen!\n");
break;
diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt
index a4d62a9d6771..7720d11807c3 100644
--- a/Documentation/ioctl/cdrom.txt
+++ b/Documentation/ioctl/cdrom.txt
@@ -700,6 +700,7 @@ CDROM_DRIVE_STATUS  Get tray position, etc.
CDS_TRAY_OPEN
CDS_DRIVE_NOT_READY
CDS_DISC_OK
+   CDS_DRIVE_ERROR
-1  error
 
error returns:
-- 
2.13.6
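
To show what the new code buys a caller of CDROM_DRIVE_STATUS, here is a small sketch of the decision a userspace tool could make once "becoming ready" and "something is wrong" are distinct values. The enum drive_action and the function next_action are hypothetical names. The constants are defined locally, guarded by #ifndef, because CDS_DRIVE_ERROR is only proposed by this series and may be missing from installed headers.

```c
/* Values from include/uapi/linux/cdrom.h; CDS_DRIVE_ERROR is proposed here. */
#ifndef CDS_NO_INFO
#define CDS_NO_INFO		0
#define CDS_NO_DISC		1
#define CDS_TRAY_OPEN		2
#define CDS_DRIVE_NOT_READY	3
#define CDS_DISC_OK		4
#endif
#ifndef CDS_DRIVE_ERROR
#define CDS_DRIVE_ERROR		5
#endif

/* Hypothetical caller-side actions. */
enum drive_action {
	ACT_USE_DISC,	/* disc is ready, go ahead */
	ACT_RETRY,	/* transient state, poll again later */
	ACT_GIVE_UP,	/* no disc, open tray, or a drive problem */
};

/* Decide what to do with a status returned by CDROM_DRIVE_STATUS. */
static enum drive_action next_action(int status)
{
	switch (status) {
	case CDS_DISC_OK:
		return ACT_USE_DISC;
	case CDS_DRIVE_NOT_READY:
		/* After this series this only means "becoming ready". */
		return ACT_RETRY;
	case CDS_DRIVE_ERROR:
	case CDS_NO_DISC:
	case CDS_TRAY_OPEN:
	default:
		return ACT_GIVE_UP;
	}
}
```

Before the change, a caller retrying on CDS_DRIVE_NOT_READY could spin forever on a drive that had returned the value only as a fallback.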



[PATCH v5 4/5] powerpc/64: Make COMPAT user-selectable disabled on littleendian by default.

2019-08-29 Thread Michal Suchanek
On bigendian ppc64 it is common to have 32bit legacy binaries but much
less so on littleendian.

Signed-off-by: Michal Suchanek 
---
v3: make configurable
---
 arch/powerpc/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5bab0bb6b833..b0339e892329 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -264,8 +264,9 @@ config PANIC_TIMEOUT
default 180
 
 config COMPAT
-   bool
-   default y if PPC64
+   bool "Enable support for 32bit binaries"
+   depends on PPC64
+   default y if !CPU_LITTLE_ENDIAN
select COMPAT_BINFMT_ELF
select ARCH_WANT_OLD_COMPAT_IPC
select COMPAT_OLD_SIGACTION
-- 
2.22.0



[PATCH v5 5/5] powerpc/perf: split callchain.c by bitness

2019-08-29 Thread Michal Suchanek
Building callchain.c with !COMPAT proved quite ugly with all the
defines. Splitting out the 32bit and 64bit parts looks better.

Also rewrite current_is_64bit as common function. No other code change
intended.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/perf/Makefile   |   4 +
 arch/powerpc/perf/callchain.c| 388 +--
 arch/powerpc/perf/callchain.h|  11 +
 arch/powerpc/perf/callchain_32.c | 218 +
 arch/powerpc/perf/callchain_64.c | 185 +++
 5 files changed, 422 insertions(+), 384 deletions(-)
 create mode 100644 arch/powerpc/perf/callchain.h
 create mode 100644 arch/powerpc/perf/callchain_32.c
 create mode 100644 arch/powerpc/perf/callchain_64.c

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index c155dcbb8691..e9f3202251d0 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -1,6 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_PERF_EVENTS)  += callchain.o perf_regs.o
+ifdef CONFIG_PERF_EVENTS
+obj-y  += callchain_$(BITS).o
+obj-$(CONFIG_COMPAT)   += callchain_32.o
+endif
 
 obj-$(CONFIG_PPC_PERF_CTRS)+= core-book3s.o bhrb.o
 obj64-$(CONFIG_PPC_PERF_CTRS)  += ppc970-pmu.o power5-pmu.o \
diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index 881be5c4e9bb..981005625c05 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -15,11 +15,9 @@
 #include 
 #include 
 #include 
-#ifdef CONFIG_COMPAT
-#include "../kernel/ppc32.h"
-#endif
 #include 
 
+#include "callchain.h"
 
 /*
  * Is sp valid as the address of the next kernel stack frame after prev_sp?
@@ -102,188 +100,6 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx 
*entry, struct pt_regs *re
}
 }
 
-#ifdef CONFIG_PPC64
-/*
- * On 64-bit we don't want to invoke hash_page on user addresses from
- * interrupt context, so if the access faults, we read the page tables
- * to find which page (if any) is mapped and access it directly.
- */
-static int read_user_stack_slow(void __user *ptr, void *buf, int nb)
-{
-   int ret = -EFAULT;
-   pgd_t *pgdir;
-   pte_t *ptep, pte;
-   unsigned shift;
-   unsigned long addr = (unsigned long) ptr;
-   unsigned long offset;
-   unsigned long pfn, flags;
-   void *kaddr;
-
-   pgdir = current->mm->pgd;
-   if (!pgdir)
-   return -EFAULT;
-
-   local_irq_save(flags);
-   ptep = find_current_mm_pte(pgdir, addr, NULL, &shift);
-   if (!ptep)
-   goto err_out;
-   if (!shift)
-   shift = PAGE_SHIFT;
-
-   /* align address to page boundary */
-   offset = addr & ((1UL << shift) - 1);
-
-   pte = READ_ONCE(*ptep);
-   if (!pte_present(pte) || !pte_user(pte))
-   goto err_out;
-   pfn = pte_pfn(pte);
-   if (!page_is_ram(pfn))
-   goto err_out;
-
-   /* no highmem to worry about here */
-   kaddr = pfn_to_kaddr(pfn);
-   memcpy(buf, kaddr + offset, nb);
-   ret = 0;
-err_out:
-   local_irq_restore(flags);
-   return ret;
-}
-
-static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret)
-{
-   if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned long) ||
-   ((unsigned long)ptr & 7))
-   return -EFAULT;
-
-   pagefault_disable();
-   if (!__get_user_inatomic(*ret, ptr)) {
-   pagefault_enable();
-   return 0;
-   }
-   pagefault_enable();
-
-   return read_user_stack_slow(ptr, ret, 8);
-}
-
-static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
-{
-   if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) ||
-   ((unsigned long)ptr & 3))
-   return -EFAULT;
-
-   pagefault_disable();
-   if (!__get_user_inatomic(*ret, ptr)) {
-   pagefault_enable();
-   return 0;
-   }
-   pagefault_enable();
-
-   return read_user_stack_slow(ptr, ret, 4);
-}
-
-static inline int valid_user_sp(unsigned long sp, int is_64)
-{
-   if (!sp || (sp & 7) || sp > (is_64 ? TASK_SIZE : 0x100000000UL) - 32)
-   return 0;
-   return 1;
-}
-
-/*
- * 64-bit user processes use the same stack frame for RT and non-RT signals.
- */
-struct signal_frame_64 {
-   char            dummy[__SIGNAL_FRAMESIZE];
-   struct ucontext uc;
-   unsigned long   unused[2];
-   unsigned int    tramp[6];
-   struct siginfo  *pinfo;
-   void            *puc;
-   struct siginfo  info;
-   char            abigap[288];
-};
-
-static int is_sigreturn_64_address(unsigned long nip, unsigned long fp)
-{
-   if (nip == fp + offsetof(struct signal_frame_64, tramp))
-   return 1;
-   if (vdso64_rt_sigtramp && current->mm->context.vdso_base &&
-   nip == current->mm-&

[PATCH v5 2/5] powerpc: move common register copy functions from signal_32.c to signal.c

2019-08-29 Thread Michal Suchanek
These functions are required for 64bit as well.

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/signal.c| 141 
 arch/powerpc/kernel/signal_32.c | 140 ---
 2 files changed, 141 insertions(+), 140 deletions(-)

diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index e6c30cee6abf..60436432399f 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -18,12 +18,153 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 #include "signal.h"
 
+#ifdef CONFIG_VSX
+unsigned long copy_fpr_to_user(void __user *to,
+  struct task_struct *task)
+{
+   u64 buf[ELF_NFPREG];
+   int i;
+
+   /* save FPR copy to local buffer then write to the thread_struct */
+   for (i = 0; i < (ELF_NFPREG - 1) ; i++)
+   buf[i] = task->thread.TS_FPR(i);
+   buf[i] = task->thread.fp_state.fpscr;
+   return __copy_to_user(to, buf, ELF_NFPREG * sizeof(double));
+}
+
+unsigned long copy_fpr_from_user(struct task_struct *task,
+void __user *from)
+{
+   u64 buf[ELF_NFPREG];
+   int i;
+
+   if (__copy_from_user(buf, from, ELF_NFPREG * sizeof(double)))
+   return 1;
+   for (i = 0; i < (ELF_NFPREG - 1) ; i++)
+   task->thread.TS_FPR(i) = buf[i];
+   task->thread.fp_state.fpscr = buf[i];
+
+   return 0;
+}
+
+unsigned long copy_vsx_to_user(void __user *to,
+  struct task_struct *task)
+{
+   u64 buf[ELF_NVSRHALFREG];
+   int i;
+
+   /* save FPR copy to local buffer then write to the thread_struct */
+   for (i = 0; i < ELF_NVSRHALFREG; i++)
+   buf[i] = task->thread.fp_state.fpr[i][TS_VSRLOWOFFSET];
+   return __copy_to_user(to, buf, ELF_NVSRHALFREG * sizeof(double));
+}
+
+unsigned long copy_vsx_from_user(struct task_struct *task,
+void __user *from)
+{
+   u64 buf[ELF_NVSRHALFREG];
+   int i;
+
+   if (__copy_from_user(buf, from, ELF_NVSRHALFREG * sizeof(double)))
+   return 1;
+   for (i = 0; i < ELF_NVSRHALFREG ; i++)
+   task->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = buf[i];
+   return 0;
+}
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+unsigned long copy_ckfpr_to_user(void __user *to,
+ struct task_struct *task)
+{
+   u64 buf[ELF_NFPREG];
+   int i;
+
+   /* save FPR copy to local buffer then write to the thread_struct */
+   for (i = 0; i < (ELF_NFPREG - 1) ; i++)
+   buf[i] = task->thread.TS_CKFPR(i);
+   buf[i] = task->thread.ckfp_state.fpscr;
+   return __copy_to_user(to, buf, ELF_NFPREG * sizeof(double));
+}
+
+unsigned long copy_ckfpr_from_user(struct task_struct *task,
+ void __user *from)
+{
+   u64 buf[ELF_NFPREG];
+   int i;
+
+   if (__copy_from_user(buf, from, ELF_NFPREG * sizeof(double)))
+   return 1;
+   for (i = 0; i < (ELF_NFPREG - 1) ; i++)
+   task->thread.TS_CKFPR(i) = buf[i];
+   task->thread.ckfp_state.fpscr = buf[i];
+
+   return 0;
+}
+
+unsigned long copy_ckvsx_to_user(void __user *to,
+ struct task_struct *task)
+{
+   u64 buf[ELF_NVSRHALFREG];
+   int i;
+
+   /* save FPR copy to local buffer then write to the thread_struct */
+   for (i = 0; i < ELF_NVSRHALFREG; i++)
+   buf[i] = task->thread.ckfp_state.fpr[i][TS_VSRLOWOFFSET];
+   return __copy_to_user(to, buf, ELF_NVSRHALFREG * sizeof(double));
+}
+
+unsigned long copy_ckvsx_from_user(struct task_struct *task,
+ void __user *from)
+{
+   u64 buf[ELF_NVSRHALFREG];
+   int i;
+
+   if (__copy_from_user(buf, from, ELF_NVSRHALFREG * sizeof(double)))
+   return 1;
+   for (i = 0; i < ELF_NVSRHALFREG ; i++)
+   task->thread.ckfp_state.fpr[i][TS_VSRLOWOFFSET] = buf[i];
+   return 0;
+}
+#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#else
+inline unsigned long copy_fpr_to_user(void __user *to,
+ struct task_struct *task)
+{
+   return __copy_to_user(to, task->thread.fp_state.fpr,
+ ELF_NFPREG * sizeof(double));
+}
+
+inline unsigned long copy_fpr_from_user(struct task_struct *task,
+   void __user *from)
+{
+   return __copy_from_user(task->thread.fp_state.fpr, from,
+ ELF_NFPREG * sizeof(double));
+}
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+inline unsigned long copy_ckfpr_to_user(void __user *to,
+struct task_struct *task)
+{
+   return __copy_to_user(to, task->thread.ckfp_state.fpr,
+  

[PATCH v5 3/5] powerpc/64: make buildable without CONFIG_COMPAT

2019-08-29 Thread Michal Suchanek
There are numerous references to 32bit functions in generic and 64bit
code so ifdef them out.

Signed-off-by: Michal Suchanek 
---
v2:
- fix 32bit ifdef condition in signal.c
- simplify the compat ifdef condition in vdso.c - 64bit is redundant
- simplify the compat ifdef condition in callchain.c - 64bit is redundant
v3:
- use IS_ENABLED and maybe_unused where possible
- do not ifdef declarations
- clean up Makefile
v4:
- further makefile cleanup
- simplify is_32bit_task conditions
- avoid ifdef in condition by using return
v5:
- avoid unreachable code on 32bit
- make is_current_64bit constant on !COMPAT
- add stub perf_callchain_user_32 to avoid some ifdefs
---
 arch/powerpc/include/asm/thread_info.h |  4 ++--
 arch/powerpc/kernel/Makefile   |  7 +++
 arch/powerpc/kernel/entry_64.S |  2 ++
 arch/powerpc/kernel/signal.c   |  3 +--
 arch/powerpc/kernel/syscall_64.c   |  6 ++
 arch/powerpc/kernel/vdso.c |  5 ++---
 arch/powerpc/perf/callchain.c  | 13 +++--
 7 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/thread_info.h 
b/arch/powerpc/include/asm/thread_info.h
index 8e1d0195ac36..c128d8a48ea3 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -144,10 +144,10 @@ static inline bool test_thread_local_flags(unsigned int 
flags)
return (ti->local_flags & flags) != 0;
 }
 
-#ifdef CONFIG_PPC64
+#ifdef CONFIG_COMPAT
 #define is_32bit_task()        (test_thread_flag(TIF_32BIT))
 #else
-#define is_32bit_task()        (1)
+#define is_32bit_task()        (IS_ENABLED(CONFIG_PPC32))
 #endif
 
 #if defined(CONFIG_PPC64)
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1d646a94d96c..9d8772e863b9 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -44,16 +44,15 @@ CFLAGS_btext.o += -DDISABLE_BRANCH_PROFILING
 endif
 
 obj-y  := cputable.o ptrace.o syscalls.o \
-  irq.o align.o signal_32.o pmc.o vdso.o \
+  irq.o align.o signal_$(BITS).o pmc.o vdso.o \
   process.o systbl.o idle.o \
   signal.o sysfs.o cacheinfo.o time.o \
   prom.o traps.o setup-common.o \
   udbg.o misc.o io.o misc_$(BITS).o \
   of_platform.o prom_parse.o
-obj-$(CONFIG_PPC64)+= setup_64.o sys_ppc32.o \
-  signal_64.o ptrace32.o \
-  paca.o nvram_64.o firmware.o \
+obj-$(CONFIG_PPC64)+= setup_64.o paca.o nvram_64.o firmware.o \
   syscall_64.o
+obj-$(CONFIG_COMPAT)   += sys_ppc32.o ptrace32.o signal_32.o
 obj-$(CONFIG_VDSO32)   += vdso32/
 obj-$(CONFIG_PPC_WATCHDOG) += watchdog.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)   += hw_breakpoint.o
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2ec825a85f5b..a2dbf216f607 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -51,8 +51,10 @@
 SYS_CALL_TABLE:
.tc sys_call_table[TC],sys_call_table
 
+#ifdef CONFIG_COMPAT
 COMPAT_SYS_CALL_TABLE:
.tc compat_sys_call_table[TC],compat_sys_call_table
+#endif
 
 /* This value is used to mark exception frames on the stack. */
 exception_marker:
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 60436432399f..61678cb0e6a1 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -247,7 +247,6 @@ static void do_signal(struct task_struct *tsk)
sigset_t *oldset = sigmask_to_save();
struct ksignal ksig = { .sig = 0 };
int ret;
-   int is32 = is_32bit_task();
 
BUG_ON(tsk != current);
 
@@ -277,7 +276,7 @@ static void do_signal(struct task_struct *tsk)
 
rseq_signal_deliver(&ksig, tsk->thread.regs);
 
-   if (is32) {
+   if (is_32bit_task()) {
if (ksig.ka.sa.sa_flags & SA_SIGINFO)
ret = handle_rt_signal32(&ksig, oldset, tsk);
else
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 98ed970796d5..0d5cbbe54cf1 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -38,7 +38,6 @@ typedef long (*syscall_fn)(long, long, long, long, long, 
long);
 
 long system_call_exception(long r3, long r4, long r5, long r6, long r7, long 
r8, unsigned long r0, struct pt_regs *regs)
 {
-   unsigned long ti_flags;
syscall_fn f;
 
BUG_ON(!(regs->msr & MSR_PR));
@@ -83,8 +82,7 @@ long system_call_exception(long r3, long r4, long r5, long 
r6, long r7, long r8,
 */
regs->softe = IRQS_ENABLED;
 
-   ti_flags = current_thread_info()->flags;
- 

[PATCH v5 1/5] powerpc: make llseek 32bit-only.

2019-08-29 Thread Michal Suchanek
The llseek syscall is not built in fs/read_write.c when 64bit && !COMPAT.
With the syscall marked as common in syscall.tbl, the build fails in this
case.

The llseek interface does not make sense on 64bit and it is explicitly
described as a 32bit interface. Use on 64bit is not well-defined so just
drop it for 64bit.

Fixes: caf6f9c8a326 ("asm-generic: Remove unneeded
__ARCH_WANT_SYS_LLSEEK macro")

Signed-off-by: Michal Suchanek 
---
v5: update commit message.
---
 arch/powerpc/kernel/syscalls/syscall.tbl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
b/arch/powerpc/kernel/syscalls/syscall.tbl
index 010b9f445586..53e427606f6c 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -188,7 +188,7 @@
 137    common  afs_syscall     sys_ni_syscall
 138    common  setfsuid        sys_setfsuid
 139    common  setfsgid        sys_setfsgid
-140    common  _llseek         sys_llseek
+140    32      _llseek         sys_llseek
 141    common  getdents        sys_getdents    compat_sys_getdents
 142    common  _newselect      sys_select      compat_sys_select
 143    common  flock           sys_flock
-- 
2.22.0
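
As background on why _llseek is a 32bit-only interface: the syscall passes the file offset as two 32-bit halves because a 32-bit register cannot hold a 64-bit offset. A 64-bit task can pass the whole offset to lseek in one register. The sketch below shows the combination step. The function name llseek_offset is made up for illustration and is not kernel code.

```c
#include <stdint.h>

/*
 * Hypothetical illustration: rebuild the 64-bit offset that _llseek
 * receives as separate high and low 32-bit arguments.
 */
static int64_t llseek_offset(uint32_t offset_high, uint32_t offset_low)
{
	return (int64_t)(((uint64_t)offset_high << 32) | offset_low);
}
```

For example, offset_high = 1 and offset_low = 0 describe an offset of 4 GiB, the first value a single 32-bit argument could not express.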



[PATCH v5 0/5] Disable compat cruft on ppc64le v5

2019-08-29 Thread Michal Suchanek
Less code means fewer bugs, so add a knob to skip the compat stuff.

This is tested on ppc64le on top of

https://patchwork.ozlabs.org/cover/1153556/

Changes in v2: saner CONFIG_COMPAT ifdefs
Changes in v3:
 - change llseek to 32bit instead of building it unconditionally in fs
 - clean up the makefile conditionals
 - remove some ifdefs or convert to IS_ENABLED where possible
Changes in v4:
 - cleanup is_32bit_task and current_is_64bit
 - more makefile cleanup
Changes in v5:
 - more current_is_64bit cleanup
 - split off callchain.c 32bit and 64bit parts

Michal Suchanek (5):
  powerpc: make llseek 32bit-only.
  powerpc: move common register copy functions from signal_32.c to
signal.c
  powerpc/64: make buildable without CONFIG_COMPAT
  powerpc/64: Make COMPAT user-selectable disabled on littleendian by
default.
  powerpc/perf: split callchain.c by bitness

 arch/powerpc/Kconfig |   5 +-
 arch/powerpc/include/asm/thread_info.h   |   4 +-
 arch/powerpc/kernel/Makefile |   7 +-
 arch/powerpc/kernel/entry_64.S   |   2 +
 arch/powerpc/kernel/signal.c | 144 -
 arch/powerpc/kernel/signal_32.c  | 140 -
 arch/powerpc/kernel/syscall_64.c |   6 +-
 arch/powerpc/kernel/syscalls/syscall.tbl |   2 +-
 arch/powerpc/kernel/vdso.c   |   5 +-
 arch/powerpc/perf/Makefile   |   4 +
 arch/powerpc/perf/callchain.c| 379 +--
 arch/powerpc/perf/callchain.h|  11 +
 arch/powerpc/perf/callchain_32.c | 218 +
 arch/powerpc/perf/callchain_64.c | 185 +++
 14 files changed, 579 insertions(+), 533 deletions(-)
 create mode 100644 arch/powerpc/perf/callchain.h
 create mode 100644 arch/powerpc/perf/callchain_32.c
 create mode 100644 arch/powerpc/perf/callchain_64.c

-- 
2.22.0


