Re: console issue since 3.6, console=ttyS1 hangs

2016-11-04 Thread Peter Hurley
On Fri, Nov 4, 2016 at 3:33 PM, Nathan Zimmer  wrote:
> On Thu, Nov 03, 2016 at 06:25:46PM -0600, Peter Hurley wrote:
>> On Wed, Nov 2, 2016 at 9:29 AM, Nathan Zimmer  wrote:
>> > On Mon, Oct 31, 2016 at 08:55:49PM -0600, Peter Hurley wrote:
>> >> On Mon, Oct 31, 2016 at 2:27 PM, Sean Young  wrote:
>> >> > On Sun, Oct 30, 2016 at 10:33:02AM -0500, Nathan wrote:
>> >> >> I think this should be PNP0501 instead of PNP0c02.
>> >> >> Once I alter that then when I boot the serial comes up on irq 3. 
>> >> >> However it
>> >> >> still hangs.
>> >> >> I'll keep digging.
>> >> >
>> >> > Well that's that theory out of the window. I'm not sure where to look 
>> >> > now,
>> >> > I would start by enabling as many as possible of the "kernel hacking" 
>> >> > config
>> >> > options and see if anything gets caught.
>> >> >
>> >> > Looking at your earlier messages, you have a collection of percpu 
>> >> > allocation
>> >> > failures. That might be worth resolving before anything else.
>> >>
>> >> Hi Nathan,
>> >>
>> >> Couple of questions:
>> >> 1. Was login over serial console setup and working on SLES 11?  or was
>> >> the 'console=ttyS1' only for debug output?
>> >> I ask because console output doesn't use IRQs; iow, maybe the serial
>> >> port w/ driver never actually worked.
>> >> 2. Can you post dmesg for the SLES 11 setup?  That would show if there
>> >> were probe errors even on that.
>> >>
>> >> An alternative that should be equivalent to your previous setup is to
>> >> build w/ CONFIG_SERIAL_8250_PNP=n
>> >> Seems like your ACPI BIOS is buggy, but also that something else is using 
>> >> IRQ 3?
>> >>
>> >> Regards,
>> >> Peter Hurley
>> >
>> >
>> >
>> > 1) Yes I can confirm I used it to login sometimes.
>> >
>> > I built with CONFIG_SERIAL_8250_PNP=n and that seemed to work better, in 
>> > that the system did not hang.
>> > However I couldn't login on the serial and got these error messages, I 
>> > suspect I broke something while trying different permutations.
>> >
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.136636 seconds
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.180955 seconds
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.161415 seconds
>> > gdm[5206]: WARNING: GdmLocalDisplayFactory: maximum number of X display 
>> > failures reached: check X server log for errors
>> >
>> > It did boot all the way though.
>> >
>> > 2) attached log
>>
>> So I'm confused where this leaves us.
>>
>> In your OP, you claim to have gotten it working with a partial revert
>> of commit 835d844d1a28 (but you didn't attach the partial revert so no
>> one knows what you did); however, my suggestion should have been
>> equivalent.
>
> I apologize, if I was unclear.  Your suggestion of CONFIG_SERIAL_8250_PNP=n 
> did successfully boot and provide messages
> across the console, and yes is basically equivelent to the revert.

Ok, so the partial revert didn't get the login working then?

> Those warnings I just noticed in the dmesg and they weren't there before.
>
>>
>> Note that you have the serial port disabled in BIOS; that's why you're
>> getting the probe error for PNP.
>
> Now when you say its diabled in bios, how can I be sure and double check that?

Well, the ACPI BIOS is reporting it as disabled. Even the SLES11 log says:

[2.136899] pnp 00:04: Plug and Play ACPI device, IDs PNP0501 (disabled)


> These bios screens do not have any mention of PNP settings.
> I am getting output over the console (via ipmi) until the boot hangs.

Yeah, probably the device actually decodes io address access anyway,
but in the disabled state probably has not routed IRQ.

I have no idea how to help you with the bios, sorry.

Regards,
Peter Hurley


Re: more hangs in the tty layer

2016-11-03 Thread Peter Hurley
aste.ubuntu.com/23273973/
>
> To me this seems like a lost wakeup event, especially that I've
> seen patches flying around fixing such issues. Unfortunately I
> cannot reliably reproduce this and booting a production machine
> with lockdep would be problematic.

Too bad because lockdep would definitively tell you what process is
holding the ldsem for this tty.
If you paste *all the tasks*, I can try to see if I recognize the task
stack that is holding the ldsem.

Regards,
Peter Hurley


Re: console issue since 3.6, console=ttyS1 hangs

2016-11-03 Thread Peter Hurley
On Wed, Nov 2, 2016 at 9:29 AM, Nathan Zimmer  wrote:
> On Mon, Oct 31, 2016 at 08:55:49PM -0600, Peter Hurley wrote:
>> On Mon, Oct 31, 2016 at 2:27 PM, Sean Young  wrote:
>> > On Sun, Oct 30, 2016 at 10:33:02AM -0500, Nathan wrote:
>> >> I think this should be PNP0501 instead of PNP0c02.
>> >> Once I alter that then when I boot the serial comes up on irq 3. However 
>> >> it
>> >> still hangs.
>> >> I'll keep digging.
>> >
>> > Well that's that theory out of the window. I'm not sure where to look now,
>> > I would start by enabling as many as possible of the "kernel hacking" 
>> > config
>> > options and see if anything gets caught.
>> >
>> > Looking at your earlier messages, you have a collection of percpu 
>> > allocation
>> > failures. That might be worth resolving before anything else.
>>
>> Hi Nathan,
>>
>> Couple of questions:
>> 1. Was login over serial console setup and working on SLES 11?  or was
>> the 'console=ttyS1' only for debug output?
>> I ask because console output doesn't use IRQs; iow, maybe the serial
>> port w/ driver never actually worked.
>> 2. Can you post dmesg for the SLES 11 setup?  That would show if there
>> were probe errors even on that.
>>
>> An alternative that should be equivalent to your previous setup is to
>> build w/ CONFIG_SERIAL_8250_PNP=n
>> Seems like your ACPI BIOS is buggy, but also that something else is using 
>> IRQ 3?
>>
>> Regards,
>> Peter Hurley
>
>
>
> 1) Yes I can confirm I used it to login sometimes.
>
> I built with CONFIG_SERIAL_8250_PNP=n and that seemed to work better, in that 
> the system did not hang.
> However I couldn't login on the serial and got these error messages, I 
> suspect I broke something while trying different permutations.
>
> gdm[5206]: WARNING: GdmDisplay: display lasted 0.136636 seconds
> gdm[5206]: WARNING: GdmDisplay: display lasted 0.180955 seconds
> gdm[5206]: WARNING: GdmDisplay: display lasted 0.161415 seconds
> gdm[5206]: WARNING: GdmLocalDisplayFactory: maximum number of X display 
> failures reached: check X server log for errors
>
> It did boot all the way though.
>
> 2) attached log

So I'm confused where this leaves us.

In your OP, you claim to have gotten it working with a partial revert
of commit 835d844d1a28 (but you didn't attach the partial revert so no
one knows what you did); however, my suggestion should have been
equivalent.

Note that you have the serial port disabled in BIOS; that's why you're
getting the probe error for PNP.

Regards,
Peter Hurley


Re: console issue since 3.6, console=ttyS1 hangs

2016-10-31 Thread Peter Hurley
On Mon, Oct 31, 2016 at 2:27 PM, Sean Young  wrote:
> On Sun, Oct 30, 2016 at 10:33:02AM -0500, Nathan wrote:
>> I think this should be PNP0501 instead of PNP0c02.
>> Once I alter that then when I boot the serial comes up on irq 3. However it
>> still hangs.
>> I'll keep digging.
>
> Well that's that theory out of the window. I'm not sure where to look now,
> I would start by enabling as many as possible of the "kernel hacking" config
> options and see if anything gets caught.
>
> Looking at your earlier messages, you have a collection of percpu allocation
> failures. That might be worth resolving before anything else.

Hi Nathan,

Couple of questions:
1. Was login over serial console setup and working on SLES 11?  or was
the 'console=ttyS1' only for debug output?
I ask because console output doesn't use IRQs; iow, maybe the serial
port w/ driver never actually worked.
2. Can you post dmesg for the SLES 11 setup?  That would show if there
were probe errors even on that.

An alternative that should be equivalent to your previous setup is to
build w/ CONFIG_SERIAL_8250_PNP=n
Seems like your ACPI BIOS is buggy, but also that something else is using IRQ 3?

Regards,
Peter Hurley


Re: [RFC PATCH 0/6] UART slave devices using serio

2016-10-31 Thread Peter Hurley
[ this time w/o the html ]

On Tue, Oct 25, 2016 at 4:02 PM, Rob Herring  wrote:
> > Maybe you can try to find some minutes at the Kernel Summit to talk
> > about this?
>
> Still waiting for my invite...
>
> But I will be at Plumbers if folks want to discuss this.

Hey Rob,

I'm here so let's make time to discuss this.

Regards,
Peter Hurley


Re: [PATCH v8 4/4] serial: pl011: add console matching function

2016-06-22 Thread Peter Hurley


> On Jun 22, 2016, at 5:18 AM, Yury Norov  wrote:
> 
>> On Fri, May 20, 2016 at 04:03:23PM +0300, Aleksey Makarov wrote:
>> This patch adds function pl011_console_match() that implements
>> method match of struct console.  It allows to match consoles against
>> data specified in a string, for example taken from command line or
>> compiled by ACPI SPCR table handler.
>> 
>> Signed-off-by: Aleksey Makarov 
>> Reviewed-by: Peter Hurley 
>> ---
>> drivers/tty/serial/amba-pl011.c | 56 
>> +
>> 1 file changed, 56 insertions(+)
>> 
>> diff --git a/drivers/tty/serial/amba-pl011.c 
>> b/drivers/tty/serial/amba-pl011.c
>> index a2aa655..388edc8 100644
>> --- a/drivers/tty/serial/amba-pl011.c
>> +++ b/drivers/tty/serial/amba-pl011.c
>> @@ -2288,12 +2288,68 @@ static int __init pl011_console_setup(struct console 
>> *co, char *options)
>>return uart_set_options(&uap->port, co, baud, parity, bits, flow);
>> }
>> 
>> +/**
>> + *pl011_console_match - non-standard console matching
>> + *@co:  registering console
>> + *@name:  name from console command line
>> + *@idx:  index from console command line
>> + *@options: ptr to option string from console command line
>> + *
>> + *Only attempts to match console command lines of the form:
>> + *console=pl011,mmio|mmio32,[,]
>> + *console=pl011,0x[,]
>> + *This form is used to register an initial earlycon boot console and
>> + *replace it with the amba_console at pl011 driver init.
>> + *
>> + *Performs console setup for a match (as required by interface)
>> + *If no  are specified, then assume the h/w is already setup.
>> + *
>> + *Returns 0 if console matches; otherwise non-zero to use default 
>> matching
>> + */
>> +static int __init pl011_console_match(struct console *co, char *name, int 
>> idx,
>> +  char *options)
>> +{
>> +char match[] = "pl011";/* pl011-specific earlycon name */
>> +unsigned char iotype;
>> +unsigned long addr;
>> +int i;
>> +
>> +if (strncmp(name, match, 5) != 0)
>> +return -ENODEV;
>> +
>> +if (uart_parse_earlycon(options, &iotype, &addr, &options))
>> +return -ENODEV;
>> +
>> +/* try to match the port specified on the command line */
>> +for (i = 0; i < ARRAY_SIZE(amba_ports); i++) {
>> +struct uart_port *port;
>> +
>> +if (!amba_ports[i])
>> +continue;
>> +
>> +port = &amba_ports[i]->port;
>> +
>> +if (iotype != UPIO_MEM && iotype != UPIO_MEM32)
>> +continue;
> 
> So it looks like iotype is constant inside the loop, and UPIO_MEM
> and UPIO_MEM32 too, of course. It means you can move this check out of
> cycle and avoid ports traversing at all in specific case. Am I wrong?
> 

No you're not wrong but I'd prefer if we don't use assumptions like that in 
port enumeration.


>> +
>> +if (port->mapbase != addr)
>> +continue;
>> +
>> +co->index = i;
>> +port->cons = co;
>> +return pl011_console_setup(co, options);
>> +}
>> +
>> +return -ENODEV;
>> +}
>> +
>> static struct uart_driver amba_reg;
>> static struct console amba_console = {
>>.name= "ttyAMA",
>>.write= pl011_console_write,
>>.device= uart_console_device,
>>.setup= pl011_console_setup,
>> +.match= pl011_console_match,
>>.flags= CON_PRINTBUFFER,
>>.index= -1,
>>.data= &amba_reg,
>> -- 
>> 2.8.2


Re: [RFC] serial: 8250: fix regression in 8250 uart driver

2016-06-13 Thread Peter Hurley
On 06/13/2016 01:13 PM, Dinh Nguyen wrote:
> Hi Andy,
> 
> I saw that you have discovered that commit ec5a11a91eec ("serial: 8250:
> Validate dmaengine rx chan meets requirements") introduced a regression
> in the 8250 uart driver. For SoCFPGA platform, I am seeing this error:
> 
> [5.541751] ttyS0 - failed to request DMA
> 
> Reverting the commit ec5a11a91eec removes the error.

The error is that your dmaengine driver doesn't support the functionality
required by the 8250 driver.

Regards,
Peter Hurley

> I saw that you started the discussion, but I didn't see that a patch was
> included[1].
> 
> The following patch seems to fix the error, but I'm not sure if it's the
> same fix that you had in mind.
> 
> Thanks,
> Dinh
> 
> [1] http://marc.info/?l=linux-serial&m=146254187602862&w=2
> 
> ---8<
> diff --git a/drivers/tty/serial/8250/8250_dma.c
> b/drivers/tty/serial/8250/8250_dma.c
> index 7f33d1c..847a203 100644
> --- a/drivers/tty/serial/8250/8250_dma.c
> +++ b/drivers/tty/serial/8250/8250_dma.c
> @@ -176,7 +176,7 @@ int serial8250_request_dma(struct uart_8250_port *p)
> ret = dma_get_slave_caps(dma->rxchan, &caps);
> if (ret)
> goto release_rx;
> -   if (!caps.cmd_pause || !caps.cmd_terminate ||
> +   if ((!caps.cmd_pause || !caps.cmd_terminate) &&
> caps.residue_granularity ==
> DMA_RESIDUE_GRANULARITY_DESCRIPTOR) {
> ret = -EINVAL;
> goto release_rx;
> 



Re: [PATCH] serial: earlycon: stop abusing console::index

2016-06-08 Thread Peter Hurley
On 06/03/2016 08:19 AM, Mark Rutland wrote:
> Commit cda64e6824026575 ("serial: earlycon: Fixup earlycon console name
> and index") added code to decompose an earlycon driver name into a
> string prefix and numeric suffix, and place this into console::name and
> console::index, such that we'd get a pretty name out of the core console
> code, which requires a name and index. This is simply to give a
> pretty/useful earlycon driver name in the log messages, and other than
> logging the name and index hold no significance.

The idea was to keep the console_cmdline[] array consistent with the
registered consoles, even though earlycon uses its own matching scheme.
Least surprise and all that.


> However, this is broken for drivers such as "pl011", where the "011"
> suffix gets stripped off, then subsequently restored, printed as a
> decimal, erroneously giving "pl11" in log messages.

Yeah, I saw that.

> Additionally, for
> earlycon drivers without a numeric suffix, "0" is added regardless. Thus
> the code is broken in some cases, and generally inconsistent.

I would like to continue with some kind of naming scheme that preserves
both the command line parameters (uart,uart8250,pl011,etc.) and uniquely
identifies which uart driver is the earlycon.

The current scheme could be fixed easily enough (by just using a single digit).
Or using a separator, ie. uart/0, pl011/0, etc.

Regards,
Peter Hurley


> Instead, this patch changes the earlycon code to consistently register
> "earlycon0", but ensures that the earlycon driver name is logged at
> earlycon_init time. This is obvious, consistent, and sufficient to
> provide the required information to the user.
> 
> With this in place, amongst the first messages from the kernel, the user
> will see something like:
> 
> [0.00] earlycon: earlycon0 (pl011) at MMIO 0x7ff8 
> (options '')
> [0.000000] bootconsole [earlycon0] enabled
> 
> Signed-off-by: Mark Rutland 
> Cc: Greg Kroah-Hartman 
> Cc: Jiri Slaby 
> Cc: Leif Lindholm 
> Cc: Peter Hurley 
> Cc: Rob Herring 
> Cc: Will Deacon 
> Cc: linux-ser...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/tty/serial/earlycon.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c
> index 067783f..2b6622a 100644
> --- a/drivers/tty/serial/earlycon.c
> +++ b/drivers/tty/serial/earlycon.c
> @@ -29,7 +29,7 @@
>  #include 
>  
>  static struct console early_con = {
> - .name = "uart", /* fixed up at earlycon registration */
> + .name = "earlycon",
>   .flags =CON_PRINTBUFFER | CON_BOOT,
>   .index =0,
>  };
> @@ -60,24 +60,13 @@ static void __init earlycon_init(struct earlycon_device 
> *device,
>  {
>   struct console *earlycon = device->con;
>   struct uart_port *port = &device->port;
> - const char *s;
> - size_t len;
> -
> - /* scan backwards from end of string for first non-numeral */
> - for (s = name + strlen(name);
> -  s > name && s[-1] >= '0' && s[-1] <= '9';
> -  s--)
> - ;
> - if (*s)
> - earlycon->index = simple_strtoul(s, NULL, 10);
> - len = s - name;
> - strlcpy(earlycon->name, name, min(len + 1, sizeof(earlycon->name)));
> +
>   earlycon->data = &early_console_dev;
>  
>   if (port->iotype == UPIO_MEM || port->iotype == UPIO_MEM16 ||
>   port->iotype == UPIO_MEM32 || port->iotype == UPIO_MEM32BE)
> - pr_info("%s%d at MMIO%s %pa (options '%s')\n",
> - earlycon->name, earlycon->index,
> + pr_info("%s%d (%s) at MMIO%s %pa (options '%s')\n",
> + earlycon->name, earlycon->index, name,
>   (port->iotype == UPIO_MEM) ? "" :
>   (port->iotype == UPIO_MEM16) ? "16" :
>   (port->iotype == UPIO_MEM32) ? "32" : "32be",
> 



Re: [PATCH] serial_core: Change UART PM state to OFF on failure

2016-06-08 Thread Peter Hurley
Hi Pramod,

On 05/06/2016 02:46 AM, Pramod Gurav wrote:
> uart_change_pm is used to turn on the UART controller resources and
> change UART's PM status. On failure to allocate pages the controller
> be left in ON state. This will change the state to OFF on failure.
> 
> Signed-off-by: Pramod Gurav 
> ---
>  drivers/tty/serial/serial_core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/serial_core.c 
> b/drivers/tty/serial/serial_core.c
> index 62fe368..58af2e9 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -156,9 +156,10 @@ static int uart_port_startup(struct tty_struct *tty, 
> struct uart_state *state,
>   if (!state->xmit.buf) {
>   /* This is protected by the per port mutex */
>   page = get_zeroed_page(GFP_KERNEL);
> - if (!page)
> + if (!page) {

if (!uart_console(uport))

Otherwise, you'll be powering off the console.

Just out of curiosity, did you actually hit this error?

Regards,
Peter Hurley

> + uart_change_pm(state, UART_PM_STATE_OFF);
>   return -ENOMEM;
> -
> + }
>   state->xmit.buf = (unsigned char *) page;
>   uart_circ_clear(&state->xmit);
>   }
> 



Re: crosstool builds - https://www.kernel.org/pub/tools/crosstool/

2016-05-23 Thread Peter Hurley
Hi Jon,

On 05/19/2016 08:48 PM, Jon Masters wrote:
> Hi Tony,
> 
> I'm wondering whether you know of anyone who has picked up where you
> left off, building cross toolchains for various and sundry alternative
> architectures? I wanted to download an IA64 compiler recently (really)
> and noticed that the most recent builds were against GCC 4.6.3.

Someone's been making them.

https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/

arm64
alpha
arm = gnueabi
parisc = hppa
parisc64 = hppa64
mips
mips64
powerpc
powerpc64
sparc
sparc64
s390
...

Regards,
Peter Hurley


Re: [PATCH v4 2/5] locking/rwsem: Protect all writes to owner by WRITE_ONCE

2016-05-21 Thread Peter Hurley
On 05/18/2016 12:58 PM, Jason Low wrote:
> On Wed, 2016-05-18 at 14:29 -0400, Waiman Long wrote:
>> On 05/18/2016 01:21 PM, Jason Low wrote:
>>> On Wed, 2016-05-18 at 07:04 -0700, Davidlohr Bueso wrote:
>>>> On Tue, 17 May 2016, Waiman Long wrote:
>>>>
>>>>> Without using WRITE_ONCE(), the compiler can potentially break a
>>>>> write into multiple smaller ones (store tearing). So a read from the
>>>>> same data by another task concurrently may return a partial result.
>>>>> This can result in a kernel crash if the data is a memory address
>>>>> that is being dereferenced.
>>>>>
>>>>> This patch changes all write to rwsem->owner to use WRITE_ONCE()
>>>>> to make sure that store tearing will not happen. READ_ONCE() may
>>>>> not be needed for rwsem->owner as long as the value is only used for
>>>>> comparison and not dereferencing.
>>> It might be okay to leave out READ_ONCE() for reading rwsem->owner, but
>>> couldn't we include it to at least document that we're performing a
>>> "special" lockless read?
>>>
>>
>> Using READ_ONCE() does have a bit of cost as it limits compiler 
>> optimization. If we changes all access to rwsem->owner to READ_ONCE() 
>> and WRITE_ONCE(), we may as well change its type to volatile and be done 
>> with.
> 
> Right, although there are still places like the init function where
> WRITE_ONCE isn't necessary.

Which doesn't cost anything either.


>> I am not against doing that, but it feels a bit over-reach for me. 
>> On the other hand, we may define a do-nothing macro that designates the 
>> owner as a special variable for documentation purpose, but don't need 
>> protection at that particular call site.
> 
> It should be fine to use the standard READ_ONCE here, even if it's just
> for documentation, as it's probably not going to cost anything in
> practice. It would be better to avoid adding any special macros for this
> which may just add more complexity.

See, I don't understand this line of reasoning at all.

I read this as "it's ok to be non-optimal here where were spinning CPU
time but not ok to be non-optimal generally elsewhere where it's
way less important like at init time".

And by the way, it's not just "here" but _everywhere_.
What about reading ->on_cpu locklessly?

Sure it's a bool, but doesn't the "we need to document lockless access"
argument equally apply here?

Regards,
Peter Hurley



Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-17 Thread Peter Hurley
Hi Paul,

You can disregard this as I think we're talking about
the same things with the other email thread.

Regards,
Peter Hurley


On 05/17/2016 12:46 PM, Peter Hurley wrote:
> On 05/16/2016 10:22 AM, Paul E. McKenney wrote:
>> On Mon, May 16, 2016 at 07:17:42AM -0700, Peter Hurley wrote:
>>> On 05/16/2016 05:17 AM, Paul E. McKenney wrote:
>>>> On Mon, May 16, 2016 at 01:09:48PM +0200, Peter Zijlstra wrote:
>>>>> On Fri, May 13, 2016 at 10:58:05AM -0700, Peter Hurley wrote:
>>>>>>> Note that barrier() and READ_ONCE() have overlapping but not identical
>>>>>>> results and the combined use actually makes sense here.
>>>>>>>
>>>>>>> Yes, a barrier() anywhere in the loop will force a reload of the
>>>>>>> variable, _however_ it doesn't force that reload to not suffer from
>>>>>>> load tearing.
>>>>>>>
>>>>>>> Using volatile also forces a reload, but also ensures the load cannot
>>>>>>> be torn IFF it is of machine word side and naturally aligned.
>>>>>>>
>>>>>>> So while the READ_ONCE() here is pointless for forcing the reload;
>>>>>>> that's already ensured, we still need to make sure the load isn't torn.
>>>>>>
>>>>>> If load tearing a naturally aligned pointer is a real code generation
>>>>>> possibility then the rcu list code is broken too (which loads ->next
>>>>>> directly; cf. list_for_each_entry_rcu() & 
>>>>>> list_for_each_entry_lockless()).
>>>>>>
>>>>>> For 4.4, Paul added READ_ONCE() checks for list_empty() et al, but iirc
>>>>>> that had to do with control dependencies and not load tearing.
>>>>>
>>>>> Well, Paul is the one who started the whole load/store tearing thing, so
>>>>> I suppose he knows what he's doing.
>>>>
>>>> That had to do with suppressing false positives for one of Dmitry
>>>> Vjukov's concurrency checkers.  I suspect that Peter Hurley is right
>>>> that continued use of that checker would identify other places needing
>>>> READ_ONCE(), but from what I understand that is on hold pending a formal
>>>> definition of the Linux-kernel memory model.  (KCC and Dmitry (CCed)
>>>> can correct my if I am confused on this point.)
>>>>
>>>>> That said; its a fairly recent as things go so lots of code hasn't been
>>>>> updated yet, and its also a very unlikely thing for a compiler to do;
>>>>> since it mostly doesn't make sense to emit multiple instructions where
>>>>> one will do, so its not a very high priority thing either.
>>>>>
>>>>> But from what I understand, the compiler is free to emit all kinds of
>>>>> nonsense for !volatile loads/stores.
>>>>
>>>> That is quite true.  :-/
>>>>
>>>>>> OTOH, this patch might actually produce store-tearing:
>>>>>>
>>>>>> +static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
>>>>>> +{
>>>>>> +/*
>>>>>> + * We check the owner value first to make sure that we will only
>>>>>> + * do a write to the rwsem cacheline when it is really necessary
>>>>>> + * to minimize cacheline contention.
>>>>>> + */
>>>>>> +if (sem->owner != RWSEM_READER_OWNED)
>>>>>> +sem->owner = RWSEM_READER_OWNED;
>>>>>> +}
>>>>>
>>>>> Correct; which is why we should always use {READ,WRITE}_ONCE() for
>>>>> anything that is used locklessly.
>>>>
>>>> Completely agreed.  Improve readability of code by flagging lockless
>>>> shared-memory accesses, help checkers better find bugs, and prevent the
>>>> occasional compiler mischief!
>>>
>>> I think this would be a mistake for 3 reasons:
>>>
>>> 1. If READ_ONCE()/WRITE_ONCE() is necessary to prevent load/store tearing
>>>of any normally-atomic type (char/int/long/void*), then _every_ access
>>>would require READ_ONCE()/WRITE_ONCE(), thus eliminating any possibility
>>>of compiler optimization (eg. eliding redundant loads) where it would
>>>otherwise be possible.
>>
>> The point about eliding red

Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-17 Thread Peter Hurley
On 05/16/2016 10:22 AM, Paul E. McKenney wrote:
> On Mon, May 16, 2016 at 07:17:42AM -0700, Peter Hurley wrote:
>> On 05/16/2016 05:17 AM, Paul E. McKenney wrote:
>>> On Mon, May 16, 2016 at 01:09:48PM +0200, Peter Zijlstra wrote:
>>>> On Fri, May 13, 2016 at 10:58:05AM -0700, Peter Hurley wrote:
>>>>>> Note that barrier() and READ_ONCE() have overlapping but not identical
>>>>>> results and the combined use actually makes sense here.
>>>>>>
>>>>>> Yes, a barrier() anywhere in the loop will force a reload of the
>>>>>> variable, _however_ it doesn't force that reload to not suffer from
>>>>>> load tearing.
>>>>>>
>>>>>> Using volatile also forces a reload, but also ensures the load cannot
>>>>>> be torn IFF it is of machine word side and naturally aligned.
>>>>>>
>>>>>> So while the READ_ONCE() here is pointless for forcing the reload;
>>>>>> that's already ensured, we still need to make sure the load isn't torn.
>>>>>
>>>>> If load tearing a naturally aligned pointer is a real code generation
>>>>> possibility then the rcu list code is broken too (which loads ->next
>>>>> directly; cf. list_for_each_entry_rcu() & list_for_each_entry_lockless()).
>>>>>
>>>>> For 4.4, Paul added READ_ONCE() checks for list_empty() et al, but iirc
>>>>> that had to do with control dependencies and not load tearing.
>>>>
>>>> Well, Paul is the one who started the whole load/store tearing thing, so
>>>> I suppose he knows what he's doing.
>>>
>>> That had to do with suppressing false positives for one of Dmitry
>>> Vjukov's concurrency checkers.  I suspect that Peter Hurley is right
>>> that continued use of that checker would identify other places needing
>>> READ_ONCE(), but from what I understand that is on hold pending a formal
>>> definition of the Linux-kernel memory model.  (KCC and Dmitry (CCed)
>>> can correct my if I am confused on this point.)
>>>
>>>> That said; its a fairly recent as things go so lots of code hasn't been
>>>> updated yet, and its also a very unlikely thing for a compiler to do;
>>>> since it mostly doesn't make sense to emit multiple instructions where
>>>> one will do, so its not a very high priority thing either.
>>>>
>>>> But from what I understand, the compiler is free to emit all kinds of
>>>> nonsense for !volatile loads/stores.
>>>
>>> That is quite true.  :-/
>>>
>>>>> OTOH, this patch might actually produce store-tearing:
>>>>>
>>>>> +static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
>>>>> +{
>>>>> + /*
>>>>> +  * We check the owner value first to make sure that we will only
>>>>> +  * do a write to the rwsem cacheline when it is really necessary
>>>>> +  * to minimize cacheline contention.
>>>>> +  */
>>>>> + if (sem->owner != RWSEM_READER_OWNED)
>>>>> + sem->owner = RWSEM_READER_OWNED;
>>>>> +}
>>>>
>>>> Correct; which is why we should always use {READ,WRITE}_ONCE() for
>>>> anything that is used locklessly.
>>>
>>> Completely agreed.  Improve readability of code by flagging lockless
>>> shared-memory accesses, help checkers better find bugs, and prevent the
>>> occasional compiler mischief!
>>
>> I think this would be a mistake for 3 reasons:
>>
>> 1. If READ_ONCE()/WRITE_ONCE() is necessary to prevent load/store tearing
>>of any normally-atomic type (char/int/long/void*), then _every_ access
>>would require READ_ONCE()/WRITE_ONCE(), thus eliminating any possibility
>>of compiler optimization (eg. eliding redundant loads) where it would
>>otherwise be possible.
> 
> The point about eliding redundant loads is a good one, at least in those
> cases where it is a reasonable optimization.  Should we ever get to a
> point where we no longer use pre-C11 compilers, those use cases could
> potentially use memory_order_relaxed loads.  Preferably wrappered in
> something that can be typed with fewer characters.  And it could of course
> lead to an interesting discussion of what use cases would be required
> to justify this change, but what else is new?

I believe lockless access is quite widespread in the kernel, and this
use was 

Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-17 Thread Peter Hurley
On 05/16/2016 10:50 AM, Peter Zijlstra wrote:
> On Mon, May 16, 2016 at 07:17:42AM -0700, Peter Hurley wrote:
> 
>>>> Correct; which is why we should always use {READ,WRITE}_ONCE() for
>>>> anything that is used locklessly.
>>>
>>> Completely agreed.  Improve readability of code by flagging lockless
>>> shared-memory accesses, help checkers better find bugs, and prevent the
>>> occasional compiler mischief!
>>
>> I think this would be a mistake for 3 reasons:
>>
>> 1. If READ_ONCE()/WRITE_ONCE() is necessary to prevent load/store tearing
>>of any normally-atomic type (char/int/long/void*), then _every_ access
>>would require READ_ONCE()/WRITE_ONCE(), thus eliminating any possibility
>>of compiler optimization (eg. eliding redundant loads) where it would
>>otherwise be possible.
> 
> Should not really be a problem I think; you already have to be very
> careful when doing lockless stuff.

I think you may not understand my point here.

Consider normal list processing.

Paul added a READ_ONCE() load of head->next to list_empty().
That's because list_empty() is being used locklessly in lots of places,
not just for RCU lists.

Ok, so the load won't be torn. But the store to ->next can, so Paul
added WRITE_ONCE() to some but not all those list primitives too, even
though those aren't being used locklessly.

Note that the compiler _cannot_ optimize those stores even though
they're being used with locks held; IOW, exactly the situation that
volatile-consider-harmful rails against.

Similarly for the load in list_empty() even if it's used with locks held.

As Paul pointed out in his reply, there is no way to address this right now.

What's really needed is separate, not overloaded semantics:

1. "*If* you load or store this value, do it with single memory access, but
feel free to optimize (elide load/defer store/hoist load/whatever)."

2. "No, seriously, load/store this value regardless of what you think you
know." == READ_ONCE/WRITE_ONCE


If we start adding READ_ONCE/WRITE_ONCE to every lockless load/store,
worse code will be generated even though the vast majority of current use
is safe.


>> 2. Makes a mess of otherwise readable code.
> 
> We have to disagree here; I think code with {READ,WRITE}_ONCE() is more
> readable, as their presence is a clear indication something special is
> up.

I think you and Paul may wildly underestimate the scale of lockless
use in the kernel not currently annotated by READ_ONCE()/WRITE_ONCE().

Even in primitives designed for lockless concurrency like
rwsem and seqlock and sched/core, much less higher order code like
ipc/msg.c and vfs.

The majority of this lockless use is safe if the assumption
that loads/stores are performed atomically for char/short/int/long/void*
hold; in other words usage #1 above.


>> 3. Error-prone; ie., easy to overlook in review.
> 
> lockless stuff is error prone by nature; what's your point?

That lockless use is very common, even outside core functionality, and
needlessly splattering READ_ONCE/WRITE_ONCE when the existing code
generation is already safe, will not make it better. And that it will be
no safer by adding READ_ONCE/WRITE_ONCE because plenty of accesses will
be overlooked, like the stores to sem->owner are now.


>> There is no practical difference between _always_ using 
>> READ_ONCE()/WRITE_ONCE()
>> (to prevent tearing) and declaring the field volatile.
> 
> There is, declaring the field volatile doesn't make it stand out in the
> code while reading at all; it also doesn't allow you to omit the
> volatile qualifier in places where it doesn't matter.
> 
> The whole; variables aren't volatile, accesses are, thing is still very
> much in effect.

I'm not really seriously arguing for volatile declarations; I'm pointing
out that READ_ONCE()/WRITE_ONCE() has overloaded meaning that is
counter-productive.

For example, you point out that it doesn't allow you to omit the
volatile qualifier but yet we're adding it to underlying primitives
that may or may not be used locklessly.

Guaranteed list_empty() and list_add() are used way more than INIT_LIST_HEAD().


>> So we've come full-circle from volatile-considered-harmful.
> 
> I don't think so; the cases that document talks about are still very
> much relevant and volatile should not be used for that.

The example below is clearly wrong now.

"Another situation where one might be tempted to use volatile is
when the processor is busy-waiting on the value of a variable.  The 
right
way to perform a busy wait is:

while (my_variable != what_i_want)
cpu_relax();

The cpu_relax() call can 

Re: [PATCH 2/2] serial: 8250_mid: Read RX buffer on RX DMA timeout for DNV

2016-05-17 Thread Peter Hurley
On 05/13/2016 04:27 AM, Andy Shevchenko wrote:
> On Fri, 2016-05-13 at 18:15 +0800, kbuild test robot wrote:
>> Hi,
>>
>> [auto build test ERROR on next-20160513]
>> [cannot apply to tty/tty-testing usb/usb-testing v4.6-rc7 v4.6-rc6
>> v4.6-rc5 v4.6-rc7]
>> [if your patch is applied to the wrong git tree, please drop us a note
>> to help improving the system]
>>
>> url:https://github.com/0day-ci/linux/commits/Chuah-Kim-Tatt/Fix-DN
>> V-HSUART-RX-DMA-timeout-interrupt-issue/20160513-162046
>> config: i386-randconfig-s0-201619 (attached as .config)
>> compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
>> reproduce:
>> # save the attached .config to linux build tree
>> make ARCH=i386 
>>
>> Note: the linux-review/Chuah-Kim-Tatt/Fix-DNV-HSUART-RX-DMA-timeout-
>> interrupt-issue/20160513-162046 HEAD
>> 0354112aa9821bec8d278ad06b3d543724f5291d builds fine.
>>   It only hurts bisectibility.
>>
>> All errors (new ones prefixed by >>):
> 
> Peter, what happened to your DMA series in the linux-next? Did I miss
> any discussion related?

Sorry, I don't understand. What is the problem?

And where is this patch coming from? I didn't see it on linux-serial.

Regards,
Peter Hurley


>>>>
>>>> ERROR: "serial8250_rx_dma_flush"
>>>> [drivers/tty/serial/8250/8250_mid.ko] undefined!
>> ---
>> 0-DAY kernel test infrastructureOpen Source Technology
>> Center
>> https://lists.01.org/pipermail/kbuild-all   Intel
>> Corporation
> 



Re: tty crash in Linux 4.6

2016-05-17 Thread Peter Hurley
On 05/17/2016 08:57 AM, Peter Hurley wrote:
> On 05/16/2016 04:36 PM, Peter Hurley wrote:
>> > Hi Mikulas,
>> > 
>> > On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
>>> >> Hi
>>> >>
>>> >> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the 
>>> >> crash by logging into the machine with ssh and typing before the prompt 
>>> >> appears.
>> > 
>> > Thanks for the report.
>> > I tried to reproduce this a number of times on different machines
>> > with no luck.
>
> I was able to reproduce this crash with a test jig.
> The patch below fixed it, but I'm testing a better patch now, which
> I'll get to you asap.

--- >% ---
Subject: [PATCH] tty: Fix ldisc crash on reopened tty

If the tty has been hungup, the ldisc instance may have been destroyed.
Continued input to the tty will be ignored as long as the ldisc instance
is not visible to the flush_to_ldisc kworker. However, when the tty
is reopened and a new ldisc instance is created, the flush_to_ldisc
kworker can obtain an ldisc reference before the new ldisc is
completely initialized. This will likely crash:

 BUG: unable to handle kernel paging request at 2260
 IP: [] n_tty_receive_buf_common+0x6d/0xb80
 PGD 2ab581067 PUD 290c11067 PMD 0
 Oops:  [#1] PREEMPT SMP
 Modules linked in: nls_iso8859_1 ip6table_filter [.]
 CPU: 2 PID: 103 Comm: kworker/u16:1 Not tainted 4.6.0-rc7+wip-xeon+debug 
#rc7+wip
 Hardware name: Dell Inc. Precision WorkStation T5400  /0RW203, BIOS A11 
04/30/2012
 Workqueue: events_unbound flush_to_ldisc
 task: 8802ad16d100 ti: 8802ad31c000 task.ti: 8802ad31c000
 RIP: 0010:[]  [] 
n_tty_receive_buf_common+0x6d/0xb80
 RSP: 0018:8802ad31fc70  EFLAGS: 00010296
 RAX:  RBX: 8802aaddd800 RCX: 0001
 RDX:  RSI: 810db48f RDI: 0246
 RBP: 8802ad31fd08 R08:  R09: 0001
 R10: 8802aadddb28 R11: 0001 R12: 8800ba6da808
 R13: 8802ad18be80 R14: 8800ba6da858 R15: 8800ba6da800
 FS:  () GS:8802b0a0() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 2260 CR3: 00028ee5d000 CR4: 06e0
 Stack:
  81531219 8802aadddab8 8802aae0 8802aa78
  0001 8800ba6da858 8800ba6da860 8802ad31fd30
  81885f78 81531219  0002
 Call Trace:
  [] ? flush_to_ldisc+0x49/0xd0
  [] ? mutex_lock_nested+0x2c8/0x430
  [] ? flush_to_ldisc+0x49/0xd0
  [] n_tty_receive_buf2+0x14/0x20
  [] tty_ldisc_receive_buf+0x22/0x50
  [] flush_to_ldisc+0xbe/0xd0
  [] process_one_work+0x1ed/0x6e0
  [] ? process_one_work+0x16f/0x6e0
  [] worker_thread+0x4e/0x490
  [] ? process_one_work+0x6e0/0x6e0
  [] kthread+0xf2/0x110
  [] ? preempt_count_sub+0x4c/0x80
  [] ret_from_fork+0x22/0x50
  [] ? kthread_create_on_node+0x220/0x220
 Code: ff ff e8 27 a0 35 00 48 8d 83 78 05 00 00 c7 45 c0 00 00 00 00 48 89 45 
80 48
   8d 83 e0 05 00 00 48 89 85 78 ff ff ff 48 8b 45 b8 <48> 8b b8 60 22 00 
00 48
   8b 30 89 f8 8b 8b 88 04 00 00 29 f0 8d
 RIP  [] n_tty_receive_buf_common+0x6d/0xb80
  RSP 
 CR2: 2260

Ensure the kworker cannot obtain the ldisc reference until the new ldisc
is completely initialized.

Fixes: 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
Reported-by: Mikulas Patocka 
Signed-off-by: Peter Hurley 
---
 drivers/tty/tty_ldisc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index cdd063f..bda0c85 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -669,16 +669,17 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
tty_ldisc_put(tty->ldisc);
}
 
-   /* switch the line discipline */
-   tty->ldisc = ld;
tty_set_termios_ldisc(tty, disc);
-   retval = tty_ldisc_open(tty, tty->ldisc);
+   retval = tty_ldisc_open(tty, ld);
if (retval) {
if (!WARN_ON(disc == N_TTY)) {
-   tty_ldisc_put(tty->ldisc);
-   tty->ldisc = NULL;
+   tty_ldisc_put(ld);
+   ld = NULL;
}
}
+
+   /* switch the line discipline */
+   smp_store_release(&tty->ldisc, ld);
return retval;
 }
 
-- 
2.8.2



Re: [PATCH v3 4/4] locking/rwsem: Streamline the rwsem_optimistic_spin() code

2016-05-17 Thread Peter Hurley
On 05/12/2016 03:56 PM, Waiman Long wrote:
> This patch moves the owner loading and checking code entirely inside of
> rwsem_spin_on_owner() to simplify the logic of rwsem_optimistic_spin()
> loop.

Thanks for this.

Reviewed-by: Peter Hurley 



Re: [PATCH v3 2/4] locking/rwsem: Don't wake up one's own task

2016-05-17 Thread Peter Hurley
On 05/12/2016 03:56 PM, Waiman Long wrote:
> As rwsem_down_read_failed() will queue itself and potentially call
> __rwsem_do_wake(sem, RWSEM_WAKE_ANY), it is possible that a reader
> will try to wake up its own task. This patch adds a check to make
> sure that this won't happen.

Although there's no particular harm in the current code, this at
least spells out this condition is normal (ie., when a failed reader
wakes itself while waking the other waiting readers).

Reviewed-by: Peter Hurley 



Re: [PATCH v3 3/4] locking/rwsem: Improve reader wakeup code

2016-05-17 Thread Peter Hurley
On 05/12/2016 03:56 PM, Waiman Long wrote:
> In __rwsem_do_wake(), the reader wakeup code will assume a writer
> has stolen the lock if the active reader/writer count is not 0.
> However, this is not as reliable an indicator as the original
> "< RWSEM_WAITING_BIAS" check. If another reader is present, the code
> will still break out and exit even if the writer is gone. This patch
> changes it to check the same "< RWSEM_WAITING_BIAS" condition to
> reduce the chance of false positive.

Nice.

Reviewed-by: Peter Hurley 



Re: tty crash in Linux 4.6

2016-05-17 Thread Peter Hurley
On 05/16/2016 04:36 PM, Peter Hurley wrote:
> Hi Mikulas,
> 
> On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
>> Hi
>>
>> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the 
>> crash by logging into the machine with ssh and typing before the prompt 
>> appears.
> 
> Thanks for the report.
> I tried to reproduce this a number of times on different machines
> with no luck.

I was able to reproduce this crash with a test jig.
The patch below fixed it, but I'm testing a better patch now, which
I'll get to you asap.

Regards,
Peter Hurley


>> The crash is caused by the pointer tty->disc_data being NULL in the 
>> function n_tty_receive_buf_common. The crash happens on the statement 
>> smp_load_acquire(&ldata->read_tail).
>>
>> Bisecting shows that the crashes are caused by the patch 
>> 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on 
>> hangup").
> 
> 
> Can you try the test patch below?
> 
> Regards,
> Peter Hurley
> 
> 
>> Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260)
>> CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1
>> Workqueue: events_unbound flush_to_ldisc
>> task: 7c25ea80 ti: 7d9e task.ti: 7d9e
>>
>>  YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
>> PSW: 1100 Not tainted
>> r00-03  0804000f 4076cd10 40475fb4 7f761800
>> r04-07  40749510 0001 7f761800 7d9e0490
>> r08-11  7e722890  7da4ec00 7f763823
>> r12-15   7fc08ea8 7fc08c78 4080e080
>> r16-19  7fc08c00 0001  2260
>> r20-23  7f7618b0 7c25ea80 0001 0001
>> r24-27   080f 7f7618ac 40749510
>> r28-31  0001 7d9e0840 7d9e0720 0001
>> sr00-03  086c8800   086c8800
>> sr04-07     
>>
>> IASQ:   IAOQ: 40475fd4 
>> 40475fd8
>>  IIR: 0e6c00d5ISR:   IOR: 2260
>>  CPU:0   CR30: 7d9e CR31: ff87e7ffbc9e
>>  ORIG_R28: 4080a180
>>  IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0
>>  IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0
>>  RP(r2): n_tty_receive_buf_common+0x94/0xbe0
>> Backtrace:
>>  [<40476b14>] n_tty_receive_buf2+0x14/0x20
>>  [<4047a208>] tty_ldisc_receive_buf+0x30/0x90
>>  [<4047a544>] flush_to_ldisc+0x144/0x1c8
>>  [<402556bc>] process_one_work+0x1b4/0x460
>>  [<40255bbc>] worker_thread+0x1e4/0x5e0
>>  [<4025d454>] kthread+0x134/0x168
> 
> --- >% ---
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index 68947f6..f271832 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty)
>   *   Returns 0 if successful, otherwise error code < 0
>   */
>  
> -int tty_ldisc_reinit(struct tty_struct *tty, int disc)
> +static int __tty_ldisc_reinit(struct tty_struct *tty, int disc)
>  {
>   struct tty_ldisc *ld;
>   int retval;
> @@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
>   return retval;
>  }
>  
> +int tty_ldisc_reinit(struct tty_struct *tty, int disc)
> +{
> + int retval;
> +
> + tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
> + retval = __tty_ldisc_reinit(tty, disc);
> + tty_ldisc_unlock(tty);
> + return retval;
> +}
> +
>  /**
>   *   tty_ldisc_hangup-   hangup ldisc reset
>   *   @tty: tty being hung up
> @@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit)
>  
>   if (tty->ldisc) {
>   if (reinit) {
> - if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
> - tty_ldisc_reinit(tty, N_TTY);
> + if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
> + __tty_ldisc_reinit(tty, N_TTY);
>   } else
>   tty_ldisc_kill(tty);
>   }
> 



Re: ast: cursor flashing softlockups

2016-05-17 Thread Peter Hurley
[ +to Scot Doyle ]

Scot, please take a look at this soft lockup.

Regards,
Peter Hurley


Hi Ming,

On 05/17/2016 02:12 AM, Ming Lei wrote:
> Hi,
> 
> On Tue, May 17, 2016 at 4:07 AM, Dann Frazier
>  wrote:
>> Hi,
>>  I'm observing a soft lockup issue w/ the ASPEED controller on an
>> arm64 server platform. This was originally seen on Ubuntu's 4.4
>> kernel, but it is reproducible w/ vanilla 4.6-rc7 as well.
>>
>> [   32.792656] NMI watchdog: BUG: soft lockup - CPU#38 stuck for 22s!
>> [swapper/38:0]
>>
>> I observe this just once each time I boot into debian-installer (I'm
>> using a serial console, but the ast module gets loaded during
>> startup).
> 
> I have figured out that it is caused by 'mod_timer(timer, jiffies)' and
> 'ops->cur_blink_jiffies' is observed as zero in cursor_timer_handler()
> when the issue happened.

Thanks for tracking this down.

This softlockup looks to be caused by:

commit 27a4c827c34ac4256a190cc9d24607f953c1c459
Author: Scot Doyle 
Date:   Thu Mar 26 13:56:38 2015 +

fbcon: use the cursor blink interval provided by vt

vt now provides a cursor blink interval via vc_data. Use this
interval instead of the currently hardcoded 200 msecs. Store it in
fbcon_ops to avoid locking the console in cursor_timer_handler().

Signed-off-by: Scot Doyle 
Acked-by: Pavel Machek 
Signed-off-by: Greg Kroah-Hartman 

and

commit bd63364caa8df38bad2b25b11b2a1b849475cce5
Author: Scot Doyle 
Date:   Thu Mar 26 13:54:39 2015 +

vt: add cursor blink interval escape sequence

Add an escape sequence to specify the current console's cursor blink
interval. The interval is specified as a number of milliseconds 
until
the next cursor display state toggle, from 50 to 65535. 
/proc/loadavg
did not show a difference with a one msec interval, but the lower
bound is set to 50 msecs since slower hardware wasn't tested.

Store the interval in the vc_data structure for later access by 
fbcon,
initializing the value to fbcon's current hardcoded value of 200 
msecs.

Signed-off-by: Scot Doyle 
Acked-by: Pavel Machek 
Signed-off-by: Greg Kroah-Hartman 



> Looks it is a real fbcon/vt issue, see following:
> 
> fbcon_init()
> <-.con_init
>   <-visual_init()
> 
> reset_terminal()
> <-vc_init()
> 
> vc->vc_cur_blink_ms is just set in reset_terminal() from vc_init() path,
> and ops->cur_blink_jiffies is figured out from vc->vc_cur_blink_ms
> in fbcon_init().
> 
> And visual_init() is always run before vc_init(), so ops->cur_blink_jiffies
> is initialized as zero and cause the soft lockup issue finally.
> 
> Thanks,
> Ming
> 
>>
>> perf shows that the CPU caught by the NMI is typically in code
>> updating the cursor timer:
>>
>> -   16.92%  swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
>>- _raw_spin_unlock_irqrestore
>>   + 16.87% mod_timer
>>   + 0.05% cursor_timer_handler
>> -   12.15%  swapper  [kernel.kallsyms]  [k] queue_work_on
>>- queue_work_on
>>   + 12.00% cursor_timer_handler
>>   + 0.15% call_timer_fn
>> +   10.98%  swapper  [kernel.kallsyms]  [k] run_timer_softirq
>> -2.23%  swapper  [kernel.kallsyms]  [k] mod_timer
>>- mod_timer
>>   + 1.97% cursor_timer_handler
>>   + 0.26% call_timer_fn
>>
>> During the same period, I can see that another CPU is actively
>> executing the timer function:
>>
>> -   42.18%  kworker/u96:2  [kernel.kallsyms]  [k] ww_mutex_unlock
>>- ww_mutex_unlock
>>   - 40.70% ast_dirty_update
>>ast_imageblit
>>soft_cursor
>>bit_cursor
>>fb_flashcursor
>>process_one_work
>>worker_thread
>>kthread
>>ret_from_fork
>>   + 1.48% ast_imageblit
>> -   40.15%  kworker/u96:2  [kernel.kallsyms]  [k] __memcpy_toio
>>- __memcpy_toio
>>   + 31.54% ast_dirty_update
>>   + 8.61% ast_imageblit
>>
>> Using the graph function tracer on fb_flashcursor(), I see that
>> ast_dirty_update usually takes around 60 us, in which it makes 16
>> calls to __memcpy_toio(). However, there is always one instance on
>> every boot of the installer where ast_dirty_update() takes ~98 *ms* to
>> complete, during which it makes 743 calls to __memcpy_toio(). While
>> that  doesn't directly account for the full 22s, I do wonder if that
>> maybe a smoking gun.
>>
>> fyi, this is being tracked at: https://bugs.launchpad.net/bugs/1574814
>>
>>   -dann



Re: tty crash in Linux 4.6

2016-05-16 Thread Peter Hurley
Hi Mikulas,

On 05/16/2016 01:12 PM, Mikulas Patocka wrote:
> Hi
> 
> In the kernel 4.6 I get crashes in the tty layer. I can reproduce the 
> crash by logging into the machine with ssh and typing before the prompt 
> appears.

Thanks for the report.
I tried to reproduce this a number of times on different machines
with no luck.


> The crash is caused by the pointer tty->disc_data being NULL in the 
> function n_tty_receive_buf_common. The crash happens on the statement 
> smp_load_acquire(&ldata->read_tail).
> 
> Bisecting shows that the crashes are caused by the patch 
> 892d1fa7eaaed9d3c04954cb140c34ebc3393932 ("tty: Destroy ldisc instance on 
> hangup").


Can you try the test patch below?

Regards,
Peter Hurley


> Kernel Fault: Code=15 regs=7d9e0720 (Addr=2260)
> CPU: 0 PID: 3319 Comm: kworker/u8:0 Not tainted 4.6.0 #1
> Workqueue: events_unbound flush_to_ldisc
> task: 7c25ea80 ti: 7d9e task.ti: 7d9e
> 
>  YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 1100 Not tainted
> r00-03  0804000f 4076cd10 40475fb4 7f761800
> r04-07  40749510 0001 7f761800 7d9e0490
> r08-11  7e722890  7da4ec00 7f763823
> r12-15   7fc08ea8 7fc08c78 4080e080
> r16-19  7fc08c00 0001  2260
> r20-23  7f7618b0 7c25ea80 0001 0001
> r24-27   080f 7f7618ac 40749510
> r28-31  0001 7d9e0840 7d9e0720 0001
> sr00-03  086c8800   086c8800
> sr04-07     
> 
> IASQ:   IAOQ: 40475fd4 
> 40475fd8
>  IIR: 0e6c00d5ISR:   IOR: 2260
>  CPU:0   CR30: 7d9e CR31: ff87e7ffbc9e
>  ORIG_R28: 4080a180
>  IAOQ[0]: n_tty_receive_buf_common+0xb4/0xbe0
>  IAOQ[1]: n_tty_receive_buf_common+0xb8/0xbe0
>  RP(r2): n_tty_receive_buf_common+0x94/0xbe0
> Backtrace:
>  [<40476b14>] n_tty_receive_buf2+0x14/0x20
>  [<4047a208>] tty_ldisc_receive_buf+0x30/0x90
>  [<4047a544>] flush_to_ldisc+0x144/0x1c8
>  [<402556bc>] process_one_work+0x1b4/0x460
>  [<40255bbc>] worker_thread+0x1e4/0x5e0
>  [<4025d454>] kthread+0x134/0x168

--- >% ---
diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index 68947f6..f271832 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -653,7 +653,7 @@ static void tty_reset_termios(struct tty_struct *tty)
  * Returns 0 if successful, otherwise error code < 0
  */
 
-int tty_ldisc_reinit(struct tty_struct *tty, int disc)
+static int __tty_ldisc_reinit(struct tty_struct *tty, int disc)
 {
struct tty_ldisc *ld;
int retval;
@@ -682,6 +682,16 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
return retval;
 }
 
+int tty_ldisc_reinit(struct tty_struct *tty, int disc)
+{
+   int retval;
+
+   tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
+   retval = __tty_ldisc_reinit(tty, disc);
+   tty_ldisc_unlock(tty);
+   return retval;
+}
+
 /**
  * tty_ldisc_hangup-   hangup ldisc reset
  * @tty: tty being hung up
@@ -732,8 +742,8 @@ void tty_ldisc_hangup(struct tty_struct *tty, bool reinit)
 
if (tty->ldisc) {
if (reinit) {
-   if (tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
-   tty_ldisc_reinit(tty, N_TTY);
+   if (__tty_ldisc_reinit(tty, tty->termios.c_line) < 0)
+   __tty_ldisc_reinit(tty, N_TTY);
} else
tty_ldisc_kill(tty);
}



Re: [PATCH 1/1] tty/serial: to support 8250 earlycon can be enabled independently

2016-05-16 Thread Peter Hurley
On 05/16/2016 04:35 AM, Zhen Lei wrote:
> Sometimes, we may only use SSH to login, and build 8250 uart driver as a
> ko(insmod if needed). But the earlycon may still be necessary, because
> the kernel boot process may take a long time. It's not good to display
> nothing but ask people to wait patiently.

I'm confused; you want the possibility of earlycon but _not_ a normal
serial console?

This configuration is unsafe because nothing prevents the 8250 driver
and 8250 earlycon from concurrently accessing the hardware.


> In addition, the 8250.ko can not be worked if we have not opened any
> other serial drivers, because SERIAL_CORE would not be selected.

I don't understand what this means.

Regards,
Peter Hurley


> Signed-off-by: Zhen Lei 
> ---
>  drivers/tty/serial/8250/Kconfig  | 9 +++--
>  drivers/tty/serial/8250/Makefile | 1 -
>  drivers/tty/serial/Makefile  | 1 +
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
> index 4d7cb9c..2992f0a 100644
> --- a/drivers/tty/serial/8250/Kconfig
> +++ b/drivers/tty/serial/8250/Kconfig
> @@ -3,6 +3,12 @@
>  # you somehow have an implicit or explicit dependency on SERIAL_8250.
>  #
> 
> +config SERIAL_8250_EARLYCON
> + bool "Early console using 8250"
> + select SERIAL_CORE
> + select SERIAL_CORE_CONSOLE
> + select SERIAL_EARLYCON
> +
>  config SERIAL_8250
>   tristate "8250/16550 and compatible serial support"
>   select SERIAL_CORE
> @@ -60,8 +66,7 @@ config SERIAL_8250_PNP
>  config SERIAL_8250_CONSOLE
>   bool "Console on 8250/16550 and compatible serial port"
>   depends on SERIAL_8250=y
> - select SERIAL_CORE_CONSOLE
> - select SERIAL_EARLYCON
> + select SERIAL_8250_EARLYCON
>   ---help---
> If you say Y here, it will be possible to use a serial port as the
> system console (the system console is the device which receives all
> diff --git a/drivers/tty/serial/8250/Makefile 
> b/drivers/tty/serial/8250/Makefile
> index c9a2d6e..1f24c74 100644
> --- a/drivers/tty/serial/8250/Makefile
> +++ b/drivers/tty/serial/8250/Makefile
> @@ -13,7 +13,6 @@ obj-$(CONFIG_SERIAL_8250_HP300) += 8250_hp300.o
>  obj-$(CONFIG_SERIAL_8250_CS) += serial_cs.o
>  obj-$(CONFIG_SERIAL_8250_ACORN)  += 8250_acorn.o
>  obj-$(CONFIG_SERIAL_8250_BCM2835AUX) += 8250_bcm2835aux.o
> -obj-$(CONFIG_SERIAL_8250_CONSOLE)+= 8250_early.o
>  obj-$(CONFIG_SERIAL_8250_FOURPORT)   += 8250_fourport.o
>  obj-$(CONFIG_SERIAL_8250_ACCENT) += 8250_accent.o
>  obj-$(CONFIG_SERIAL_8250_BOCA)   += 8250_boca.o
> diff --git a/drivers/tty/serial/Makefile b/drivers/tty/serial/Makefile
> index 8c261ad..cd84181 100644
> --- a/drivers/tty/serial/Makefile
> +++ b/drivers/tty/serial/Makefile
> @@ -19,6 +19,7 @@ obj-$(CONFIG_SERIAL_SUNSAB) += sunsab.o
> 
>  # Now bring in any enabled 8250/16450/16550 type drivers.
>  obj-$(CONFIG_SERIAL_8250) += 8250/
> +obj-$(CONFIG_SERIAL_8250_EARLYCON) += 8250/8250_early.o
> 
>  obj-$(CONFIG_SERIAL_AMBA_PL010) += amba-pl010.o
>  obj-$(CONFIG_SERIAL_AMBA_PL011) += amba-pl011.o
> --
> 2.5.0
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-serial" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-16 Thread Peter Hurley
On 05/16/2016 05:17 AM, Paul E. McKenney wrote:
> On Mon, May 16, 2016 at 01:09:48PM +0200, Peter Zijlstra wrote:
>> On Fri, May 13, 2016 at 10:58:05AM -0700, Peter Hurley wrote:
>>>> Note that barrier() and READ_ONCE() have overlapping but not identical
>>>> results and the combined use actually makes sense here.
>>>>
>>>> Yes, a barrier() anywhere in the loop will force a reload of the
>>>> variable, _however_ it doesn't force that reload to not suffer from
>>>> load tearing.
>>>>
>>>> Using volatile also forces a reload, but also ensures the load cannot
>>>> be torn IFF it is of machine word side and naturally aligned.
>>>>
>>>> So while the READ_ONCE() here is pointless for forcing the reload;
>>>> that's already ensured, we still need to make sure the load isn't torn.
>>>
>>> If load tearing a naturally aligned pointer is a real code generation
>>> possibility then the rcu list code is broken too (which loads ->next
>>> directly; cf. list_for_each_entry_rcu() & list_for_each_entry_lockless()).
>>>
>>> For 4.4, Paul added READ_ONCE() checks for list_empty() et al, but iirc
>>> that had to do with control dependencies and not load tearing.
>>
>> Well, Paul is the one who started the whole load/store tearing thing, so
>> I suppose he knows what he's doing.
> 
> That had to do with suppressing false positives for one of Dmitry
> Vjukov's concurrency checkers.  I suspect that Peter Hurley is right
> that continued use of that checker would identify other places needing
> READ_ONCE(), but from what I understand that is on hold pending a formal
> definition of the Linux-kernel memory model.  (KCC and Dmitry (CCed)
> can correct my if I am confused on this point.)
> 
>> That said; its a fairly recent as things go so lots of code hasn't been
>> updated yet, and its also a very unlikely thing for a compiler to do;
>> since it mostly doesn't make sense to emit multiple instructions where
>> one will do, so its not a very high priority thing either.
>>
>> But from what I understand, the compiler is free to emit all kinds of
>> nonsense for !volatile loads/stores.
> 
> That is quite true.  :-/
> 
>>> OTOH, this patch might actually produce store-tearing:
>>>
>>> +static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
>>> +{
>>> +   /*
>>> +* We check the owner value first to make sure that we will only
>>> +* do a write to the rwsem cacheline when it is really necessary
>>> +* to minimize cacheline contention.
>>> +*/
>>> +   if (sem->owner != RWSEM_READER_OWNED)
>>> +   sem->owner = RWSEM_READER_OWNED;
>>> +}
>>
>> Correct; which is why we should always use {READ,WRITE}_ONCE() for
>> anything that is used locklessly.
> 
> Completely agreed.  Improve readability of code by flagging lockless
> shared-memory accesses, help checkers better find bugs, and prevent the
> occasional compiler mischief!

I think this would be a mistake for 3 reasons:

1. If READ_ONCE()/WRITE_ONCE() is necessary to prevent load/store tearing
   of any normally-atomic type (char/int/long/void*), then _every_ access
   would require READ_ONCE()/WRITE_ONCE(), thus eliminating any possibility
   of compiler optimization (eg. eliding redundant loads) where it would
   otherwise be possible.

2. Makes a mess of otherwise readable code.

3. Error-prone; ie., easy to overlook in review.

There is no practical difference between _always_ using READ_ONCE()/WRITE_ONCE()
(to prevent tearing) and declaring the field volatile.

So we've come full-circle from volatile-considered-harmful.

Regards,
Peter Hurley



Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-13 Thread Peter Hurley
On 05/13/2016 08:07 AM, Peter Zijlstra wrote:
> On Wed, May 11, 2016 at 03:04:20PM -0700, Peter Hurley wrote:
>>> +   return !rwsem_is_reader_owned(READ_ONCE(sem->owner));
>>
>> It doesn't make sense to force reload sem->owner here; if sem->owner
>> is not being reloaded then the loop above will execute forever.
>>
>> Arguably, this check should be bumped out to the optimistic spin and
>> reload/check the owner there?
>>
> 
> Note that barrier() and READ_ONCE() have overlapping but not identical
> results and the combined use actually makes sense here.
> 
> Yes, a barrier() anywhere in the loop will force a reload of the
> variable, _however_ it doesn't force that reload to not suffer from
> load tearing.
> 
> Using volatile also forces a reload, but also ensures the load cannot
> be torn IFF it is of machine word side and naturally aligned.
> 
> So while the READ_ONCE() here is pointless for forcing the reload;
> that's already ensured, we still need to make sure the load isn't torn.

If load tearing a naturally aligned pointer is a real code generation
possibility then the rcu list code is broken too (which loads ->next
directly; cf. list_for_each_entry_rcu() & list_for_each_entry_lockless()).

For 4.4, Paul added READ_ONCE() checks for list_empty() et al, but iirc
that had to do with control dependencies and not load tearing.

OTOH, this patch might actually produce store-tearing:

+static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
+{
+   /*
+* We check the owner value first to make sure that we will only
+* do a write to the rwsem cacheline when it is really necessary
+* to minimize cacheline contention.
+    */
+   if (sem->owner != RWSEM_READER_OWNED)
+   sem->owner = RWSEM_READER_OWNED;
+}


Regards,
Peter Hurley


Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-12 Thread Peter Hurley
On 05/12/2016 01:15 PM, Waiman Long wrote:
> On 05/11/2016 06:04 PM, Peter Hurley wrote:

[...]

>>
>>> @@ -328,8 +329,6 @@ done:
>>>   static noinline
>>>   bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct 
>>> *owner)
>>>   {
>>> -long count;
>>> -
>>>   rcu_read_lock();
>>>   while (sem->owner == owner) {
>>>   /*
>>> @@ -350,16 +349,11 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, 
>>> struct task_struct *owner)
>>>   }
>>>   rcu_read_unlock();
>>>
>>> -if (READ_ONCE(sem->owner))
>>> -return true; /* new owner, continue spinning */
>>> -
>>>   /*
>>> - * When the owner is not set, the lock could be free or
>>> - * held by readers. Check the counter to verify the
>>> - * state.
>>> + * If there is a new owner or the owner is not set, we continue
>>> + * spinning.
>>>*/
>>> -count = READ_ONCE(sem->count);
>>> -return (count == 0 || count == RWSEM_WAITING_BIAS);
>>> +return !rwsem_is_reader_owned(READ_ONCE(sem->owner));
>> It doesn't make sense to force reload sem->owner here; if sem->owner
>> is not being reloaded then the loop above will execute forever.
> 
> I agree that we don't actually need to use READ_ONCE() here for sem->owner as 
> the barrier() call will force the reloading. It is more like a habit to use 
> it for public variable or we will have to think a bit harder to make sure 
> that we are doing the right thing.

I look at that code and I think "what am I missing that it needs to reload at 
exit"
Extra cruft is just as misleading.

What matters while spinning on owner is that the reload is forced as part of the
loop, not at exit time. The whole point of rwsem_spin_on_owner() is to spin
until the owner _changes_; there's no need to get a more recent owner value
than the one that caused the loop to break.


>> Arguably, this check should be bumped out to the optimistic spin and
>> reload/check the owner there?
>>
>> Or better yet; don't pass the owner in as a parameter at all, but
>> instead snapshot the owner and check its ownership on entry.
> 
> That will make the main optimistic spinning loop more complex.

??

Simpler.

while (rwsem_spin_on_owner(sem)) {
if (rwsem_try_write_lock_unqueued(sem)) {
taken = true;
break;
}

if (!sem->owner && (need_resched() || rt_task(current)))
break;

cpu_relax_lowlatency();
}




bool rwsem_spin_on_owner(struct rw_semaphore *sem)
{
struct task_struct *owner = READ_ONCE(sem->owner);

if (!rwsem_is_writer_owned(owner))
return false;

rcu_read_lock();
while (sem->owner == owner) {

}
rcu_read_unlock();

return !rwsem_is_reader_owned(sem->owner);
}


>>
>> Because see below...
>>
>>>   }
>>>
>>>   static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>>> @@ -378,7 +372,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore 
>>> *sem)
>>>
>>>   while (true) {
>>>   owner = READ_ONCE(sem->owner);
>>> -if (owner&&  !rwsem_spin_on_owner(sem, owner))
>>> +if (rwsem_is_writer_owned(owner)&&
>>> +   !rwsem_spin_on_owner(sem, owner))
>>>   break;
>>>
>>>   /* wait_lock will be acquired if write_lock is obtained */
>>> @@ -391,9 +386,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore 
>>> *sem)
>>>* When there's no owner, we might have preempted between the
>>>* owner acquiring the lock and setting the owner field. If
>>>* we're an RT task that will live-lock because we won't let
>>> - * the owner complete.
>>> + * the owner complete. We also quit if the lock is owned by
>>> + * readers.
>>>*/
>>> -if (!owner&&  (need_resched() || rt_task(current)))
>>> +if (rwsem_is_reader_owned(owner) ||
>>> +   (!owner&&  (need_resched() || rt_task(current
>> This is using the stale owner value that was cached before spinning on the 
>> owner;
>> That can't be right.
> 
> The code is going to loop back and reload the new owner value anyway. It is 
> just a bit of additional latency. I will move the is_reader check up after 
> loading sem->owner to clear any confusion.

Well, why do it at all then?  Just before attempting the lock, 
rwsem_spin_on_owner()
determined that the owner was not a reader. If a reader became the new owner 
after
that, the loop will discover that the lock is no longer writer owned anyway.

Either way, the check here should be to sem->owner (ie., like I wrote above),
but there's no need to force reload here either.

Regards,
Peter Hurley




Re: [PATCH v2] locking/rwsem: Add reader-owned state to the owner field

2016-05-11 Thread Peter Hurley
t; - long count;
> -
>   rcu_read_lock();
>   while (sem->owner == owner) {
>   /*
> @@ -350,16 +349,11 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, 
> struct task_struct *owner)
>   }
>   rcu_read_unlock();
>  
> - if (READ_ONCE(sem->owner))
> - return true; /* new owner, continue spinning */
> -
>   /*
> -  * When the owner is not set, the lock could be free or
> -  * held by readers. Check the counter to verify the
> -  * state.
> +  * If there is a new owner or the owner is not set, we continue
> +  * spinning.
>*/
> - count = READ_ONCE(sem->count);
> - return (count == 0 || count == RWSEM_WAITING_BIAS);
> + return !rwsem_is_reader_owned(READ_ONCE(sem->owner));

It doesn't make sense to force reload sem->owner here; if sem->owner
is not being reloaded then the loop above will execute forever.

Arguably, this check should be bumped out to the optimistic spin and
reload/check the owner there?

Or better yet; don't pass the owner in as a parameter at all, but
instead snapshot the owner and check its ownership on entry.

Because see below...

>  }
>  
>  static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> @@ -378,7 +372,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore 
> *sem)
>  
>   while (true) {
>   owner = READ_ONCE(sem->owner);
> - if (owner && !rwsem_spin_on_owner(sem, owner))
> + if (rwsem_is_writer_owned(owner) &&
> +!rwsem_spin_on_owner(sem, owner))
>   break;
>  
>   /* wait_lock will be acquired if write_lock is obtained */
> @@ -391,9 +386,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore 
> *sem)
>* When there's no owner, we might have preempted between the
>* owner acquiring the lock and setting the owner field. If
>* we're an RT task that will live-lock because we won't let
> -  * the owner complete.
> +  * the owner complete. We also quit if the lock is owned by
> +  * readers.
>*/
> - if (!owner && (need_resched() || rt_task(current)))
> + if (rwsem_is_reader_owned(owner) ||
> +(!owner && (need_resched() || rt_task(current

This is using the stale owner value that was cached before spinning on the 
owner;
That can't be right.

Regards,
Peter Hurley

>   break;
>  
>   /*
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index c817216..5838f56 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -22,6 +22,7 @@ void __sched down_read(struct rw_semaphore *sem)
>   rwsem_acquire_read(&sem->dep_map, 0, 0, _RET_IP_);
>  
>   LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
> + rwsem_set_reader_owned(sem);
>  }
>  
>  EXPORT_SYMBOL(down_read);
> @@ -33,8 +34,10 @@ int down_read_trylock(struct rw_semaphore *sem)
>  {
>   int ret = __down_read_trylock(sem);
>  
> - if (ret == 1)
> + if (ret == 1) {
>   rwsem_acquire_read(&sem->dep_map, 0, 1, _RET_IP_);
> + rwsem_set_reader_owned(sem);
> + }
>   return ret;
>  }
>  
> @@ -124,7 +127,7 @@ void downgrade_write(struct rw_semaphore *sem)
>* lockdep: a downgraded write will live on as a write
>* dependency.
>*/
> - rwsem_clear_owner(sem);
> + rwsem_set_reader_owned(sem);
>   __downgrade_write(sem);
>  }
>  
> @@ -138,6 +141,7 @@ void down_read_nested(struct rw_semaphore *sem, int 
> subclass)
>   rwsem_acquire_read(&sem->dep_map, subclass, 0, _RET_IP_);
>  
>   LOCK_CONTENDED(sem, __down_read_trylock, __down_read);
> + rwsem_set_reader_owned(sem);
>  }
>  
>  EXPORT_SYMBOL(down_read_nested);
> diff --git a/kernel/locking/rwsem.h b/kernel/locking/rwsem.h
> index 870ed9a..d7fea18 100644
> --- a/kernel/locking/rwsem.h
> +++ b/kernel/locking/rwsem.h
> @@ -1,3 +1,20 @@
> +/*
> + * The owner field of the rw_semaphore structure will be set to
> + * RWSEM_READ_OWNED when a reader grabs the lock. A writer will clear
> + * the owner field when it unlocks. A reader, on the other hand, will
> + * not touch the owner field when it unlocks.
> + *
> + * In essence, the owner field now has the following 3 states:
> + *  1) 0
> + * - lock is free or the owner hasn't set the field yet
> + *  2) RWSEM_READER_OWNED
> + * - lock is currently or previously owned by readers (lock is free
> + *   or not

Re: [PATCH 1/4] locking/rwsem: Avoid stale ->count for rwsem_down_write_failed()

2016-05-08 Thread Peter Hurley
On 05/08/2016 09:56 PM, Davidlohr Bueso wrote:
> The field is obviously updated w.o the lock and needs a READ_ONCE
> while waiting for lock holder(s) to go away, just like we do with
> all other ->count accesses.

This isn't actually fixing a bug because it's passed through
several full barriers which will force reloading from sem->count.

I think the patch is ok if you want it just for consistency anyway,
but please change $subject and changelog.

Regards,
Peter Hurley


> Signed-off-by: Davidlohr Bueso 
> ---
>  kernel/locking/rwsem-xadd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index df4dcb883b50..7d62772600cf 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -494,7 +494,7 @@ __rwsem_down_write_failed_common(struct rw_semaphore 
> *sem, int state)
>   }
>   schedule();
>   set_current_state(state);
> - } while ((count = sem->count) & RWSEM_ACTIVE_MASK);
> + } while ((count = READ_ONCE(sem->count)) & RWSEM_ACTIVE_MASK);
>  
>   raw_spin_lock_irq(&sem->wait_lock);
>   }
> 



Re: [PATCH 0/1] devpts: Removing the need for pt_chown

2016-05-06 Thread Peter Hurley
On 05/06/2016 12:35 PM, Greg KH wrote:
> On Fri, May 06, 2016 at 02:04:12PM -0500, Eric W. Biederman wrote:
>>
>> Greg,
>>
>> Could you please apply the following patch to tty-next so it can be
>> merged into 4.7-rc1.
>>
>> We have had a long series of discussions and in the last iteration we
>> finally converged on a set of semantics that does not break userspace
>> and also makes the code simpler.
> 
> Did everyone agree?  I didn't think so, but the thread got long and
> messy.  And then Linus did some work on this as well.
> 
> How does this play with what Linus proposed?  I think only portions of
> his original changes are merged, and there are still outstanding parts,
> right?

Linus committed his proposed pty changes which sits in -rc6 and not
in tty-next (which is based on -rc5):

commit 8ead9dd54716d1e05e129959f702fcc1786f82b4
Author: Linus Torvalds 
Date:   Mon Apr 25 20:04:08 2016 -0700

devpts: more pty driver interface cleanups

This is more prep-work for the upcoming pty changes.  Still just code
cleanup with no actual semantic changes.

This removes a bunch pointless complexity by just having the slave pty
side remember the dentry associated with the devpts slave rather than
the inode.  That allows us to remove all the "look up the dentry" code
for when we want to remove it again.




Linus's changes look good to me and I've been running them cherry-picked on
my private tty-next testing tree since.

When Greg picks up -rc6 (not sure he was going to do that pre-merge window?),
I'd also like to push the devpts_mutex locking down into fs/devpts/inode.c,
but I'd be willing to do that later, if it's going to get in the way.

Regards,
Peter Hurley


Re: [BUG] drivers/tty: read() on a noncanonical blocking tty randomly fails when VMIN > received >= buf

2016-05-05 Thread Peter Hurley
On 05/05/2016 03:08 AM, One Thousand Gnomes wrote:
> On Wed, 4 May 2016 16:07:44 -0700
> Peter Hurley  wrote:
> 
>> Hi Julio,
>>
>> On 05/04/2016 04:00 PM, Julio Guerra wrote:
>>> Hi,
>>>
>>> When a tty (here a slave pty) is set in noncanonical input and blocking 
>>> read modes, a read() randomly blocks when:
>>> "VMIN > kernel received >= user buffer size > 0".
>>>
>>> The standard says that read() should block until VMIN bytes are received 
>>> [1][2]. Whether this is an implementation defined case not really specified 
>>> by POSIX or not, it should not behave randomly (otherwise it really should 
>>> be documented in termios manpage).  
>>
>> This is not a bug.
>>
>> >From the termios(3) man page:  
>>
>>* MIN > 0; TIME == 0: read(2) blocks until the lesser of MIN bytes or 
>> the number of bytes requested are avail‐
>>  able, and returns the lesser of these two values.
> 
> The standard says
> 
>   Case B: MIN>0, TIME=0
> 
>   In case B, since the value of TIME is zero, the timer plays no
>   role and only MIN is significant. A pending read shall not be
>   satisfied until MIN bytes are received (that is, the pending read
>   shall block until MIN bytes are received), or a signal is
>   received. A program that uses case B to read record-based
>   terminal I/O may block indefinitely in the read operation.
> 
> That is if you do 
> 
> 
>   read(fd, buf, 3)
> 
> and MIN is 5, the read should not return until there are 5 bytes in the
> queue. The following code is guaranteed to work reliably by the standard
> with TIME 0 MIN 5 (ignoring signals for the moment)
> 
> 
>   read(fd, buf, 3);
>   fcntl(fd, F_SETFL, FNDELAY);
>   assert(read(fd, buf, 2) == 2);
> 
> Historically this behaviour was useful for things like block transfer
> protocols, especially with offloaded serial processing.
> 
> So actually I think we do have a bug, the behaviuour is not standards
> compliant, and the man page documents the erroneous behaviour.

I disagree; I think SUSv4 fails to address this degenerate condition at all.
For example, SUSv4 specifically states that there is no precedence of
MIN/TIME with O_NONBLOCK. IOW, the standard does _not_ guarantee that your
code fragment above won't block on the subsequent read anyway since it fails
to meet the new MIN 5 watermark.

But I have no problem fixing a bona fide regression; what's broken?



Re: [BUG] drivers/tty: read() on a noncanonical blocking tty randomly fails when VMIN > received >= buf

2016-05-04 Thread Peter Hurley
On 05/04/2016 04:27 PM, Julio Guerra wrote:
>>> When a tty (here a slave pty) is set in noncanonical input and blocking 
>>> read modes, a read() randomly blocks when:
>>> "VMIN > kernel received >= user buffer size > 0".
>>>
>>> The standard says that read() should block until VMIN bytes are received 
>>> [1][2]. Whether this is an implementation defined case not really specified 
>>> by POSIX or not, it should not behave randomly (otherwise it really should 
>>> be documented in termios manpage).
>>
>> This is not a bug.
>>
>> From the termios(3) man page:
>>
>>* MIN > 0; TIME == 0: read(2) blocks until the lesser of MIN bytes or 
>> the number of bytes requested are avail‐
>>  able, and returns the lesser of these two values.
>>
> 
> This does not appear in my man...
> 
> Anyway, how do you explain the random behavior then?

A long standing bug in this read mode allows the asynchronous input
processing thread to race with the read() thread and become confused
about how much data remains.

I fixed this in 4.6; when I run your test on 4.6, it consistently
returns the full user buffer.

Regards,
Peter Hurley




Re: [BUG] drivers/tty: read() on a noncanonical blocking tty randomly fails when VMIN > received >= buf

2016-05-04 Thread Peter Hurley
Hi Julio,

On 05/04/2016 04:00 PM, Julio Guerra wrote:
> Hi,
> 
> When a tty (here a slave pty) is set in noncanonical input and blocking read 
> modes, a read() randomly blocks when:
> "VMIN > kernel received >= user buffer size > 0".
> 
> The standard says that read() should block until VMIN bytes are received 
> [1][2]. Whether this is an implementation defined case not really specified 
> by POSIX or not, it should not behave randomly (otherwise it really should be 
> documented in termios manpage).

This is not a bug.

>From the termios(3) man page:

   * MIN > 0; TIME == 0: read(2) blocks until the lesser of MIN bytes or 
the number of bytes requested are avail‐
 able, and returns the lesser of these two values.

Regards,
Peter Hurley


> I isolated it in the following example (with VMIN = 5, received = 4, user 
> buffer = 3):
> https://gist.github.com/Julio-Guerra/b3fdefab281403073607d81cabcea04a
> 
> Since it is random, you will need run it several times to observe both cases. 
> When correctly behaving, it should block in the read() for ever (C-c to kill 
> it). When incorrectly behaving, it does not block and reads into the user 
> buffer what is present in the kernel receive buffer (string "any").
> 
> Example where it does not block 3 times out of 4:
>  > $ ./a.out
>  > res=3 buf=any
>  > $ ./a.out
>  > res=3 buf=any
>  > $ ./a.out
>  > ^C⏎
> 
> Linux version 4.5.1-1-ARCH (builduser@tobias) (gcc version 5.3.0 (GCC) ) #1 
> SMP PREEMPT Thu Apr 14 19:19:32 CEST 2016
> 
> [1] "Canonical and noncanonical mode", man termios
> [2] http://www.gnu.org/software/libc/manual/html_node/Noncanonical-Input.html
> 



Re: [serial] 13a7238eea: WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 uart_change_pm+0x35/0x162

2016-05-03 Thread Peter Hurley
Hi,

On 05/03/2016 08:13 PM, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing
> commit 13a7238eeab5718bf968e2a835205ba659a38a77 ("serial: core: Prevent 
> unsafe uart port access, part 3")

This should be fixed on most recent tty-testing and tty-next branches.

Please re-test on last tty-testing branch including commit 
7da4b8b7378790dd1e4af1bb7522863127fa1438
("serial: core: Fix port mutex assert if lockdep disabled")

Regards,
Peter Hurley


> on test machine: vm-vp-quantal-x86_64: 2 threads qemu-system-x86_64 
> -enable-kvm with 360M memory
> 
> caused below changes:
> 
> 
> [7.249323] [ cut here ]
> [7.249323] [ cut here ]
> [7.250172] WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 
> uart_change_pm+0x35/0x162
> [7.250172] WARNING: CPU: 0 PID: 1 at drivers/tty/serial/serial_core.c:99 
> uart_change_pm+0x35/0x162
> [7.252110] Modules linked in:
> [7.252110] Modules linked in:
> 
> [7.252679] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GB   
> 4.6.0-rc5-00015-g13a7238 #1
> [7.252679] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GB   
> 4.6.0-rc5-00015-g13a7238 #1
> [7.254232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Debian-1.8.2-1 04/01/2014
> [7.254232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Debian-1.8.2-1 04/01/2014
> [7.255806]  
> [7.255806]   88000f42f668 88000f42f668 
> 815a2566 815a2566 0286 0286
> 
> [7.257180]  41b58ab3
> [7.257180]  41b58ab3 81f66067 81f66067 
> 815a240c 815a240c 81f7647d 81f7647d
> 
> [7.258601]  81238abf
> [7.258601]  81238abf 0042f6a8 0042f6a8 
> ed0001e85ecb ed0001e85ecb  
> 
> [7.259973] Call Trace:
> [7.259973] Call Trace:
> [7.260422]  [] dump_stack+0x15a/0x1de
> [7.260422]  [] dump_stack+0x15a/0x1de
> [7.261374]  [] ? _atomic_dec_and_lock+0xaf/0xaf
> [7.261374]  [] ? _atomic_dec_and_lock+0xaf/0xaf
> [7.262470]  [] ? is_module_text_address+0x1f/0x1f
> [7.262470]  [] ? is_module_text_address+0x1f/0x1f
> [7.263595]  [] ? uart_change_pm+0x35/0x162
> [7.263595]  [] ? uart_change_pm+0x35/0x162
> [7.264623]  [] __warn+0x16e/0x189
> [7.264623]  [] __warn+0x16e/0x189
> [7.265523]  [] warn_slowpath_null+0x18/0x1a
> [7.265523]  [] warn_slowpath_null+0x18/0x1a
> [7.266561]  [] uart_change_pm+0x35/0x162
> [7.266561]  [] uart_change_pm+0x35/0x162
> [7.267553]  [] uart_add_one_port+0xb36/0x1008
> [7.267553]  [] uart_add_one_port+0xb36/0x1008
> [7.268637]  [] ? uart_unregister_driver+0x150/0x150
> [7.268637]  [] ? uart_unregister_driver+0x150/0x150
> [7.269786]  [] ? mutex_trylock+0x4eb/0x4eb
> [7.269786]  [] ? mutex_trylock+0x4eb/0x4eb
> [7.270838]  [] 
> serial8250_register_8250_port+0xf68/0x14e5
> [7.270838]  [] 
> serial8250_register_8250_port+0xf68/0x14e5
> [7.272116]  [] serial_pnp_probe+0x2d8/0x5f1
> [7.272116]  [] serial_pnp_probe+0x2d8/0x5f1
> [7.273146]  [] ? check_name+0x6e/0x6e
> [7.273146]  [] ? check_name+0x6e/0x6e
> [7.274125]  [] ? check_name+0x6e/0x6e
> [7.274125]  [] ? check_name+0x6e/0x6e
> [7.275076]  [] pnp_device_probe+0x1c1/0x1dc
> [7.275076]  [] pnp_device_probe+0x1c1/0x1dc
> [7.276109]  [] driver_probe_device+0x325/0x63c
> [7.276109]  [] driver_probe_device+0x325/0x63c
> [7.277191]  [] ? driver_probe_device+0x63c/0x63c
> [7.277191]  [] ? driver_probe_device+0x63c/0x63c
> [7.278299]  [] __driver_attach+0x138/0x179
> [7.278299]  [] __driver_attach+0x138/0x179
> [7.279316]  [] bus_for_each_dev+0x10d/0x143
> [7.279316]  [] bus_for_each_dev+0x10d/0x143
> [7.280372]  [] ? next_device+0x52/0x52
> [7.280372]  [] ? next_device+0x52/0x52
> [7.281334]  [] driver_attach+0x50/0x53
> [7.281334]  [] driver_attach+0x50/0x53
> [7.282302]  [] bus_add_driver+0x31d/0x53e
> [7.282302]  [] bus_add_driver+0x31d/0x53e
> [7.283346]  [] driver_register+0x24b/0x2c3
> [7.283346]  [] driver_register+0x24b/0x2c3
> [7.284370]  [] pnp_register_driver+0x98/0x9f
> [7.284370]  [] pnp_register_driver+0x98/0x9f
> [7.285420]  [] ? univ8250_console_init+0x28/0x28
> [7.285420]  [] ? univ8250_console_init+0x28/0x28
> [7.286579]  [] serial8250_pnp_init+0x10/0x12
> [7.286579]  [] serial8250

Re: [PATCH v2 0/4] Patches to allow consistent mmc / mmcblk numbering w/ device tree

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:03 PM, Doug Anderson wrote:
> Hi,
> 
> On Fri, Apr 29, 2016 at 4:58 PM, Peter Hurley  
> wrote:
> 
> On 04/29/2016 04:01 PM, Doug Anderson wrote:
> > * serial allows numbering devices by alias.
> 
> Which is in fact a total nightmare.
> 
> While stable device order is mandatory in serial because of
> console command line parameters and existing userspace expectations,
> it is the number one barrier to providing a shared ttyS namespace
> for mixed uart platforms.
> 
> Stable device order has a very real (and often unforeseen) maintenance
> burden.
> 
> 
> Interesting. I wonder if these burdens are unique to serial or shared
> by all the other subsystems that allow ordering? Maybe this is all
> because of legacy reasons?

Well, the specific issue is certainly unique to serial.
But what I was suggesting is that 5 years from now, these patches
could be the "legacy reasons" in mmc.

FWIW, there is already a defacto expectation by boot configurations in the
field that a given mmc block device is stable across boots. The reality
is that 10's of kernel command lines look like:

root=/dev/mmcblk0p2

This was a recent regression fixed by Ulf in commit 9aaf3437aa72
("mmc: block: Use the mmc host device index as the mmcblk device index")


> Note that for MMC devices we wouldn't be asserting that _every_
> device has a stable order. Only those known at boot.
> 
> 
> For example, I noticed these patches are strictly for DT.
> I'm assuming that's because guaranteeing stable device order for
> mmc in general is more difficult; eg., by performing minimal scan
> serially and more expensive portion of the probe async.
> 
> 
> Presumably if there were another system similar to DT where we knew
> all the possible built-in devices right at boot then we could
> extend.
> 
> ...but yes, obviously this wouldn't be a sane thing to do for
> anything probable like PCI or USB.

SCSI does. Here's what Linus said:

"So the *simple* model is to just scan the devices minimally serially,
and allocate the names at that point (so the names are reliable
between boots for the same hardware configuration). And then do the
more expensive device setup asynchronously (ie querying device
information, spinning up disks, whatever - things that can take
anything from milliseonds to several seconds, because they are doing
actual IO). So you'd do some very basic (and _often_ fairly quick)
operations serially, but then try to do the expensive parts
concurrently.

The SCSI layer actually goes a bit further than that: it has a fairly
asynchronous scanning thing, but it does allocate the actual host
device nodes serially, and then it even has an ordered list of
"scanning_hosts" that is used to complete the scanning in-order, so
that the sysfs devices show up in the right order even if things
actually got scanned out-of-order. So scans that finished early will
wait for other scans that are for "earlier" devices, and you end up
with what *looks* ordered to the outside, even if internally it was
all done out-of-order.
"

Regards,
Peter Hurley


Re: [PATCH v2 0/4] Patches to allow consistent mmc / mmcblk numbering w/ device tree

2016-04-29 Thread Peter Hurley
On 04/29/2016 04:01 PM, Doug Anderson wrote:
> * serial allows numbering devices by alias.

Which is in fact a total nightmare.

While stable device order is mandatory in serial because of
console command line parameters and existing userspace expectations,
it is the number one barrier to providing a shared ttyS namespace
for mixed uart platforms.

Stable device order has a very real (and often unforeseen) maintenance
burden.

For example, I noticed these patches are strictly for DT.
I'm assuming that's because guaranteeing stable device order for
mmc in general is more difficult; eg., by performing minimal scan
serially and more expensive portion of the probe async.

Regards,
Peter Hurley





Re: [PATCH v2 08/11] serial: sh-sci: Correct pin initialization on (H)SCIF

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Correct pin initialization on (H)SCIF:
>   - RTS must be deasserted (it's active low),
>   - SCK must be an input, as it may be used as the optional external
> clock input.
> 
> Initial pin configuration must always be done:
>   - Regardless of the presence of dedicated RTS and CTS pins: if the
> register exists, the RTS/CTS bits exist, too,
>   - Regardless of hardware flow control being enabled or not: RTS must
> be deasserted.

This is always setting RTS active; why?

Normally you want to program RTS only in response to ->set_mctrl()
from serial core.  For example, this will set RTS active even though
baud might be set to B0.


> Signed-off-by: Geert Uytterhoeven 
> ---
> v2:
>   - New.
> ---
>  drivers/tty/serial/sh-sci.c | 23 ---
>  1 file changed, 8 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
> index ce7bd165929ea078..c46999f20917964e 100644
> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -712,21 +712,14 @@ static void sci_init_pins(struct uart_port *port, 
> unsigned int cflag)
>   return;
>   }
>  
> - /*
> -  * For the generic path SCSPTR is necessary. Bail out if that's
> -  * unavailable, too.
> -  */
> - if (!sci_getreg(port, SCSPTR)->size)
> - return;
> -
> - if ((s->cfg->capabilities & SCIx_HAVE_RTSCTS) &&
> - ((!(cflag & CRTSCTS {
> - unsigned short status;
> -
> - status = serial_port_in(port, SCSPTR);
> - status &= ~SCSPTR_CTSIO;
> - status |= SCSPTR_RTSIO;
> - serial_port_out(port, SCSPTR, status); /* Set RTS = 1 */
> + if (sci_getreg(port, SCSPTR)->size) {
> + u16 status = serial_port_in(port, SCSPTR);
> +
> + /* RTS# is output, driven 1 */
> + status |= SCSPTR_RTSIO | SCSPTR_RTSDT;
> + /* CTS# and SCK are inputs */
> + status &= ~(SCSPTR_CTSIO | SCSPTR_SCKIO);
> + serial_port_out(port, SCSPTR, status);
>   }
>  }
>  
> 



Re: [PATCH v2 07/11] serial: sh-sci: Add more Serial Port Control/Data Register documentation

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Improve documentation for the SCIFA/SCIFB Serial Port Control and Data
> Registers:
>   - State clearly that the RTS and CTS lines are active-low,
>   - Document the bits related to the serial port's SCK, RXD, and TXD
> pins.

Reviewed-by: Peter Hurley 



Re: [PATCH v2 06/11] serial: sh-sci: Add more Serial Port Register documentation

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Improve documentation for the (H)SCIF Serial Port Register:
>   - Make it clear the RTS and CTS lines are active-low,
>   - Document the bits related to the serial port's clock pin.

Reviewed-by: Peter Hurley 



Re: [PATCH v2 05/11] serial: sh-sci: Do not open-code sci_getreg()

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Replace open-coded variants of sci_getreg() by function calls, and drop
> intermediate variables where appropriate.

Reviewed-by: Peter Hurley 



Re: [PATCH v2 04/11] serial: sh-sci: Add support for GPIO-controlled modem lines

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Enhance the Renesas SCI UART driver to add support for GPIO-controlled
> modem lines (CTS, DSR, DCD, RNG, RTS, DTR), using the serial_mctrl_gpio
> helpers.
> 
> GPIO-controlled modem lines can be used when dedicated modem lines are
> not available. Invalid configurations specifying both GPIO RTS/CTS and
> dedicated RTS/CTS are rejected.

Reviewed-by: Peter Hurley 



Re: [PATCH v2 03/11] serial: sh-sci: Always set TIOCM_CTS in .get_mctrl() callback

2016-04-29 Thread Peter Hurley
On 04/29/2016 05:58 AM, Geert Uytterhoeven wrote:
> Documentation/serial/driver clearly states:
> 
> If the port does not support CTS, DCD or DSR, the driver should
> indicate that the signal is permanently active.
> 
> Hence always set TIOCM_CTS, as we currently don't look at the CTS
> hardware line state at all.
> 
> FWIW, this fixes the transmit path when hardware-assisted flow control
> is enabled, and userspace enables CRTSCTS.
> The receive path is still broken, as RTS is never asserted.

Reviewed-by: Peter Hurley 



Re: [PATCH] 8250: Hypervisors always export working 16550A UARTs.

2016-04-28 Thread Peter Hurley
On 04/28/2016 03:18 PM, Richard W.M. Jones wrote:
> [This is an opinionated patch, mainly for discussion.]
> 
> I'm trying to reduce the time taken in the kernel in initcalls, with
> my aim being to reduce the current ~700ms spent in initcalls before
> userspace, down to something like 100ms.  All times on my Broadwell-U
> laptop, under virtualization.  The purpose of this is to be able to
> launch VMs around containers with minimal overhead, like Intel Clear
> Containers, but using standard distro kernels and qemu.
> 
> Currently the kernel spends 25ms inspecting the UART that we passed to
> it from qemu to find out whether it's an 8250/16550/16550A perhaps
> with a non-working FIFO or other quirks.  Well, it isn't -- it's a
> working emulated 16550A, with a FIFO and no quirks, and if it isn't,
> we should fix qemu.

I'm sure all of this delay is sizing the fifo:

static int size_fifo(struct uart_8250_port *up)
{
...
for (count = 0; count < 256; count++)
serial_out(up, UART_TX, count);
===>mdelay(20);/* FIXME - schedule_timeout */
for (count = 0; (serial_in(up, UART_LSR) & UART_LSR_DR) &&
 (count < 256); count++)
serial_in(up, UART_RX);
...

return count;
}


> So the patch detects if we're running virtualized (perhaps it should
> only check for qemu/KVM?) and if so, shortcuts the tests.

Yeah, sorry, that's not going to fly.

I'm assuming x86, yes?

There's are multiple ways already supported by this driver to specify
the port based on the platform:

1. Register "serial8250" platform device.
   Set UPF_FIXED_TYPE in ::flags and PORT_16550A in ::type
2. early_serial_setup()
   Same as above, but must be before console_init()
3. Don't enumerate PNP0501 ACPI devices
   These are going to be probed dynamically
4. Add ACPI/PNP custom device
   Fixing the port type isn't supported now but could be
5. Modify the SERIAL_PORT_DFNS
   Fixing the port type isn't supported now either but could be

Regards,
Peter Hurley  


Re: [PATCH V4] audit: add tty field to LOGIN event

2016-04-28 Thread Peter Hurley
On 04/28/2016 12:28 PM, Richard Guy Briggs wrote:
> On 16/04/27, Peter Hurley wrote:
>> On 04/27/2016 06:31 PM, Richard Guy Briggs wrote:
>>> On 16/04/22, Peter Hurley wrote:
>>>> On 04/21/2016 11:14 AM, Richard Guy Briggs wrote:
>>>>> The tty field was missing from AUDIT_LOGIN events.
>>>>>
>>>>> Refactor code to create a new function audit_get_tty(), using it to
>>>>> replace the call in audit_log_task_info() and to add it to
>>>>> audit_log_set_loginuid().  Lock and bump the kref to protect it, adding
>>>>> audit_put_tty() alias to decrement it.
>>>>>
>>>>> Signed-off-by: Richard Guy Briggs 
>>>>> ---
>>>>>
>>>>> V4: Add missing prototype for audit_put_tty() when audit syscall is not
>>>>> enabled (MIPS).
>>>>>
>>>>> V3: Introduce audit_put_tty() alias to decrement kref.
>>>>>
>>>>> V2: Use kref to protect tty signal struct while in use.
>>>>>
>>>>> ---
>>>>>
>>>>>  include/linux/audit.h |   24 
>>>>>  kernel/audit.c|   18 +-
>>>>>  kernel/auditsc.c  |8 ++--
>>>>>  3 files changed, 35 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>>>>> index b40ed5d..32cdafb 100644
>>>>> --- a/include/linux/audit.h
>>>>> +++ b/include/linux/audit.h
>>>>> @@ -26,6 +26,7 @@
>>>>>  #include 
>>>>>  #include 
>>>>>  #include 
>>>>> +#include 
>>>>>  
>>>>>  #define AUDIT_INO_UNSET ((unsigned long)-1)
>>>>>  #define AUDIT_DEV_UNSET ((dev_t)-1)
>>>>> @@ -343,6 +344,23 @@ static inline unsigned int 
>>>>> audit_get_sessionid(struct task_struct *tsk)
>>>>>   return tsk->sessionid;
>>>>>  }
>>>>>  
>>>>> +static inline struct tty_struct *audit_get_tty(struct task_struct *tsk)
>>>>> +{
>>>>> + struct tty_struct *tty = NULL;
>>>>> + unsigned long flags;
>>>>> +
>>>>> + spin_lock_irqsave(&tsk->sighand->siglock, flags);
>>>>> + if (tsk->signal)
>>>>> + tty = tty_kref_get(tsk->signal->tty);
>>>>> + spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
>>>>
>>>>
>>>> Not that I'm objecting because I get that you're just refactoring
>>>> existing code, but I thought I'd point out some stuff.
>>>>
>>>> 1. There's no need to check if signal_struct is NULL (ie. tsk->signal)
>>>>because if it is, this will blow up trying to dereference the
>>>>sighand_struct (ie tsk->sighand).
>>>
>>> Ok.  This logic goes back 10 years and one month less two days. (45d9bb0e)
>>>
>>>> 2. The existing usage is always tsk==current
>>>
>>> My understanding is that when it is called via:
>>>
>>> copy_process()
>>> audit_free()
>>> __audit_free()
>>> audit_log_exit()
>>> audit_log_task_info()
>>>
>>> then tsk != current.
>>
>> While it's true that tsk != current here, everything relevant to tty
>> in task_struct is the same because the nascent task is not even half-done.
>> So tsk->sighand == current->sighand, tsk->signal == current->signal etc.
> 
> I agree this is true except in the case of !CLONE_SIGHAND, if it fails
> after copy_sighand() or copy_signal() then it would be null and would
> get freed before audit_free() is called.  By the time tty gets copied
> from current in this case, it is past the point of failure in
> copy_process().

Oh, right.


>> If you're uncomfortable with pass-through execution like that, then the
>> simple solution is:
>>
>>  struct tty_struct *tty = NULL;
>>
>>  /* tsk != current when copy_process() failed */
>>  if (tsk == current)
>>  tty = get_current_tty();
>>
>> because tty_kref_put(tty) accepts NULL tty and (obviously) so does
>> tty_name(tty).
> 
> Given the circumstances above, this appears reasonable to me at first
> look.

Ok.

I'll spend more analysis time before I actuall

Re: [PATCH V4] audit: add tty field to LOGIN event

2016-04-27 Thread Peter Hurley
On 04/27/2016 06:31 PM, Richard Guy Briggs wrote:
> On 16/04/22, Peter Hurley wrote:
>> On 04/21/2016 11:14 AM, Richard Guy Briggs wrote:
>>> The tty field was missing from AUDIT_LOGIN events.
>>>
>>> Refactor code to create a new function audit_get_tty(), using it to
>>> replace the call in audit_log_task_info() and to add it to
>>> audit_log_set_loginuid().  Lock and bump the kref to protect it, adding
>>> audit_put_tty() alias to decrement it.
>>>
>>> Signed-off-by: Richard Guy Briggs 
>>> ---
>>>
>>> V4: Add missing prototype for audit_put_tty() when audit syscall is not
>>> enabled (MIPS).
>>>
>>> V3: Introduce audit_put_tty() alias to decrement kref.
>>>
>>> V2: Use kref to protect tty signal struct while in use.
>>>
>>> ---
>>>
>>>  include/linux/audit.h |   24 
>>>  kernel/audit.c|   18 +-
>>>  kernel/auditsc.c  |8 ++--
>>>  3 files changed, 35 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>>> index b40ed5d..32cdafb 100644
>>> --- a/include/linux/audit.h
>>> +++ b/include/linux/audit.h
>>> @@ -26,6 +26,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  #define AUDIT_INO_UNSET ((unsigned long)-1)
>>>  #define AUDIT_DEV_UNSET ((dev_t)-1)
>>> @@ -343,6 +344,23 @@ static inline unsigned int audit_get_sessionid(struct 
>>> task_struct *tsk)
>>> return tsk->sessionid;
>>>  }
>>>  
>>> +static inline struct tty_struct *audit_get_tty(struct task_struct *tsk)
>>> +{
>>> +   struct tty_struct *tty = NULL;
>>> +   unsigned long flags;
>>> +
>>> +   spin_lock_irqsave(&tsk->sighand->siglock, flags);
>>> +   if (tsk->signal)
>>> +   tty = tty_kref_get(tsk->signal->tty);
>>> +   spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
>>
>>
>> Not that I'm objecting because I get that you're just refactoring
>> existing code, but I thought I'd point out some stuff.
>>
>> 1. There's no need to check if signal_struct is NULL (ie. tsk->signal)
>>because if it is, this will blow up trying to dereference the
>>sighand_struct (ie tsk->sighand).
> 
> Ok.  This logic goes back 10 years and one month less two days. (45d9bb0e)
> 
>> 2. The existing usage is always tsk==current
> 
> My understanding is that when it is called via:
> 
>   copy_process()
>   audit_free()
>   __audit_free()
>   audit_log_exit()
>   audit_log_task_info()
> 
> then tsk != current.

While it's true that tsk != current here, everything relevant to tty
in task_struct is the same because the nascent task is not even half-done.
So tsk->sighand == current->sighand, tsk->signal == current->signal etc.

If you're uncomfortable with pass-through execution like that, then the
simple solution is:


struct tty_struct *tty = NULL;

/* tsk != current when copy_process() failed */
if (tsk == current)
    tty = get_current_tty();


because tty_kref_put(tty) accepts NULL tty and (obviously) so does
tty_name(tty).


Regards,
Peter Hurley


>  This appears to be the only case which appears to
> force lugging around tsk.  This is noted in that commit referenced
> above.
> 
>> 3. If the idea is to make this invulnerable to tsk being gone, then
>>the usage is unsafe anyway.
>>
>>
>> So ultimately (but not necessarily for this patch) I'd prefer that either
>> a. audit use existing tty api instead of open-coding, or
>> b. add any tty api functions required.
> 
> This latter option did cross my mind...
> 
>> Peter Hurley
>>
>>> +   return tty;
>>> +}
>>> +
>>> +static inline void audit_put_tty(struct tty_struct *tty)
>>> +{
>>> +   tty_kref_put(tty);
>>> +}
>>> +
>>>  extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
>>>  extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t 
>>> gid, umode_t mode);
>>>  extern void __audit_bprm(struct linux_binprm *bprm);
>>> @@ -500,6 +518,12 @@ static inline unsigned int audit_get_sessionid(struct 
>>> task_struct *tsk)
>>>  {
>>> return -1;
>>>  }

Re: [PATCH] tty: provide tty_name() even without CONFIG_TTY

2016-04-27 Thread Peter Hurley
On 04/27/2016 10:24 AM, Arnd Bergmann wrote:
> On Wednesday 27 April 2016 12:20:02 Paul Moore wrote:
>>> diff --git a/include/linux/tty.h b/include/linux/tty.h
>>> index 3b09f235db66..17b247c94440 100644
>>> --- a/include/linux/tty.h
>>> +++ b/include/linux/tty.h
>>> @@ -371,6 +371,7 @@ extern void proc_clear_tty(struct task_struct *p);
>>>  extern struct tty_struct *get_current_tty(void);
>>>  /* tty_io.c */
>>>  extern int __init tty_init(void);
>>> +extern const char *tty_name(const struct tty_struct *tty);
>>>  #else
>>>  static inline void console_init(void)
>>>  { }
>>> @@ -391,6 +392,8 @@ static inline struct tty_struct *get_current_tty(void)
>>>  /* tty_io.c */
>>>  static inline int __init tty_init(void)
>>>  { return 0; }
>>> +static inline const char *tty_name(const struct tty_struct *tty)
>>> +{ return "(none)"; }
>>>  #endif
>>
>> As it currently stands tty_name() returns "NULL tty" when the passed
>> tty_struct is NULL while this patch returns "(none)" in the case of
>> CONFIG_TTY=n; it seems like some consistency might be good, yes?  Or
>> do you think there is value in differentiating between the two cases?
>>
>> From an audit point of view, we would prefer if both were "(none)".
> 
> Right, I noticed that the audit code prints "(none)" here while the
> tty code prints "NULL tty", and that meant I could not make it behave
> the same way as all the existing code. I picked "(none)" because
> in case of CONFIG_TTY being disabled that is more logical: it's
> not a NULL pointer because something went wrong, but instead the
> pointer doesn't matter and we know there is no tty.

Apologies for not having foreseen this in the review.
Arnd's solution looks good to me.



Re: [PATCH V4] audit: add tty field to LOGIN event

2016-04-26 Thread Peter Hurley
On 04/26/2016 03:34 PM, Paul Moore wrote:
> On Fri, Apr 22, 2016 at 1:16 PM, Peter Hurley  
> wrote:
>> On 04/21/2016 11:14 AM, Richard Guy Briggs wrote:
>>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>>> index b40ed5d..32cdafb 100644
>>> --- a/include/linux/audit.h
>>> +++ b/include/linux/audit.h
>>> @@ -26,6 +26,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  #define AUDIT_INO_UNSET ((unsigned long)-1)
>>>  #define AUDIT_DEV_UNSET ((dev_t)-1)
>>> @@ -343,6 +344,23 @@ static inline unsigned int audit_get_sessionid(struct 
>>> task_struct *tsk)
>>>   return tsk->sessionid;
>>>  }
>>>
>>> +static inline struct tty_struct *audit_get_tty(struct task_struct *tsk)
>>> +{
>>> + struct tty_struct *tty = NULL;
>>> + unsigned long flags;
>>> +
>>> + spin_lock_irqsave(&tsk->sighand->siglock, flags);
>>> + if (tsk->signal)
>>> + tty = tty_kref_get(tsk->signal->tty);
>>> + spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
> 
> I just merged Richard's patch, if nothing else it is better than it
> was.  However, I would like to talk about improving things, see below.
> 
>> Not that I'm objecting because I get that you're just refactoring
>> existing code, but I thought I'd point out some stuff.
>>
>> 1. There's no need to check if signal_struct is NULL (ie. tsk->signal)
>>because if it is, this will blow up trying to dereference the
>>sighand_struct (ie tsk->sighand).
>>
>> 2. The existing usage is always tsk==current
> 
> Yep, there is only one caller I found that even works on task_structs
> other than current (see audit_log_exit() via audit_free()), although
> even then when it ends up calling into audit_log_task_info() tsk
> should always be current.
> 
> I've got a patch compiling now to get rid of passing around current as
> a a task_struct argument, assuming nothing blows up in testing I'll
> post/merge it.
> 
>> 3. If the idea is to make this invulnerable to tsk being gone, then
>>the usage is unsafe anyway.
> 
> I don't think that is our concern here.
> 
>> So ultimately (but not necessarily for this patch) I'd prefer that either
>> a. audit use existing tty api instead of open-coding, or
>> b. add any tty api functions required.
> 
> I'm open to suggestions, care to elaborate on either option?

So b) is only necessary if the answer to 3) was yes or if tsk != current.
Otherwise, the new audit_get_tty() is equivalent to get_current_tty()
which is the exported tty core interface for the identical operation.

I was only suggesting b) if get_current_tty() wasn't going to be
sufficient.

> Feel free to elaborate by patch too ;)

I can do that.

Regards,
Peter Hurley


Re: [PATCH 1/1] workqueue: fix ghost PENDING flag while doing MQ IO

2016-04-26 Thread Peter Hurley
On 04/26/2016 04:15 AM, Roman Pen wrote:
> The bug in a workqueue leads to a stalled IO request in MQ ctx->rq_list
> with the following backtrace:
> 
> [  601.347452] INFO: task kworker/u129:5:1636 blocked for more than 120 
> seconds.
> [  601.347574]   Tainted: G   O4.4.5-1-storage+ #6
> [  601.347651] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [  601.348142] kworker/u129:5  D 880803077988 0  1636  2 
> 0x
> [  601.348519] Workqueue: ibnbd_server_fileio_wq 
> ibnbd_dev_file_submit_io_worker [ibnbd_server]
> [  601.348999]  880803077988 88080466b900 8808033f9c80 
> 880803078000
> [  601.349662]  880807c95000 7fff 815b0920 
> 880803077ad0
> [  601.350333]  8808030779a0 815b01d5  
> 880803077a38
> [  601.350965] Call Trace:
> [  601.351203]  [] ? bit_wait+0x60/0x60
> [  601.351444]  [] schedule+0x35/0x80
> [  601.351709]  [] schedule_timeout+0x192/0x230
> [  601.351958]  [] ? blk_flush_plug_list+0xc7/0x220
> [  601.352208]  [] ? ktime_get+0x37/0xa0
> [  601.352446]  [] ? bit_wait+0x60/0x60
> [  601.352688]  [] io_schedule_timeout+0xa4/0x110
> [  601.352951]  [] ? _raw_spin_unlock_irqrestore+0xe/0x10
> [  601.353196]  [] bit_wait_io+0x1b/0x70
> [  601.353440]  [] __wait_on_bit+0x5d/0x90
> [  601.353689]  [] wait_on_page_bit+0xc0/0xd0
> [  601.353958]  [] ? autoremove_wake_function+0x40/0x40
> [  601.354200]  [] __filemap_fdatawait_range+0xe4/0x140
> [  601.354441]  [] filemap_fdatawait_range+0x14/0x30
> [  601.354688]  [] filemap_write_and_wait_range+0x3f/0x70
> [  601.354932]  [] blkdev_fsync+0x1b/0x50
> [  601.355193]  [] vfs_fsync_range+0x49/0xa0
> [  601.355432]  [] blkdev_write_iter+0xca/0x100
> [  601.355679]  [] __vfs_write+0xaa/0xe0
> [  601.355925]  [] vfs_write+0xa9/0x1a0
> [  601.356164]  [] kernel_write+0x38/0x50
> 
> The underlying device is a null_blk, with default parameters:
> 
>   queue_mode= MQ
>   submit_queues = 1
> 
> Verification that nullb0 has something inflight:
> 
> root@pserver8:~# cat /sys/block/nullb0/inflight
>01
> root@pserver8:~# find /sys/block/nullb0/mq/0/cpu* -name rq_list -print -exec 
> cat {} \;
> ...
> /sys/block/nullb0/mq/0/cpu2/rq_list
> CTX pending:
> 8838038e2400
> ...
> 
> During debug it became clear that stalled request is always inserted in
> the rq_list from the following path:
> 
>save_stack_trace_tsk + 34
>blk_mq_insert_requests + 231
>blk_mq_flush_plug_list + 281
>blk_flush_plug_list + 199
>wait_on_page_bit + 192
>__filemap_fdatawait_range + 228
>filemap_fdatawait_range + 20
>filemap_write_and_wait_range + 63
>blkdev_fsync + 27
>vfs_fsync_range + 73
>blkdev_write_iter + 202
>__vfs_write + 170
>vfs_write + 169
>kernel_write + 56
> 
> So blk_flush_plug_list() was called with from_schedule == true.
> 
> If from_schedule is true, that means that finally blk_mq_insert_requests()
> offloads execution of __blk_mq_run_hw_queue() and uses kblockd workqueue,
> i.e. it calls kblockd_schedule_delayed_work_on().
> 
> That means, that we race with another CPU, which is about to execute
> __blk_mq_run_hw_queue() work.
> 
> Further debugging shows the following traces from different CPUs:
> 
>   CPU#0  CPU#1
>   -- ---
>   reqeust A inserted
>   STORE hctx->ctx_map[0] bit marked
>   kblockd_schedule...() returns 1
>   
>  request B inserted
>  STORE hctx->ctx_map[1] bit marked
>  kblockd_schedule...() returns 0
>   *** WORK PENDING bit is cleared ***
>   flush_busy_ctxs() is executed, but
>   bit 1, set by CPU#1, is not observed
> 
> As a result request B pended forever.
> 
> This behaviour can be explained by speculative LOAD of hctx->ctx_map on
> CPU#0, which is reordered with clear of PENDING bit and executed _before_
> actual STORE of bit 1 on CPU#1.
> 
> The proper fix is an explicit full barrier , which guarantees
> that clear of PENDING bit is to be executed before all possible
> speculative LOADS or STORES inside actual work function.
> 
> Signed-off-by: Roman Pen 
> Cc: Gioh Kim 
> Cc: Michael Wang 
> Cc: Tejun Heo 
> Cc: Jens Axboe 
> Cc: linux-bl...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org
> ---
>  kernel/workqueue.c | 29 +
>  1 file changed, 29 insertions(+)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 450c21f..0ec0594 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -649,6 +649,35 @@ static void set_work_pool_and_clear_pending(struct 
> work_struct *work,
>*/
>   smp_wmb();
>   set_work_data(work, (unsigned long)pool_id << WORK_OFFQ_POOL_SHIFT, 0);
> + /*
> +  * The following mb guarantees t

Re: [PATCH 1/1] [RFC] workqueue: fix ghost PENDING flag while doing MQ IO

2016-04-26 Thread Peter Hurley
On 04/26/2016 10:45 AM, Tejun Heo wrote:
> As long as what's happening is clearly documented, I think either is
> fine.  I'm gonna go with Roman's mb patch for -stable fix but think
> it'd be nice to have a separate patch to consolidate the paths which
> clear PENDING and make them use xchg.  If you can spin up a patch for
> that, I'd be happy to apply it to wq/for-3.7.

Ok.



Re: [PATCH 1/1] [RFC] workqueue: fix ghost PENDING flag while doing MQ IO

2016-04-26 Thread Peter Hurley
Hi Tejun,

On 04/26/2016 08:15 AM, Tejun Heo wrote:
> Hello, Peter.
> 
> On Mon, Apr 25, 2016 at 06:22:01PM -0700, Peter Hurley wrote:
>> This is the same bug I wrote about 2 yrs ago (but with the wrong fix).
>>
>> http://lkml.iu.edu/hypermail/linux/kernel/1402.2/04697.html
>>
>> Unfortunately I didn't have a reproducer at all :/
> 
> Ah, bummer.
> 
>> The atomic_long_xchg() patch has several benefits over the naked barrier:
>>
>> 1. set_work_pool_and_clear_pending() has the same requirements as
>>clear_work_data(); note that both require write barrier before and
>>full barrier after.
> 
> clear_work_data() is only used by __cancel_work_timer() and there's no
> following execution or anything where rescheduling memory loads can
> cause any issue.

I meant this block:

clear_work_data(work);

/*
 * Paired with prepare_to_wait() above so that either
 * waitqueue_active() is visible here or !work_is_canceling() is
 * visible there.
 */
smp_mb();


>> 2. xchg() et al imply full barrier before and full barrier after.
>>
>> 3. The naked barriers could be removed, while improving efficiency.
>>On x86, mov + mfence => xchg
> 
> It's unlikely to make any measureable difference.  Is xchg() actually
> cheaper than store + rmb?

store + mfence (full barrier), yes. Roughly 2x faster.

https://lkml.org/lkml/2015/11/2/607


>> 4. Maybe fixes other hidden bugs.
>>For example, I'm wondering if reordering with set_work_pwq/list_add_tail
>>would be a problem; ie., what if work is visible on the worklist _before_
>>data is initialized by set_work_pwq()?
> 
> Worklist is always accessed under the pool lock.

I see a lot of list_empty() checks while not holding pool lock, but I get
what you mean: that work visible on the worklist isn't actually inspected
without holding the pool lock.


> The barrier comes
> into play only because we're using bare PENDING bit for
> synchronization.

I see that it's only transitions of PENDING that need full barriers but it's
not clear from the semantics the conditions under which pending is already
set. If performance isn't the concern there, maybe those uses (where PENDING
is expected to already be set) should check and warn?


> I'm not necessarily against making all clearings of
> PENDING to be followed by a rmb or use xhcg.  Reasons 2-4 are pretty
> weak tho.

I agree 2 and 3 are not the best reasons.
Actually, it looks that I'm in the minority anyway, and that style-wise,
naked barrier is preferred.

Regards,
Peter Hurley


Re: [PATCH 1/1] [RFC] workqueue: fix ghost PENDING flag while doing MQ IO

2016-04-25 Thread Peter Hurley
On 04/25/2016 08:48 AM, Tejun Heo wrote:
> Hello, Roman.
> 
> On Mon, Apr 25, 2016 at 05:22:51PM +0200, Roman Pen wrote:
> ...
>>   CPU#6CPU#2
>>   reqeust 884000343600 inserted
>>   hctx marked as pended
>>   kblockd_schedule...() returns 1
>>   
>>   *** WORK_STRUCT_PENDING_BIT is cleared ***
>>   flush_busy_ctxs() is executed
>>reqeust 884000343cc0 inserted
>>hctx marked as pended
>>kblockd_schedule...() returns 0
>>
>>WTF?
>>
>> As a result 884000343cc0 request pended forever.
>>
>> According to the trace output I see that another CPU _always_ observes
>> WORK_STRUCT_PENDING_BIT as set for that hctx->run_work, even it was
>> cleared on another CPU.
>>
>> Checking the workqueue.c code I see that clearing the bit is nothing
>> more, but atomic_long_set(), which is  instruction. This
>> function:
>>
>>   static inline void set_work_data()
>>
>> In attempt to "fix" the mystery I replaced atomic_long_set() call with
>> atomic_long_xchg(), which is  instruction.
>>
>> The problem has gone.
>>
>> For me it looks that test_and_set_bit() ( instruction) does
>> not require flush of all CPU caches, which can be dirty after executing
>> of  on another CPU.  But  really updates the memory and the
>> following execution of  observes that bit was cleared.
>>
>> As a conculusion I can say, that I am lucky enough and can reproduce
>> this bug in several minutes on a specific load (I tried many other
>> simple loads using fio, even using btrecord/btreplay, no success).
>> And that easy reproduction on a specific load gives me some freedom
>> to test and then to be sure, that problem has gone.
> 
> Heh, excellent debugging.  I wonder how old this bug is.

This is the same bug I wrote about 2 yrs ago (but with the wrong fix).

http://lkml.iu.edu/hypermail/linux/kernel/1402.2/04697.html

Unfortunately I didn't have a reproducer at all :/


> cc'ing David
> Howells who ISTR to have reported a similar issue.  The root problem
> here, I think, is that PENDING is used to synchronize between
> different queueing instances but we don't have proper memory barrier
> after it.
> 
>   A   B
> 
>   queue (test_and_sets PENDING)
>   dispatch (clears PENDING)
>   execute queue (test_and_sets PENDING)
> 
> So, for B, the guarantee must be that either A starts executing after
> B's test_and_set or B's test_and_set succeeds; however, as we don't
> have any memory barrier between dispatch and execute, there's nothing
> preventing the processor from scheduling some memory fetch operations
> from the execute stage before the clearing of PENDING - ie. A might
> not see what B has done prior to queue even if B's test_and_set fails
> indicating that A should.  Can you please test whether the following
> patch fixes the issue?
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2232ae3..8ec2b5e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -666,6 +666,7 @@ static void set_work_pool_and_clear_pending(struct 
> work_struct *work,
>*/
>   smp_wmb();
>   set_work_data(work, (unsigned long)pool_id << WORK_OFFQ_POOL_SHIFT, 0);
> + smp_mb();
>  }

The atomic_long_xchg() patch has several benefits over the naked barrier:

1. set_work_pool_and_clear_pending() has the same requirements as
   clear_work_data(); note that both require write barrier before and
   full barrier after.

2. xchg() et al imply full barrier before and full barrier after.

3. The naked barriers could be removed, while improving efficiency.
   On x86, mov + mfence => xchg

4. Maybe fixes other hidden bugs.
   For example, I'm wondering if reordering with set_work_pwq/list_add_tail
   would be a problem; ie., what if work is visible on the worklist _before_
   data is initialized by set_work_pwq()?

Regards,
Peter Hurley





Re: [PATCH] serial: mctrl_gpio: Drop support for out1-gpios and out2-gpios

2016-04-22 Thread Peter Hurley
On 04/22/2016 08:10 AM, Geert Uytterhoeven wrote:
> The OUT1 and OUT2 pins present on some legacy UARTs are basically GPIOs.
> It doesn't make much sense to emulate GPIOs using other GPIOs, hence
> drop support for that.

Thanks Geert.

Reviewed-by: Peter Hurley 




Re: [PATCH V4] audit: add tty field to LOGIN event

2016-04-22 Thread Peter Hurley
On 04/21/2016 11:14 AM, Richard Guy Briggs wrote:
> The tty field was missing from AUDIT_LOGIN events.
> 
> Refactor code to create a new function audit_get_tty(), using it to
> replace the call in audit_log_task_info() and to add it to
> audit_log_set_loginuid().  Lock and bump the kref to protect it, adding
> audit_put_tty() alias to decrement it.
> 
> Signed-off-by: Richard Guy Briggs 
> ---
> 
> V4: Add missing prototype for audit_put_tty() when audit syscall is not
> enabled (MIPS).
> 
> V3: Introduce audit_put_tty() alias to decrement kref.
> 
> V2: Use kref to protect tty signal struct while in use.
> 
> ---
> 
>  include/linux/audit.h |   24 
>  kernel/audit.c|   18 +-
>  kernel/auditsc.c  |8 ++--
>  3 files changed, 35 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index b40ed5d..32cdafb 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define AUDIT_INO_UNSET ((unsigned long)-1)
>  #define AUDIT_DEV_UNSET ((dev_t)-1)
> @@ -343,6 +344,23 @@ static inline unsigned int audit_get_sessionid(struct 
> task_struct *tsk)
>   return tsk->sessionid;
>  }
>  
> +static inline struct tty_struct *audit_get_tty(struct task_struct *tsk)
> +{
> + struct tty_struct *tty = NULL;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&tsk->sighand->siglock, flags);
> + if (tsk->signal)
> + tty = tty_kref_get(tsk->signal->tty);
> + spin_unlock_irqrestore(&tsk->sighand->siglock, flags);


Not that I'm objecting because I get that you're just refactoring
existing code, but I thought I'd point out some stuff.

1. There's no need to check if signal_struct is NULL (ie. tsk->signal)
   because if it is, this will blow up trying to dereference the
   sighand_struct (ie tsk->sighand).

2. The existing usage is always tsk==current

3. If the idea is to make this invulnerable to tsk being gone, then
   the usage is unsafe anyway.


So ultimately (but not necessarily for this patch) I'd prefer that either
a. audit use existing tty api instead of open-coding, or
b. add any tty api functions required.


Regards,
Peter Hurley


> + return tty;
> +}
> +
> +static inline void audit_put_tty(struct tty_struct *tty)
> +{
> + tty_kref_put(tty);
> +}
> +
>  extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
>  extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, 
> umode_t mode);
>  extern void __audit_bprm(struct linux_binprm *bprm);
> @@ -500,6 +518,12 @@ static inline unsigned int audit_get_sessionid(struct 
> task_struct *tsk)
>  {
>   return -1;
>  }
> +static inline struct tty_struct *audit_get_tty(struct task_struct *tsk)
> +{
> + return NULL;
> +}
> +static inline void audit_put_tty(struct tty_struct *tty)
> +{ }
>  static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
>  { }
>  static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 3a3e5de..7edd776 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -64,7 +64,6 @@
>  #include 
>  #endif
>  #include 
> -#include 
>  #include 
>  #include 
>  
> @@ -1873,21 +1872,14 @@ void audit_log_task_info(struct audit_buffer *ab, 
> struct task_struct *tsk)
>  {
>   const struct cred *cred;
>   char comm[sizeof(tsk->comm)];
> - char *tty;
> + struct tty_struct *tty;
>  
>   if (!ab)
>   return;
>  
>   /* tsk == current */
>   cred = current_cred();
> -
> - spin_lock_irq(&tsk->sighand->siglock);
> - if (tsk->signal && tsk->signal->tty && tsk->signal->tty->name)
> - tty = tsk->signal->tty->name;
> - else
> - tty = "(none)";
> - spin_unlock_irq(&tsk->sighand->siglock);
> -
> + tty = audit_get_tty(tsk);
>   audit_log_format(ab,
>" ppid=%d pid=%d auid=%u uid=%u gid=%u"
>" euid=%u suid=%u fsuid=%u"
> @@ -1903,11 +1895,11 @@ void audit_log_task_info(struct audit_buffer *ab, 
> struct task_struct *tsk)
>from_kgid(&init_user_ns, cred->egid),
>from_kgid(&init_user_ns, cred->sgid),
>from_kgid(&init_user_ns, cred->fsgid),
> -  tty, audit_get_sessionid(tsk));
> -
> +  

Re: [PATCH 01/16] devpts: Attempting to get it right

2016-04-19 Thread Peter Hurley
On 04/19/2016 05:24 PM, Peter Hurley wrote:
> On 04/19/2016 04:35 PM, Linus Torvalds wrote:
>> On Tue, Apr 19, 2016 at 3:06 PM, Eric W. Biederman
>>  wrote:
>>>
>>> I have work inspired by this rolled into my code.  I will post shortly
>>> after a little more testing.
>>
>> Actually, I have a slightly fixed version in my tree. I've been
>> running this on my own machines for a few days, just to verify, along
>> with some testing.
>>
>> The fixes are some cleanups of the header file (the !UNIX98 section
>> that nobody uses was bogus), and fixing "devpts_get_ref()" to get the
>> "struct file *" argument too. The current code doesn't need it, but
>> the code to actually look up the right pts/ directory from the ptmx
>> file open needs it because that's where the path is - passing in just
>> the inode isn't sufficient).
>>
>> Anyway, I think I'll just merge my branch instead of sending out
>> another emailed patch, because I don't think that patch is
>> controversial or unsafe. It doesn't actually change any semantics, and
>> only does cleanups. If it breaks something due to the rules about
>> private_data being different for slave and master side pty's, then the
>> old code was broken too - it used to always put a inode pointer in
>> there, but they were very different inode pointers.
> 
> Yes, the old code was broken. The new code always uses the ptmx inode.

In re-reading, I was not as clear as I should have been here.
Yes, the different inodes are stored in their corresponding driver_data
but the ptmx inode is always used for final teardown.


> Which is a stopgap measure to prevent the devpts instance from
> teardown when the only open filp is /dev/tty.
> 
> What I want to do instead (but not for -stable) is bump the inode
> reference at controlling tty association instead (either first
> qualifying open or ioctl(TIOCSCTTY)). That way it would always be
> the slave inode. This method is non-trivial though because it makes
> disassociation more complicated.
> 
> Regards,
> Peter Hurley
> 



Re: [PATCH 01/16] devpts: Attempting to get it right

2016-04-19 Thread Peter Hurley
On 04/19/2016 04:35 PM, Linus Torvalds wrote:
> On Tue, Apr 19, 2016 at 3:06 PM, Eric W. Biederman
>  wrote:
>>
>> I have work inspired by this rolled into my code.  I will post shortly
>> after a little more testing.
> 
> Actually, I have a slightly fixed version in my tree. I've been
> running this on my own machines for a few days, just to verify, along
> with some testing.
> 
> The fixes are some cleanups of the header file (the !UNIX98 section
> that nobody uses was bogus), and fixing "devpts_get_ref()" to get the
> "struct file *" argument too. The current code doesn't need it, but
> the code to actually look up the right pts/ directory from the ptmx
> file open needs it because that's where the path is - passing in just
> the inode isn't sufficient).
> 
> Anyway, I think I'll just merge my branch instead of sending out
> another emailed patch, because I don't think that patch is
> controversial or unsafe. It doesn't actually change any semantics, and
> only does cleanups. If it breaks something due to the rules about
> private_data being different for slave and master side pty's, then the
> old code was broken too - it used to always put a inode pointer in
> there, but they were very different inode pointers.

Yes, the old code was broken. The new code always uses the ptmx inode.
Which is a stopgap measure to prevent the devpts instance from
teardown when the only open filp is /dev/tty.

What I want to do instead (but not for -stable) is bump the inode
reference at controlling tty association instead (either first
qualifying open or ioctl(TIOCSCTTY)). That way it would always be
the slave inode. This method is non-trivial though because it makes
disassociation more complicated.

Regards,
Peter Hurley



Re: Q: tty_open() && hangup

2016-04-15 Thread Peter Hurley
Hi Oleg,

On 04/15/2016 10:19 AM, Oleg Nesterov wrote:
> Hi,
> 
> I am fingting with obscure pty bug in rhel6 which _might be_ explained
> by some hangup/reopen races, and this motivated me to look at upstream
> code which I can't understand too ;)
> 
> Lets look at tty_open(),
> 
>   2112  tty = tty_open_current_tty(device, filp);
>   2113  if (!tty)
>   2114  tty = tty_open_by_driver(device, inode, filp);
>   2115
>   2116  if (IS_ERR(tty)) {
>   2117  tty_free_file(filp);
>   2118  retval = PTR_ERR(tty);
>   2119  if (retval != -EAGAIN || signal_pending(current))
>   2120  return retval;
>   2121  schedule();
>   2122  goto retry_open;
> 
> And why do we need this schedule() above? It is nop in TASK_RUNNING.
> 
> If we need it for non-preempt kernels to avoid some sort of livelock
> this can't work if rt_task(current) == T.
> 
> If we just want to add a preemption point then perhaps we should use
> cond_resched/might_resched ?

Yeah, it's just a preemption point that pre-dates cond_resched.


> I am not saying this is wrong, but looks confusing.
> 
>   2130  if (tty->ops->open)
>   2131  retval = tty->ops->open(tty, filp);
>   2132  else
>   2133  retval = -ENODEV;
>   2134  filp->f_flags = saved_flags;
>   2135
>   2136  if (retval) {
>   2137  tty_debug_hangup(tty, "open error %d, releasing\n", 
> retval);
>   2138
>   2139  tty_unlock(tty); /* need to call tty_release without 
> BTM */
>   2140  tty_release(inode, filp);
>   2141  if (retval != -ERESTARTSYS)
>   2142  return retval;
>   2143
>   2144  if (signal_pending(current))
>   2145  return retval;
> 
> So we can only get here if tty->ops->open returns -ERESTARTSYS without
> signal_pending(). This doesn't look good. But OK, lets forget this.

tty_port_block_til_ready() returns -ERESTARTSYS if !ASYNC_HUP_NOTIFY and
tty was hungup or the h/w was reset (eg. while waiting for tty lock).
IOW, parallel hangup() or release() happened at the point where the
tty lock was not held.


>   2147  schedule();
> 
> Again, this looks confusing.
> 
>   2148  /*
>   2149   * Need to reset f_op in case a hangup happened.
>   2150   */
>   2151  if (tty_hung_up_p(filp))
>   2152  filp->f_op = &tty_fops;
> 
> And this is called after tty_add_file() which makes this filp visible to
> __tty_hangup(), and without tty_lock().

tty_release() has removed the filp from the files list already (with the
tty lock held), so if the file operations were reset it's because a
hangup happened when the tty lock was not held (eg. waiting for reopen
or after dropping the lock and reacquiring it in tty_release())


> How can we trust this check? __tty_hangup() can set hung_up_tty_fops()
> right after tty_hung_up_p() returns F.

__tty_hangup() will not have had visibility to this filp after
tty_release().


> Don't we need something like below? Otherwise afaics tty_open() can
> actually succeed (and clear TTY_HUPPED) after restart, but return with
> tty_hung_up_p(filp) == T.

No.

Regards,
Peter Hurley



> --- x/drivers/tty/tty_io.c
> +++ x/drivers/tty/tty_io.c
> @@ -2122,6 +2122,13 @@ retry_open:
>   goto retry_open;
>   }
>  
> + /*
> +  * This is only possible after retry_open, we drop tty_lock()
> +  * without tty_del_file().
> +  */
> + if (tty_hung_up_p(filp))
> + filp->f_op = &tty_fops;
> +
>   tty_add_file(tty, filp);
>  
>   check_tty_count(tty, __func__);
> @@ -2145,11 +2152,6 @@ retry_open:
>   return retval;
>  
>   schedule();
> - /*
> -  * Need to reset f_op in case a hangup happened.
> -  */
> - if (tty_hung_up_p(filp))
> - filp->f_op = &tty_fops;
>   goto retry_open;
>   }
>   clear_bit(TTY_HUPPED, &tty->flags);
> 



Re: 8250 dma issues ( was Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes)

2016-04-14 Thread Peter Hurley
On 04/14/2016 08:07 AM, One Thousand Gnomes wrote:
>>>> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
>>>>without pause/resume.  
>>>
>>> Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
>>> control is one workaround but the user has no chance of knowing that
>>> XON/XOFF has been silently disabled.
> 
> You can clear the bits in the termios when the termios is set and the
> application *should* interpret that as not supported. I doubt many
> applications do for the XON/XOFF case. Equally you can just say that soft
> flow control turns off DMA or reduces buffering depending upon the data
> rate. We have plenty of hardware in the kernel that is more optimal in
> some configurations than others.
> 
> This also shouldn't be about whether 4K is a lot - it's about time to
> respond. Thus the _time_ latency of getting the ^S/^Q out is what matters
> at higher rates. At low speed (1200-9600 etc) you want to be able to
> respond within a few characters because the chances are the device the
> other end is not very bright.

Currently, tcflow(TCIOFF/TCION) ignores the flow control modes and always
sends XON/XOFF, so disabling DMA based on flow control mode won't work.


>>> the transfer right away. Oh now I see the same thing in
>>> edma_completion_handler(). Okay but this affects now everyone that
>>> relies on low latency?  
>>
>> Well, the real problem is that only one rx buffer is being used serially,
>> first filled by the dma h/w, then emptied by the driver, then resubmitted.
>> This creates a gap of time between the dma h/w completion interrupt and
>> the resubmission where data loss is possible (and happens).
> 
> Most low latency users are concerned about the latency between transmit
> and receive.

Sebastian really confused this problem by relating it to low latency.
That's not the problem.

The problem is that softirqs and tasklets are not reliable bottom halves
anymore, because since 3.8 they don't always run when interrupts are
re-enabled. They can now be deferred to the lowest priority process,
ksoftirqd.

For a uart trying not to overrun a 64-byte fifo at 3Mbaud creates a hard
213us deadline for the uart driver to give the dmaengine driver the
_next_ RX buffer to fill. Right now, the 8250 driver is only using 1
RX buffer, so until the DMA completion handler runs (in tasklet), the
8250 driver doesn't supply the dmaengine driver a new one.



> The usual case is windowless protocols like firmware
> downloaders. For higher speed that tends to be driven by the DMA
> timeouts, for lower baud rates you can perhaps mitigate this by using
> chains of very small buffers or just turning off DMA just as we turn off
> some of the FIFOs at very low speed ?
> 
>> But that's why I'd like to bring the two implementations closer, so that
>> maybe both can be replaced with a single rx dma transaction flow.
>> [ And perhaps extending tty buffer to perform direct fill, skipping the
>>   buffer copy ]
> 
> For the general case what IMHO is needed is probably not a direct fill of
> the tty buffer (which is surprisingly locking hard - we used to have one
> but it was broken) but rather a fastpath around it. With the specific
> exception of N_TTY I think every single other line discipline we have is
> capable of accepting a pointer and length to a block of data that ceases
> to be valid the moment the function returns. All the networking ones
> certainly are and it would speed up the usual culprits (3G modems over
> USB, bluetooth over onboard 3.3v uart etc). 
> 
> So a way to call
> 
>   port->fast_rx(data, flags, len);
> 
> with a rule that you never mix fast and tty buffers, and with an atomic
> swap of port->fast_rx between tty_buffer queueing logic, discard and
> ldisc->fast_rx pointers done when the ldisc is set or changes.

Yes, that's one possible alternative. The drawback is that makes
testing more difficult, but maybe the tradeoff would be worth it.

Regards,
Peter Hurley


> There are very few cases where n_tty is the one that needs the optimized
> path: uucp died a long time ago 8)
> 
> Alan
> 



Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes

2016-04-14 Thread Peter Hurley
On 04/12/2016 11:42 AM, Peter Hurley wrote:
> On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
>> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>>> On 2016-04-05, Peter Hurley  wrote:
>>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>>> It has been observed that the TX-DMA can stall
>>>>>>>
>>>>>>> Does this happen on any other OMAP part besides am335x?
>>>>>>> I looked back over the LKML history of this and didn't see
>>>>>>> any other design implicated in this problem.
>>>>>>
>>>>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>>>>> dma-tx stall with am335x/edma and dra7/sdma.

What "test" is reproducing this problem?


>>>>> I thought we already established sdma was not to be used since
>>>>> the hardware does not actually support pausing without data loss.
>>>>
>>>> This workaround was not invented for sdma but for edma (with am335x).
>>>
>>> According to John above, dra7/sdma requires this workaround.
>>
>> It was reported by Frans Klaver against am335x
>>  http://lkml.kernel.org/r/20140908183353.GB4686@ci00147.xsens-tech.local
>>
>> and I managed to reproduce this with his yocto image on dra7 and am335x:
>>  http://lkml.kernel.org/r/20140921204100.ga10...@linutronix.de

I overlooked Sebastian's statement from that message:

This includes
changing the baudrate (not by yocto but the driver sets it to 0 and then
to the requested one) and this seems to be responsible for the "bad
bytes".

What is changing the baud rate to 0? How about get a stack trace for that?



> [...]
> 
>>> - hangs changing some unknown register if tx dma in progress
>>>   (ie., this termios change workaround)
>>
>> I think some registers are the baud-rate registers which pause engine.
> 
> Let's back up here and focus on just this problem for now.
> 
> Since this is observable on dra7/sdma, then it's not related to
> the OMAP_DMA_TX_KICK problem. IOW, dropping tx dma support for am335x
> does not make this go away, which is what I was asking.
> 
> Now, if the DLL/DLH/MDR1 register writes are causing dma to stall,
> then skipping those if they're unchanged should fix this, and then
> pause/terminating in-progress DMA if any of these registers are being
> written would be acceptable since some data loss is to be expected
> when changing the baud rate without waiting.
> 



Re: omap uart + dma issues (Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes)

2016-04-13 Thread Peter Hurley
On 04/13/2016 04:11 AM, Sekhar Nori wrote:
> On Wednesday 13 April 2016 05:30 AM, Peter Hurley wrote:
> 
>>>> - generates spurious uart interrupt for every rx dma transaction
>>>>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>>>>   _Even with this workaround_, it still generates spurious interrupt 
>>>> warning
>>>>   which shuts off interrupts for several ms while logging the error
>>>>   message to the console, virtually guaranteeing lost data.
>>>
>>> as I wrote in my other email I think RDI should be disabled with DMA
>>
>>
>> I'll test to see if disabling RDI eliminates the UART_IIR_NO_INT spurious
>> interrupts.

Ok; disabling UART_IER_RDI eliminates the UART_IIR_NO_INT spurious
interrupts.

However, disabling RDI disables RX timeout as well, so data just sits in
the RX fifo with no way to get it out. AFAICT that's a showstopper.


>>> according the Intel manual and I *think* someone here reported that
>>> they see the same problem.
>>
>> Let's confirm with the Intel folks that this is true, which would argue
>> for using the omap-style rx dma flow.
> 
> Andy Shevchenko pointed this out here: https://lkml.org/lkml/2016/2/23/588

which Andy noted as well:

On 02/23/2016 08:56 AM, Andy Shevchenko wrote:
> The problem is that we have no separate bit to control timeout
> interrupts from UART.



Re: [PATCH] audit: add tty field to LOGIN event

2016-04-13 Thread Peter Hurley
Hi Richard,

On 04/13/2016 04:25 PM, Richard Guy Briggs wrote:
> The tty field was missing from AUDIT_LOGIN events.
> 
> Refactor code to create a new function audit_get_tty(), using it to
> replace the call in audit_log_task_info() and to add it to
> audit_log_set_loginuid().
> 
> Signed-off-by: Richard Guy Briggs 
> ---
>  include/linux/audit.h |   18 ++
>  kernel/audit.c|   11 +--
>  kernel/auditsc.c  |5 +++--
>  3 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index b40ed5d..20c6649 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define AUDIT_INO_UNSET ((unsigned long)-1)
>  #define AUDIT_DEV_UNSET ((dev_t)-1)
> @@ -343,6 +344,19 @@ static inline unsigned int audit_get_sessionid(struct 
> task_struct *tsk)
>   return tsk->sessionid;
>  }
>  
> +static inline char *audit_get_tty(struct task_struct *tsk)
> +{
> + char *tty;
> +
> + spin_lock_irq(&tsk->sighand->siglock);
> + if (tsk->signal && tsk->signal->tty && tsk->signal->tty->name)
> + tty = tsk->signal->tty->name;
> + else
> + tty = "(none)";
> + spin_unlock_irq(&tsk->sighand->siglock);

This is unsafe because the tty could be immediately torn down after the
siglock is dropped, and return a dangling ptr.


> + return tty;
> +}
> +
>  extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
>  extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, 
> umode_t mode);
>  extern void __audit_bprm(struct linux_binprm *bprm);
> @@ -500,6 +514,10 @@ static inline unsigned int audit_get_sessionid(struct 
> task_struct *tsk)
>  {
>   return -1;
>  }
> +static inline char *audit_get_tty(struct task_struct *tsk)
> +{
> + return "(invalid)";
> +}
>  static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
>  { }
>  static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 3a3e5de..fae11df 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -64,7 +64,6 @@
>  #include 
>  #endif
>  #include 
> -#include 
>  #include 
>  #include 
>  
> @@ -1873,7 +1872,6 @@ void audit_log_task_info(struct audit_buffer *ab, 
> struct task_struct *tsk)
>  {
>   const struct cred *cred;
>   char comm[sizeof(tsk->comm)];
> - char *tty;

struct tty_struct *tty;

>  
>   if (!ab)
>   return;
> @@ -1881,13 +1879,6 @@ void audit_log_task_info(struct audit_buffer *ab, 
> struct task_struct *tsk)
>   /* tsk == current */
>   cred = current_cred();
>  
> - spin_lock_irq(&tsk->sighand->siglock);
> - if (tsk->signal && tsk->signal->tty && tsk->signal->tty->name)
> - tty = tsk->signal->tty->name;
> - else
> - tty = "(none)";
> - spin_unlock_irq(&tsk->sighand->siglock);

tty = get_current_tty();

> -
>   audit_log_format(ab,
>" ppid=%d pid=%d auid=%u uid=%u gid=%u"
>" euid=%u suid=%u fsuid=%u"
> @@ -1903,7 +1894,7 @@ void audit_log_task_info(struct audit_buffer *ab, 
> struct task_struct *tsk)
>from_kgid(&init_user_ns, cred->egid),
>from_kgid(&init_user_ns, cred->sgid),
>from_kgid(&init_user_ns, cred->fsgid),
> -  tty, audit_get_sessionid(tsk));
> +  audit_get_tty(tsk), audit_get_sessionid(tsk));

tty_name(tty), );
^^
returns "NULL tty" if tty == NULL

tty_kref_put(tty);


>  
>   audit_log_format(ab, " comm=");
>   audit_log_untrustedstring(ab, get_task_comm(comm, tsk));
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 195ffae..a0467fb 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1993,8 +1993,9 @@ static void audit_log_set_loginuid(kuid_t koldloginuid, 
> kuid_t kloginuid,
>   return;
>   audit_log_format(ab, "pid=%d uid=%u", task_pid_nr(current), uid);
>   audit_log_task_context(ab);
> - audit_log_format(ab, " old-auid=%u auid=%u old-ses=%u ses=%u res=%d",
> -  oldloginuid, loginuid, oldsessionid, sessionid, !rc);

tty = get_current_tty();

> + audit_log_format(ab, " old-auid=%u auid=%u tty=%s old-ses=%u ses=%u 
> res=%d",
> +  oldloginuid, loginuid, audit_get_tty(current),

..., tty_name(tty),

> +  oldsessionid, sessionid, !rc);

tty_kref_put(tty);

Regards,
Peter Hurley


>   audit_log_end(ab);
>  }
>  
> 



Re: Is there any race between flush_to_ldisc() and release_one_tty()?

2016-04-12 Thread Peter Hurley
Hi Prasad,

Thanks for the report.


On 04/12/2016 08:22 AM, Sodagudi Prasad wrote:
> 
> Hi All,
> 
> It looks like there is race between flush_to_ldisc() and release_one_tty().

Not necessarily.

Driver could have destroyed the port prematurely. Or the driver could have
rescheduled the input kworker after it was cancelled in release_tty().

What driver is this?


> Following crash is observed even after including below change.
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/tty/tty_buffer.c?id=7098296a362a96051fa120abf48f0095818b99cd
> https://lkml.org/lkml/2015/9/1/491
> 
> 
> [386532.450351] Unable to handle kernel paging request at virtual address 
> 6b6b6b6b6b6b6b93
> [386532.465217] pgd = ffc05bea5000
> [386532.467677] [6b6b6b6b6b6b6b93] *pgd=, 
> *pud=
> [386532.474715] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [386532.480350] Modules linked in: wlan(O) [last unloaded: wlan]
> [386532.486085] CPU: 5 PID: 31970 Comm: kworker/5:1 Tainted: GW  O   
> 3.18.24-ga9bbc02-00076-g1434803 #1
> [386532.495885] Hardware name: Qualcomm Technologies, Inc. MSM8953 MTP (DT)
> [386532.502581] Workqueue: events flush_to_ldisc
> [386532.506909] task: ffc06162 ti: ffc011efc000 task.ti: 
> ffc011efc000
> [386532.514465] PC is at ldsem_down_read_trylock+0x0/0x48
> [386532.519583] LR is at tty_ldisc_ref+0x24/0x4c
> 
> 
> 
> [386533.028262] Process kworker/5:1 (pid: 31970, stack limit = 
> 0xffc011efc058)
> [386533.035553] Call trace:
> [386533.038080] [] ldsem_down_read_trylock+0x0/0x48
> [386533.044236] [] flush_to_ldisc+0x28/0x124
> [386533.049794] [] process_one_work+0x238/0x3f0
> [386533.055608] [] worker_thread+0x2f8/0x418
> [386533.061163] [] kthread+0xec/0xf8


Why is this trace report elided?

Does this happen on vanilla 3.18.24 kernel? What about a more recent kernel?


> 1) It is not clear how READ_ONCE would fix the race between flush_to_ldisc() 
> and release_one_tty() discussed in https://lkml.org/lkml/2015/9/1/491. could 
> you please provide more information?

It doesn't.

READ_ONCE() only guarantees the compiler won't reload tty from port->itty after
the check for NULL; that in turn guarantees that the kworker hasn't been 
cancelled
yet (since it's now running) and release_tty() can't advance until this kworker
completes.

IOW, the kworker cancel in release_tty() is what guarantees flush_to_ldisc()
is not concurrent with release_one_tty().

But this doesn't factor in at all in the observed crash.
Although the information provided is quite limited(?!), the tty address itself
is bogus; not that this used to point to a valid-but-now-free tty.

What's most likely happened is the tty port has been freed while its
buffer work was running, which is almost certainly a driver bug.


> 2) Is there any chance that, other core could free tty memory between 442 and 
> 445 lines?

Sure, but I don't think that's what's happened here.
As I wrote above, more likely the port has been freed, so the value
retrieved for port->itty was bogus.

FWIW, the ipwireless tty driver will happily free the tty while it's in use.

And the serial core will let you unload the uart driver and free the tty
ports while the tty is open and in use, as well.


> 434 static void flush_to_ldisc(struct work_struct *work)
> 435 {
> ...
> ...
> ...
> 441 tty = READ_ONCE(port->itty);
> 442 if (tty == NULL)
> 443 return;
> 444
> 445 disc = tty_ldisc_ref(tty);
> 446 if (disc == NULL)
> 447 return;




omap uart + dma issues (Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes)

2016-04-12 Thread Peter Hurley
On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>> On 2016-04-05, Peter Hurley  wrote:
>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>> It has been observed that the TX-DMA can stall

[...]

>>>> http://www.spinics.net/lists/linux-serial/msg18503.html
>>>
>>> This could be fixed. See
>>>   http://www.spinics.net/lists/linux-serial/msg18517.html
>>>   http://www.spinics.net/lists/linux-serial/msg18531.html
>>>
>>> rmk was fine with it from what I read. So what is missing is just
>>> refurbish the patch (update the comment according to rmk replay) and
>>> then we could re-enable DMA again.
>>
>> That's hardly all that is required.
> 
> well it would enable pause of RX transfers.

I'm not totally convinced of that.

Does a RX timeout ensure that sdma won't try to complete the transaction
if more rx data arrives between the raising of the rx timeout interrupt
and the pausing of rx transaction?

My guess is not, which will result in lost data in normal operation.


> TX would not work (unless TI's HW people can confirm that it will).
> 
>> 1. edma pause returns an error if the descriptor has already been retired
>>when a pause is attempted. This makes distinguishing between reporting an
>>error for unsupported feature indistinguishable from a transient dma
>>error that can simply be logged.

Also, this error is reported during normal operation, which is bogus.
The h/w descriptor has completed but the completion handler has not yet been
called, so the "error" is expected.

Which, laughably, is the same error code (-EINVAL) omap-dma uses to "creatively"
report it doesn't actually support pause/resume, despite what 
dma_get_slave_caps()
reports.

[ split spurious uart interrupts, XON/XOFF transmit, rx dma flow, and code
  code maintenance to separate email discussion ]


>> IOW, omap dma has turned into one big tangle of workarounds.
>
> Most of them are hardware shortcomings.

Totally agree. But at some point, working around h/w becomes prohibitive;
maybe not for a vendor tree, but for mainline, yes.


> I think disabling RX-DMA due to missing pause in omap-dma is the only
> workaround that could be avoided if the driver would be changed.
> 
>> Let's start with making a list of which TI designs need which workarounds.
>>
>> *am335x*
> 
> I am not sure if the limitations are based on the DMA engine or the
> 
>> - requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
>>   workaround necessitating completely different tx dma completion handler)
> 
> This one for instance I don't see on BeagleBoard xM / omap36xx and
> DRA7x. Both (not affected) use SDMA instead EDMA.

Ok.
A goal here would be to use serial8250_tx_dma() for these designs
(the delayed restore workaround makes that not possible right now though).

> It would be interesting to see if DRA7x is affected once it uses EDMA.

Agree; because if this problem is limited to the am335x then maybe
tx dma isn't worth doing for that design.


[ cut to "8250 dma issues" ]

[ cut to my 1st reply  ]


>> - generates spurious uart interrupt for every rx dma transaction
>>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>>   _Even with this workaround_, it still generates spurious interrupt warning
>>   which shuts off interrupts for several ms while logging the error
>>   message to the console, virtually guaranteeing lost data.
> 
> as I wrote in my other email I think RDI should be disabled with DMA


I'll test to see if disabling RDI eliminates the UART_IIR_NO_INT spurious
interrupts.

> according the Intel manual and I *think* someone here reported that
> they see the same problem.

Let's confirm with the Intel folks that this is true, which would argue
for using the omap-style rx dma flow.



>> Can any TI design use the baseline 8250 tx dma transaction flow without
>> workarounds?  I know the am335x can't; any others?
>
> Am335x. Has edma and so has dm814x. According to the code, dm814x based
> HW does not need it, can this be confirmed? Sekhar, TOny?

Slight misunderstanding; what I mean is what TI designs can actually use
serial8250_tx_dma()?

Right now, it seems both am335x and dra7 require omap_8250_tx_dma
(for the delayed_restore workaround and the direct write to tx fifo to
kick tx dma).

[ Assume that the UART_CAP_RPM functionality is added to serial8250

8250 dma issues ( was Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes)

2016-04-12 Thread Peter Hurley
[ +to Heikki, Andy ]

On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:

[ elided parts not relevant to shared 8250 dma discussion ]

>> 2. The question of a spurious uart interrupt with every dma transaction
>>on am335x is still unanswered.
> 
> This is correct. If I remember correctly, the Intel people see the same
> thing and I *think* John told me that the Intel manual says that RDI
> should be disabled if DMA is used.

Not sure we're talking about the same thing here?

I mean the 10 UART_IIR_NO_INTs that trigger irq disable, for which
John submitted a patch to ack these. I've never seen any discussion
regarding this problem on intel platforms.

An *extra* 6000+ interrupts/sec. for no purpose is bad.


>> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
>>without pause/resume.
>
> Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
> control is one workaround but the user has no chance of knowing that
> XON/XOFF has been silently disabled.
> 
> You could send the x_char after TX transfer completed. After all you
> need to ensure that you have some space in the TX-FIFO. However if you
> send a 4KiB of data you might want to send x_char rather sooner than
> later. I *think* even with pause the hardware will complete the last
> burst before stopping but is probably better than waiting for the 4KiB
> to complete.

Yeah, it doesn't need to be the very next byte but a better effort should
be made. 4k is lot of scroll-by.


>> 4. Since virt-dma uses tasklets which since 3.8 are no longer serviced
>>in a timely manner, rx dma is unreliable, since it's often kicked out
>>to regular interrupts.
> 
> Is this only the delay in omap_dma_callback() (which you don't have
> !cyclic) or something else? omap_dma_issue_pending() seems to program
> the transfer right away. Oh now I see the same thing in
> edma_completion_handler(). Okay but this affects now everyone that
> relies on low latency?

Well, the real problem is that only one rx buffer is being used serially,
first filled by the dma h/w, then emptied by the driver, then resubmitted.
This creates a gap of time between the dma h/w completion interrupt and
the resubmission where data loss is possible (and happens).

In the omap8250 rx dma flow:

1. Submit 1st rx dma transaction
2. h/w finishes 1st rx dma transaction
3. * schedule completion handler * which may be delayed significant
   amount of time ( > 1ms )
4. data still arriving at uart
5. completion handler eventually runs but too late, so the
   2nd rx dma transaction is not submitted in time to prevent data
   loss.

This rx dma flow creates a hard deadline starting at step 3 and ending at
step 5 above that Linux won't meet.

The normal 8250 dma flow has a similar gap but it seems to be less
frequent, probably because of the disparity in rx buffer size
(4k vs. 48 bytes)

In the 8250 rx dma flow:

1. Receive rx interrupt, submit 1st rx dma transaction
2. h/w finishes 1st rx dma transaction
3. * schedule completion handler * which may be delayed significant
   amount of time ( > 1ms )
4. data still arriving at uart, more rx interrupts received but
   rx_running == T so no add'l transaction scheduled
5. completion handler eventually runs but too late, so the
   2nd rx dma transaction is not submitted in time to prevent data
   loss.

Instead, both platforms should use a ping-pong scheme (or more generally
a ring of buffers): initially submit multiple rx buffers which allows
the dmaengine driver to start immediately on next buffer instead of
waiting for resubmission. Not sure if this will work though, haven't
really experimented with it at all yet.

But that's why I'd like to bring the two implementations closer, so that
maybe both can be replaced with a single rx dma transaction flow.
[ And perhaps extending tty buffer to perform direct fill, skipping the
  buffer copy ]



>> 5. omap dma maintenance is not keeping up with baseline dma.
> 
> John switched to cylic mode so he was not effected very much non-pause
> problem.

But the issue here is mainline, so it's still a problem.
[ FWIW, my main issue with John's approach was the excessive buffering. ]


[ On omap workarounds: ]
>> - requires rx dma already queued before UART data ready interrupt
>>   (ie., necessitates completely different irq handler and rx dma completion
>>   handler)
> 
> true. But is this something that would work for others, too?

Good point, I don't see why not.
Let's find out what the Intel guys think?




Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes

2016-04-12 Thread Peter Hurley
On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>> On 2016-04-05, Peter Hurley  wrote:
>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>> It has been observed that the TX-DMA can stall
>>>>>>
>>>>>> Does this happen on any other OMAP part besides am335x?
>>>>>> I looked back over the LKML history of this and didn't see
>>>>>> any other design implicated in this problem.
>>>>>
>>>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>>>> dma-tx stall with am335x/edma and dra7/sdma.
>>>>
>>>> I thought we already established sdma was not to be used since
>>>> the hardware does not actually support pausing without data loss.
>>>
>>> This workaround was not invented for sdma but for edma (with am335x).
>>
>> According to John above, dra7/sdma requires this workaround.
> 
> It was reported by Frans Klaver against am335x
>  http://lkml.kernel.org/r/20140908183353.GB4686@ci00147.xsens-tech.local
> 
> and I managed to reproduce this with his yocto image on dra7 and am335x:
>  http://lkml.kernel.org/r/20140921204100.ga10...@linutronix.de

[...]

>> - hangs changing some unknown register if tx dma in progress
>>   (ie., this termios change workaround)
> 
> I think some registers are the baud-rate registers which pause engine.

Let's back up here and focus on just this problem for now.

Since this is observable on dra7/sdma, then it's not related to
the OMAP_DMA_TX_KICK problem. IOW, dropping tx dma support for am335x
does not make this go away, which is what I was asking.

Now, if the DLL/DLH/MDR1 register writes are causing dma to stall,
then skipping those if they're unchanged should fix this, and then
pause/terminating in-progress DMA if any of these registers are being
written would be acceptable since some data loss is to be expected
when changing the baud rate without waiting.


Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes

2016-04-11 Thread Peter Hurley
On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>> On 2016-04-05, Peter Hurley  wrote:
>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>> It has been observed that the TX-DMA can stall
>>>>
>>>> Does this happen on any other OMAP part besides am335x?
>>>> I looked back over the LKML history of this and didn't see
>>>> any other design implicated in this problem.
>>>
>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>> dma-tx stall with am335x/edma and dra7/sdma.
>>
>> I thought we already established sdma was not to be used since
>> the hardware does not actually support pausing without data loss.
> 
> This workaround was not invented for sdma but for edma (with am335x).

According to John above, dra7/sdma requires this workaround.


>> http://www.spinics.net/lists/linux-serial/msg18503.html
> 
> This could be fixed. See
>   http://www.spinics.net/lists/linux-serial/msg18517.html
>   http://www.spinics.net/lists/linux-serial/msg18531.html
> 
> rmk was fine with it from what I read. So what is missing is just
> refurbish the patch (update the comment according to rmk replay) and
> then we could re-enable DMA again.

That's hardly all that is required.

1. edma pause returns an error if the descriptor has already been retired
   when a pause is attempted. This makes distinguishing between reporting an
   error for unsupported feature indistinguishable from a transient dma
   error that can simply be logged.
2. The question of a spurious uart interrupt with every dma transaction
   on am335x is still unanswered.
3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
   without pause/resume.
4. Since virt-dma uses tasklets which since 3.8 are no longer serviced
   in a timely manner, rx dma is unreliable, since it's often kicked out
   to regular interrupts.
5. omap dma maintenance is not keeping up with baseline dma.


IOW, omap dma has turned into one big tangle of workarounds.

Let's start with making a list of which TI designs need which workarounds.

*am335x*
- requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
  workaround necessitating completely different tx dma completion handler)
- requires rx dma already queued before UART data ready interrupt
  (ie., necessitates completely different irq handler and rx dma completion
  handler)
- hangs changing some unknown register if tx dma in progress
  (ie., this termios change workaround)
- generates spurious uart interrupt for every rx dma transaction
  (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
  _Even with this workaround_, it still generates spurious interrupt warning
  which shuts off interrupts for several ms while logging the error
  message to the console, virtually guaranteeing lost data.


Can any TI design use the baseline 8250 tx dma transaction flow without
workarounds?  I know the am335x can't; any others?
Can any TI design use the baseline 8250 rx dma transaction flow without
workarounds?  Again, I know the am335x can't; any others?



>> So I'm wondering if we're carrying all this extra DMA complexity
>> and workarounds for just am335x?
>>
> 
> Sebastian
> 



Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes

2016-04-11 Thread Peter Hurley
On 04/11/2016 01:18 AM, John Ogness wrote:
> On 2016-04-05, Peter Hurley  wrote:
>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>> It has been observed that the TX-DMA can stall
>>
>> Does this happen on any other OMAP part besides am335x?
>> I looked back over the LKML history of this and didn't see
>> any other design implicated in this problem.
> 
> I just ran the tests again using 4.6-rc2. I am able to reproduce the
> dma-tx stall with am335x/edma and dra7/sdma.

I thought we already established sdma was not to be used since
the hardware does not actually support pausing without data loss.

http://www.spinics.net/lists/linux-serial/msg18503.html


So I'm wondering if we're carrying all this extra DMA complexity
and workarounds for just am335x?


> Note: To achieve the stall, both the delayed_restore and the
>   rx_dma_broken features of the mainline 8250_omap driver needed to
>   be removed.
> 
> John Ogness
> 



Re: [PATCH resend v2] tty: n_tty -- Add new TIOCPEEKRAW ioctl to peek unread data

2016-04-11 Thread Peter Hurley
On 04/11/2016 02:01 AM, Cyrill Gorcunov wrote:
> On Sun, Apr 10, 2016 at 10:43:28PM -0700, Peter Hurley wrote:
>> Hi Cyrill,
>>
>> On 04/07/2016 05:53 AM, Cyrill Gorcunov wrote:
>>> Currently when we checkpoint PTY connections we ignore data queued inside
>>> read buffer of ldisk, such data is barely lost after the restore. So we
>>> would like to have an ability to dump and restore it.
> 
> Hi Peter! First of all, thanks a huge for comments!
> 
>> Do you want to be able to restore the exact state (like column and
>> newline markers)?
>>
>> I'm wondering if a for-purpose checkpoint/restore ioctl would be more
>> appropriate?

My thoughts here are two-fold.

1. I'd like whatever functionality is to be added to support checkpoint/
restore to be a restrictive as possible while still accomplishing your
objectives _exactly because I don't want it to grow other uses_.
That's why I suggested asserting the task state (and maybe adding
ptrace state checks) so that the interface can only be used for this
purpose.

2. If we want to make changes because the existing read()/write()
interface is inadequate, let's add a complete interface.

> Strictly speaking saving complete state would be ideal but this require
> exporting internal n_tty structure into user space as api (sure we
> can name the members in any way we like, but this would force then
> to keep backward compatibility when n_tty code get changed in future).
> I think loosing some information like column and marker is acceptable
> and fetching unread buffers is more general. I can try to implement
> c/r'ing ioctl though which would not only fetch buffers but column
> and such, and see how it goes, sounds ok?

My thinking here was to use an opaque buffer; it could be sized in the
same manner as the peek interface; ie., provide a buffer address with
size, and the call fails with an updated size required if inadequate
(or whatever, but I think you get the idea).


>> A generic peek() may find other uses which could make future improvements
>> difficult.
>>
>> Plus this ioctl() is only reading the 4k line discipline read buffer, and
>> not the tty buffers which could contain up to another 8k of unprocessed
>> input.
> 
> Which is scheduled for flush into port and then landed into ld when
> write() called. True, I managed to miss this moment, thanks! So the
> proper general peek() handling should be implemented on tty code level
> rather than ld, and fetch both -- start with tty buffer and proceed with
> ld buffer then, right?

To safely copy the tty buffers, it needs to be implemented like
tty_buffer_flush():

1. get an ldisc reference (ioctls don't automatically do this)

ld = tty_ldisc_ref(tty);
if (!ld)
return -EIO;

2. lock out any other consumers of tty buffers:

tty_buffer_lock_exclusive(tty->port);

   This will abort any in-progress input kworker and ensures neither
   buf->head nor head->read advance.

   * This does not lock out parallel producers so it's important to   *
   * guarantee there are no parallel writers (see my lock discussion) *

3. walk the list of tty buffers, copying each one:

struct tty_buffer *head = port->buf->head;
while (head) {
count = smp_load_acquire(&head->commit) - head->read)
if (count)
/* copy tty buffer */
head = smp_load_acquire(head->next);
}

4. Now call into the line discipline to copy its data:

if (ld->ops->checkpoint)
ld->ops->checkpoint()

5. Restart kworker

tty_buffer_unlock_exclusive(tty->port);

6. Release ldisc reference

tty_ldisc_deref(ld);



>>> Here is a new ioctl code which simply copies data from read buffer into
>>> the userspace without any additional processing (just like terminal is
>>> sitting in a raw mode).
>>>
>>> The idea behind is when we checkpointing PTY pairs we peek this unread
>>> data into images on disk and on restore phase we find the linked peer
>>> and write it back thus reading peer will see it in read buffer.
>>
>> Have you frozen every process with either end open? I could see how this 
>> would
>> be easy for an entire process group but what if the processes are unrelated?
>> How do you ensure you have the correct linked peer with multiple devpts 
>> instances?
> 
> Currently we freeze all processes which are requested for dumping and then
> we walk over all files. Once tty peer is met (actually we support pty
> and vt only at the moment) we remember its index via TIOCGPTN (for master
> peers) and expect to find appropriate slave peer. You know, h

Re: [PATCH] drivers/tty: fix BUG_ON when calls serial8250_suspend_port

2016-04-11 Thread Peter Hurley
On 04/07/2016 11:33 PM, Yang Yingliang wrote:
> My kthread calls serial8250_suspend_port/serial8250_resume_port when the
> system is booting, it triggers a BUG_ON:
> 
> BUG: failure at 
> drivers/tty/serial/8250/8250_core.c:1852/serial_unlink_irq_chain()!
> Kernel panic - not syncing: BUG!
> CPU: 21 PID: 927 Comm: uart_debug_thre Not tainted 4.1.18+ #96
> Call trace:
> [] dump_backtrace+0x0/0x128
> [] show_stack+0x14/0x1c
> [] dump_stack+0x88/0xa8
> [] panic+0xe4/0x228
> [] univ8250_release_irq+0xc0/0x124
> [] serial8250_do_shutdown+0xdc/0x144
> [] serial8250_shutdown+0x20/0x28
> [] uart_suspend_port+0x200/0x234
> [] serial8250_suspend_port+0x5c/0xac
> [] uart_debug+0x24/0x3c
> [] kthread+0xdc/0xf0
> 
> I found port->flags is used without lock in tty_port_block_til_ready().


Actually the port->flags are updated within the port->lock critical
section, but as you show below, mixing spinlocked protection with
atomic bit ops is bad.

I fixed this problem more generally, as there are many more places
the port flags are accessed in a non-atomic way.

Please test my series "Replace kernel-defined ASYNC_ bits"; note
that series also has two other patch dependencies which are detailed
in the cover letter.


Regards,
Peter Hurley


> The flags
> will be overwritten with old value after it updates in uart_suspend_port().
>   CPU A   CPU B
>   in uart_suspend_portin tty_port_block_til_ready
>   get port->flags
>   clear ASYNCB_INITIALIZED
>   set port->flags
> The flags still has ASYNCB_INITIALIZED. When uart_suspend_port is called next 
> time,
> it trriggers the BUG_ON. So use set_bit/clear_bit to avoid the flags being 
> overwritten.
> 
> Cc: Greg Kroah-Hartman 
> Cc: Jiri Slaby 
> Signed-off-by: Yang Yingliang 
> ---
>  drivers/tty/tty_port.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> index 40b3183..d57cf70 100644
> --- a/drivers/tty/tty_port.c
> +++ b/drivers/tty/tty_port.c
> @@ -237,7 +237,7 @@ void tty_port_hangup(struct tty_port *port)
>  
>   spin_lock_irqsave(&port->lock, flags);
>   port->count = 0;
> - port->flags &= ~ASYNC_NORMAL_ACTIVE;
> + clear_bit(ASYNCB_NORMAL_ACTIVE, &port->flags);
>   tty = port->tty;
>   if (tty)
>   set_bit(TTY_IO_ERROR, &tty->flags);
> @@ -376,14 +376,14 @@ int tty_port_block_til_ready(struct tty_port *port,
>   /* if non-blocking mode is set we can pass directly to open unless
>  the port has just hung up or is in another error state */
>   if (tty->flags & (1 << TTY_IO_ERROR)) {
> - port->flags |= ASYNC_NORMAL_ACTIVE;
> + set_bit(ASYNCB_NORMAL_ACTIVE, &port->flags);
>   return 0;
>   }
>   if (filp->f_flags & O_NONBLOCK) {
>   /* Indicate we are open */
>   if (tty->termios.c_cflag & CBAUD)
>   tty_port_raise_dtr_rts(port);
> - port->flags |= ASYNC_NORMAL_ACTIVE;
> + set_bit(ASYNCB_NORMAL_ACTIVE, &port->flags);
>   return 0;
>   }
>  
> @@ -443,7 +443,7 @@ int tty_port_block_til_ready(struct tty_port *port,
>   port->count++;
>   port->blocked_open--;
>   if (retval == 0)
> - port->flags |= ASYNC_NORMAL_ACTIVE;
> + set_bit(ASYNCB_NORMAL_ACTIVE, &port->flags);
>   spin_unlock_irqrestore(&port->lock, flags);
>   return retval;
>  }
> @@ -533,7 +533,8 @@ void tty_port_close_end(struct tty_port *port, struct 
> tty_struct *tty)
>   spin_lock_irqsave(&port->lock, flags);
>   wake_up_interruptible(&port->open_wait);
>   }
> - port->flags &= ~(ASYNC_NORMAL_ACTIVE | ASYNC_CLOSING);
> + clear_bit(ASYNCB_NORMAL_ACTIVE, &port->flags);
> + clear_bit(ASYNCB_CLOSING, &port->flags);
>   wake_up_interruptible(&port->close_wait);
>   spin_unlock_irqrestore(&port->lock, flags);
>  }
> 



Re: [PATCH v2 07/11] serial: 8250: enable AFE on ports where FIFO is 16 bytes

2016-04-11 Thread Peter Hurley
On 04/11/2016 05:51 AM, Andy Shevchenko wrote:
> Intel Quark has 16550A compatible UART with autoflow feature enabled. It has
> only 16 bytes of FIFO. Currently serial8250_do_set_termios() prevents to 
> enable
> autoflow since the minimum requirement of 32 bytes of FIFO size.
> 
> Drop a FIFO size limitation to allow autoflow control be enabled on such 
> UARTs.
> 
> While here, comment out UART_CAP_AFE for PORT_AR7 since it wasn't working and
> it will be not a good idea to use it in conjunction with trigger level of 1
> byte.

Reviewed-by: Peter Hurley 



Re: [PATCH resend v2] tty: n_tty -- Add new TIOCPEEKRAW ioctl to peek unread data

2016-04-10 Thread Peter Hurley
Hi Cyrill,

On 04/07/2016 05:53 AM, Cyrill Gorcunov wrote:
> Currently when we checkpoint PTY connections we ignore data queued inside
> read buffer of ldisk, such data is barely lost after the restore. So we
> would like to have an ability to dump and restore it.

Do you want to be able to restore the exact state (like column and
newline markers)?

I'm wondering if a for-purpose checkpoint/restore ioctl would be more
appropriate?

A generic peek() may find other uses which could make future improvements
difficult.

Plus this ioctl() is only reading the 4k line discipline read buffer, and
not the tty buffers which could contain up to another 8k of unprocessed
input.


> Here is a new ioctl code which simply copies data from read buffer into
> the userspace without any additional processing (just like terminal is
> sitting in a raw mode).
> 
> The idea behind is when we checkpointing PTY pairs we peek this unread
> data into images on disk and on restore phase we find the linked peer
> and write it back thus reading peer will see it in read buffer.

Have you frozen every process with either end open? I could see how this would
be easy for an entire process group but what if the processes are unrelated?
How do you ensure you have the correct linked peer with multiple devpts 
instances?


> I tried several approaches (including simply reuse of canon_copy_from_read_buf
> and copy_from_read_buf) but it makes code more complicated, so I end up
> in plain copying of buffer data.
> 
> Strictly speaking, we could try handle it inside CRIU itself, as
> Peter Hurley proposed (switch terminal to raw mode, read data,
> and return original termios back) but in this case we may hit
> the scenario when we read data and failed to restore it back
> (due to any error) leaving the task we're dumping without
> read buffer, thus peeking data looks like be the only
> safe way.

Code comments below.


> v2:
>  - Use @char in ioctl definition.
> 
> Signed-off-by: Cyrill Gorcunov 
> CC: Jiri Slaby 
> CC: Peter Hurley 
> CC: Greg Kroah-Hartman 
> CC: Andrey Vagin 
> CC: Pavel Emelianov 
> CC: Vladimir Davydov 
> CC: Konstantin Khorenko 
> ---
>  drivers/tty/n_tty.c   |   22 ++
>  include/uapi/asm-generic/ioctls.h |1 +
>  2 files changed, 23 insertions(+)
> 
> Index: linux-ml.git/drivers/tty/n_tty.c
> ===
> --- linux-ml.git.orig/drivers/tty/n_tty.c
> +++ linux-ml.git/drivers/tty/n_tty.c
> @@ -2420,6 +2420,26 @@ static unsigned long inq_canon(struct n_
>   return nr;
>  }
>  
> +static ssize_t n_tty_peek_raw(struct tty_struct *tty, unsigned char __user 
> *buf)

function name is redundant; n_tty_peek() is sufficient.

Needs a buffer-length parameter; could return the size required to completely
copy the ldisc read buffer, if smaller than required.


> +{
> + struct n_tty_data *ldata = tty->disc_data;
> + ssize_t retval;
> +
> + if (!mutex_trylock(&ldata->atomic_read_lock))
> + return -EAGAIN;
> +
> + down_read(&tty->termios_rwsem);

Why take these two locks if parallel operations are not expected?
Conversely, I would expect this to assert current state is TASK_TRACED.

Assuming the other pty end is halted, this ioctl would also need to
wait for the input kworker to stop:

--- >% ---
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index a946e49..744cb87 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -614,3 +614,8 @@ bool tty_buffer_cancel_work(struct tty_port *port)
 {
return cancel_work_sync(&port->buf.work);
 }
+
+void tty_buffer_flush_work(struct tty_port *port)
+{
+   flush_work(&port->buf.work);
+}
diff --git a/include/linux/tty.h b/include/linux/tty.h
index bf1bcdb..a541132 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -480,6 +480,7 @@ extern void tty_buffer_init(struct tty_port *port);
 extern void tty_buffer_set_lock_subclass(struct tty_port *port);
 extern bool tty_buffer_restart_work(struct tty_port *port);
 extern bool tty_buffer_cancel_work(struct tty_port *port);
+extern void tty_buffer_flush_work(struct tty_port *port);
 extern speed_t tty_termios_baud_rate(struct ktermios *termios);
 extern speed_t tty_termios_input_baud_rate(struct ktermios *termios);
 extern void tty_termios_encode_baud_rate(struct ktermios *termios,





tty_buffer_flush_work(tty->port);

> + retval = read_cnt(ldata);
> + if (retval) {
> + const unsigned char *from = read_buf_addr(ldata, 
> ldata->read_tail);
> + if (copy_to_user(buf, from, retval))
> + retval = -EFAULT;
> + }
> + up_read(&tty->termios_rwsem);
> + mutex_unlock(&lda

Re: [PATCH 2/2] tty: Remove stale parameter comment

2016-04-10 Thread Peter Hurley
On 04/10/2016 01:24 AM, Jiri Slaby wrote:
> On 04/10/2016, 05:36 AM, Peter Hurley wrote:
>> noctty was removed as a parameter by commit 216513411937586
> 
> It is actually 11e1d4aa4da.

Right; not sure where that number came from.


> 
>> ("tty: Consolidate noctty check in tty_open()").
>>
>> Signed-off-by: Peter Hurley 
>> ---
>>  drivers/tty/tty_io.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>> index 3cdd63b..50979be 100644
>> --- a/drivers/tty/tty_io.c
>> +++ b/drivers/tty/tty_io.c
>> @@ -1960,7 +1960,6 @@ static struct tty_struct *tty_open_current_tty(dev_t 
>> device, struct file *filp)
>>   *  tty_lookup_driver - lookup a tty driver for a given device file
>>   *  @device: device number
>>   *  @filp: file pointer to tty
>> - *  @noctty: set if the device should not become a controlling tty
>>   *  @index: index for the device in the @return driver
>>   *  @return: driver for this inode (with increased refcount)
>>   *
>>
> 
> 



[PATCH 1/2] tty: Remove unused TTY_NUMBER() macro

2016-04-09 Thread Peter Hurley
TTY_NUMBER() has been unused since v2.5.71; removed by
"[PATCH] callout removal: callout is gone".

Signed-off-by: Peter Hurley 
---
 drivers/tty/tty_io.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 320dc4d..3cdd63b 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -230,9 +230,6 @@ static void tty_del_file(struct file *file)
tty_free_file(file);
 }
 
-
-#define TTY_NUMBER(tty) ((tty)->index + (tty)->driver->name_base)
-
 /**
  * tty_name-   return tty naming
  * @tty: tty structure
-- 
2.8.1



[PATCH 2/2] tty: Remove stale parameter comment

2016-04-09 Thread Peter Hurley
noctty was removed as a parameter by commit 216513411937586
("tty: Consolidate noctty check in tty_open()").

Signed-off-by: Peter Hurley 
---
 drivers/tty/tty_io.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 3cdd63b..50979be 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1960,7 +1960,6 @@ static struct tty_struct *tty_open_current_tty(dev_t 
device, struct file *filp)
  * tty_lookup_driver - lookup a tty driver for a given device file
  * @device: device number
  * @filp: file pointer to tty
- * @noctty: set if the device should not become a controlling tty
  * @index: index for the device in the @return driver
  * @return: driver for this inode (with increased refcount)
  *
-- 
2.8.1



[PATCH 1/8] tty: Define ASYNC_ replacement bits

2016-04-09 Thread Peter Hurley
Prepare for relocating kernel private state bits out of tty_port::flags
field; tty_port::flags field is not atomic and can become corrupted
by concurrent updates. It also suffers from the complication of sharing
in a userspace-visible field which must be masked.

Define new tty_port::iflags field and new, substitute bit definitions
for the former ASYNC_* flags.

Signed-off-by: Peter Hurley 
---
 include/linux/tty.h| 16 +++-
 include/uapi/linux/tty_flags.h |  9 -
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 89f9c91..4e0dbda0 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -228,7 +228,8 @@ struct tty_port {
int count;  /* Usage count */
wait_queue_head_t   open_wait;  /* Open waiters */
wait_queue_head_t   delta_msr_wait; /* Modem status change */
-   unsigned long   flags;  /* TTY flags ASY_*/
+   unsigned long   flags;  /* User TTY flags ASYNC_ */
+   unsigned long   iflags; /* Internal flags TTY_PORT_ */
unsigned char   console:1,  /* port is a console */
low_latency:1;  /* optional: tune for latency */
struct mutexmutex;  /* Locking */
@@ -242,6 +243,19 @@ struct tty_port {
struct kref kref;   /* Ref counter */
 };
 
+/* tty_port::iflags bits -- use atomic bit ops */
+#define TTY_PORT_INITIALIZED   0   /* device is initialized */
+#define TTY_PORT_SUSPENDED 1   /* device is suspended */
+#define TTY_PORT_ACTIVE2   /* device is open */
+
+/*
+ * uart drivers: use the uart_port::status field and the UPSTAT_* defines
+ * for s/w-based flow control steering and carrier detection status
+ */
+#define TTY_PORT_CTS_FLOW  3   /* h/w flow control enabled */
+#define TTY_PORT_CHECK_CD  4   /* carrier detect enabled */
+
+
 /*
  * Where all of the state associated with a tty is kept while the tty
  * is open.  Since the termios state should be kept even if the tty
diff --git a/include/uapi/linux/tty_flags.h b/include/uapi/linux/tty_flags.h
index 072e41e..8e1a436 100644
--- a/include/uapi/linux/tty_flags.h
+++ b/include/uapi/linux/tty_flags.h
@@ -32,7 +32,12 @@
 #define ASYNCB_MAGIC_MULTIPLIER16 /* Use special CLK or divisor */
 #define ASYNCB_LAST_USER   16
 
-/* Internal flags used only by kernel */
+/*
+ * Internal flags used only by kernel (read-only)
+ *
+ * WARNING: These flags are no longer used and have been superceded by the
+ * TTY_PORT_ flags in the iflags field (and not userspace-visible)
+ */
 #define ASYNCB_INITIALIZED 31 /* Serial port was initialized */
 #define ASYNCB_SUSPENDED   30 /* Serial port is suspended */
 #define ASYNCB_NORMAL_ACTIVE   29 /* Normal device is active */
@@ -44,6 +49,7 @@
 #define ASYNCB_CONS_FLOW   23 /* flow control for console  */
 #define ASYNCB_FIRST_KERNEL22
 
+/* Masks */
 #define ASYNC_HUP_NOTIFY   (1U << ASYNCB_HUP_NOTIFY)
 #define ASYNC_SUSPENDED(1U << ASYNCB_SUSPENDED)
 #define ASYNC_FOURPORT (1U << ASYNCB_FOURPORT)
@@ -72,6 +78,7 @@
 #define ASYNC_SPD_WARP (ASYNC_SPD_HI|ASYNC_SPD_SHI)
 #define ASYNC_SPD_MASK (ASYNC_SPD_HI|ASYNC_SPD_VHI|ASYNC_SPD_SHI)
 
+/* These flags are no longer used (and were always masked from userspace) */
 #define ASYNC_INITIALIZED  (1U << ASYNCB_INITIALIZED)
 #define ASYNC_NORMAL_ACTIVE(1U << ASYNCB_NORMAL_ACTIVE)
 #define ASYNC_BOOT_AUTOCONF(1U << ASYNCB_BOOT_AUTOCONF)
-- 
2.8.1



[PATCH 7/8] tty: mxser: Remove unused ASYNC_SHARE_IRQ flag

2016-04-09 Thread Peter Hurley
ASYNC*_SHARE_IRQ is no longer used; remove.

Signed-off-by: Peter Hurley 
---
 drivers/tty/mxser.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/tty/mxser.c b/drivers/tty/mxser.c
index 7e8c27b..98d2bd1 100644
--- a/drivers/tty/mxser.c
+++ b/drivers/tty/mxser.c
@@ -2392,7 +2392,6 @@ static int mxser_initbrd(struct mxser_board *brd,
if (brd->chip_flag != MOXA_OTHER_UART)
mxser_enable_must_enchance_mode(info->ioaddr);
 
-   info->port.flags = ASYNC_SHARE_IRQ;
info->type = brd->uart_type;
 
process_txrx_fifo(info);
-- 
2.8.1



[PATCH 3/8] tty: Replace ASYNC_NORMAL_ACTIVE bit and update atomically

2016-04-09 Thread Peter Hurley
Replace ASYNC_NORMAL_ACTIVE bit in the tty_port::flags field with
TTY_PORT_ACTIVE bit in the tty_port::iflags field. Introduce helpers
tty_port_set_active() and tty_port_active() to abstract atomic bit ops.

Extract state changes from port lock sections, as this usage is
broken and confused; the state transitions are protected by the
tty lock (which mutually excludes parallel open/close/hangup),
and no user tests the active state while holding the port lock.

Signed-off-by: Peter Hurley 
---
 drivers/isdn/i4l/isdn_tty.c  | 20 +---
 drivers/tty/amiserial.c  |  2 +-
 drivers/tty/rocket.c |  5 +++--
 drivers/tty/serial/crisv10.c |  8 
 drivers/tty/serial/serial_core.c |  8 
 drivers/tty/synclink.c   |  6 +++---
 drivers/tty/synclink_gt.c|  6 +++---
 drivers/tty/synclinkmp.c |  6 +++---
 drivers/tty/tty_port.c   | 12 ++--
 include/linux/tty.h  | 12 
 net/irda/ircomm/ircomm_tty.c | 10 +-
 11 files changed, 53 insertions(+), 42 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index f1edc08..d8468f3 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1622,7 +1622,7 @@ isdn_tty_hangup(struct tty_struct *tty)
return;
isdn_tty_shutdown(info);
port->count = 0;
-   port->flags &= ~ASYNC_NORMAL_ACTIVE;
+   tty_port_set_active(port, 0);
port->tty = NULL;
wake_up_interruptible(&port->open_wait);
 }
@@ -1979,7 +1979,7 @@ isdn_tty_find_icall(int di, int ch, setup_parm *setup)
 #endif
if (
 #ifndef FIX_FILE_TRANSFER
-   (info->port.flags & ASYNC_NORMAL_ACTIVE) &&
+   tty_port_active(&info->port) &&
 #endif
(info->isdn_driver == -1) &&
(info->isdn_channel == -1) &&
@@ -2018,8 +2018,6 @@ isdn_tty_find_icall(int di, int ch, setup_parm *setup)
return (wret == 2) ? 3 : 0;
 }
 
-#define TTY_IS_ACTIVE(info)(info->port.flags & ASYNC_NORMAL_ACTIVE)
-
 int
 isdn_tty_stat_callback(int i, isdn_ctrl *c)
 {
@@ -2077,7 +2075,7 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
 #ifdef ISDN_TTY_STAT_DEBUG
printk(KERN_DEBUG "tty_STAT_DCONN ttyI%d\n", 
info->line);
 #endif
-   if (TTY_IS_ACTIVE(info)) {
+   if (tty_port_active(&info->port)) {
if (info->dialing == 1) {
info->dialing = 2;
return 1;
@@ -2088,7 +2086,7 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
 #ifdef ISDN_TTY_STAT_DEBUG
printk(KERN_DEBUG "tty_STAT_DHUP ttyI%d\n", info->line);
 #endif
-   if (TTY_IS_ACTIVE(info)) {
+   if (tty_port_active(&info->port)) {
if (info->dialing == 1)
isdn_tty_modem_result(RESULT_BUSY, 
info);
if (info->dialing > 1)
@@ -2118,7 +2116,7 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
 * waiting for it and
 * set DCD-bit of its modem-status.
 */
-   if (TTY_IS_ACTIVE(info) ||
+   if (tty_port_active(&info->port) ||
(info->port.blocked_open &&
 (info->emu.mdmreg[REG_DCD] & BIT_DCD))) {
info->msr |= UART_MSR_DCD;
@@ -2145,7 +2143,7 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
 #ifdef ISDN_TTY_STAT_DEBUG
printk(KERN_DEBUG "tty_STAT_BHUP ttyI%d\n", info->line);
 #endif
-   if (TTY_IS_ACTIVE(info)) {
+   if (tty_port_active(&info->port)) {
 #ifdef ISDN_DEBUG_MODEM_HUP
printk(KERN_DEBUG "Mhup in ISDN_STAT_BHUP\n");
 #endif
@@ -2157,7 +2155,7 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
 #ifdef ISDN_TTY_STAT_DEBUG
printk(KERN_DEBUG "tty_STAT_NODCH ttyI%d\n", 
info->line);
 #endif
-   if (TTY_IS_ACTIVE(info)) {
+   if (tty_port_active(&info->port)) {
if (info->dialing) {
info->dialing = 0;
info->last_l2 = -1;
@@ -2183,14 +2181,14 @@ isdn_tty_stat_callback(int i, isdn_ctrl *c)
return 1;
 #ifdef CONFIG_ISDN_TTY_FAX
case ISDN_STAT_FAXIND:
-   if (TTY_IS_ACTIVE(info)) {
+

[PATCH 4/8] tty: Replace ASYNC_CHECK_CD and update atomically

2016-04-09 Thread Peter Hurley
Replace ASYNC_CHECK_CD bit in the tty_port::flags field with
TTY_PORT_CHECK_CD bit in the tty_port::iflags field. Introduce helpers
tty_port_set_check_carrier() and tty_port_check_carrier() to abstract
the atomic bit ops.

Signed-off-by: Peter Hurley 
---
 drivers/char/pcmcia/synclink_cs.c   |  8 ++--
 drivers/isdn/i4l/isdn_tty.c |  8 ++--
 drivers/tty/amiserial.c |  9 +++--
 drivers/tty/cyclades.c  | 14 --
 drivers/tty/isicom.c|  7 ++-
 drivers/tty/mxser.c |  9 +++--
 drivers/tty/synclink.c  |  8 ++--
 drivers/tty/synclink_gt.c   |  8 ++--
 drivers/tty/synclinkmp.c|  8 ++--
 include/linux/tty.h | 13 +
 net/irda/ircomm/ircomm_tty.c|  4 ++--
 net/irda/ircomm/ircomm_tty_attach.c |  2 +-
 net/irda/ircomm/ircomm_tty_ioctl.c  |  5 +
 13 files changed, 39 insertions(+), 64 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index bdf41ac..bf54f4e 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -1101,7 +1101,7 @@ static void dcd_change(MGSLPC_INFO *info, struct 
tty_struct *tty)
wake_up_interruptible(&info->status_event_wait_q);
wake_up_interruptible(&info->event_wait_q);
 
-   if (info->port.flags & ASYNC_CHECK_CD) {
+   if (tty_port_check_carrier(&info->port)) {
if (debug_level >= DEBUG_LEVEL_ISR)
printk("%s CD now %s...", info->device_name,
   (info->serial_signals & SerialSignal_DCD) ? "on" 
: "off");
@@ -1467,11 +1467,7 @@ static void mgslpc_change_params(MGSLPC_INFO *info, 
struct tty_struct *tty)
info->timeout += HZ/50; /* Add .02 seconds of slop */
 
tty_port_set_cts_flow(&info->port, cflag & CRTSCTS);
-
-   if (cflag & CLOCAL)
-   info->port.flags &= ~ASYNC_CHECK_CD;
-   else
-   info->port.flags |= ASYNC_CHECK_CD;
+   tty_port_set_check_carrier(&info->port, ~cflag & CLOCAL);
 
/* process tty input control flags */
 
diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index d8468f3..023a350a 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1043,11 +1043,7 @@ isdn_tty_change_speed(modem_info *info)
if (!(cflag & PARODD))
cval |= UART_LCR_EPAR;
 
-   if (cflag & CLOCAL)
-   port->flags &= ~ASYNC_CHECK_CD;
-   else {
-   port->flags |= ASYNC_CHECK_CD;
-   }
+   tty_port_set_check_carrier(port, ~cflag & CLOCAL);
 }
 
 static int
@@ -2526,7 +2522,7 @@ isdn_tty_modem_result(int code, modem_info *info)
if (info->closing || (!info->port.tty))
return;
 
-   if (info->port.flags & ASYNC_CHECK_CD)
+   if (tty_port_check_carrier(&info->port))
tty_hangup(info->port.tty);
}
 }
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index 80d6165..b4ab97d 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -398,7 +398,7 @@ static void check_modem_status(struct serial_state *info)
wake_up_interruptible(&port->delta_msr_wait);
}
 
-   if ((port->flags & ASYNC_CHECK_CD) && (dstatus & SER_DCD)) {
+   if (tty_port_check_carrier(port) && (dstatus & SER_DCD)) {
 #if (defined(SERIAL_DEBUG_OPEN) || defined(SERIAL_DEBUG_INTR))
printk("ttyS%d CD now %s...", info->line,
   (!(status & SER_DCD)) ? "on" : "off");
@@ -730,12 +730,9 @@ static void change_speed(struct tty_struct *tty, struct 
serial_state *info,
tty_port_set_cts_flow(port, cflag & CRTSCTS);
if (cflag & CRTSCTS)
info->IER |= UART_IER_MSI;
-   if (cflag & CLOCAL)
-   port->flags &= ~ASYNC_CHECK_CD;
-   else {
-   port->flags |= ASYNC_CHECK_CD;
+   tty_port_set_check_carrier(port, ~cflag & CLOCAL);
+   if (~cflag & CLOCAL)
info->IER |= UART_IER_MSI;
-   }
/* TBD:
 * Does clearing IER_MSI imply that we should disable the VBL interrupt 
?
 */
diff --git a/drivers/tty/cyclades.c b/drivers/tty/cyclades.c
index 1a12776..9d1e19b 100644
--- a/drivers/tty/cyclades.c
+++ b/drivers/tty/cyclades.c
@@ -714,7 +714,7 @@ static void cyy_chip_modem(struct cyclades_card *cinfo, int 
chip,
wake_up_interruptible(&info->port.delta_msr_wait);
}
 
-   if ((mdm_change & CyDCD) && (info->port.flags & ASYNC_CHECK_CD)) {
+   if ((mdm_chang

[PATCH 8/8] tty: core: Undefine ASYNC_* flags superceded by TTY_PORT* flags

2016-04-09 Thread Peter Hurley
Purposefully break out-of-tree driver compiles using kernel
ASYNC_* bits which have been superceded by TTY_PORT* flags and
their respective helper functions.

Signed-off-by: Peter Hurley 
---
 include/uapi/linux/tty_flags.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/tty_flags.h b/include/uapi/linux/tty_flags.h
index 8e1a436..66e4d8b 100644
--- a/include/uapi/linux/tty_flags.h
+++ b/include/uapi/linux/tty_flags.h
@@ -38,6 +38,7 @@
  * WARNING: These flags are no longer used and have been superceded by the
  * TTY_PORT_ flags in the iflags field (and not userspace-visible)
  */
+#ifndef _KERNEL_
 #define ASYNCB_INITIALIZED 31 /* Serial port was initialized */
 #define ASYNCB_SUSPENDED   30 /* Serial port is suspended */
 #define ASYNCB_NORMAL_ACTIVE   29 /* Normal device is active */
@@ -48,6 +49,7 @@
 #define ASYNCB_SHARE_IRQ   24 /* for multifunction cards, no longer used */
 #define ASYNCB_CONS_FLOW   23 /* flow control for console  */
 #define ASYNCB_FIRST_KERNEL22
+#endif
 
 /* Masks */
 #define ASYNC_HUP_NOTIFY   (1U << ASYNCB_HUP_NOTIFY)
@@ -78,6 +80,7 @@
 #define ASYNC_SPD_WARP (ASYNC_SPD_HI|ASYNC_SPD_SHI)
 #define ASYNC_SPD_MASK (ASYNC_SPD_HI|ASYNC_SPD_VHI|ASYNC_SPD_SHI)
 
+#ifndef _KERNEL_
 /* These flags are no longer used (and were always masked from userspace) */
 #define ASYNC_INITIALIZED  (1U << ASYNCB_INITIALIZED)
 #define ASYNC_NORMAL_ACTIVE(1U << ASYNCB_NORMAL_ACTIVE)
@@ -88,5 +91,6 @@
 #define ASYNC_SHARE_IRQ(1U << ASYNCB_SHARE_IRQ)
 #define ASYNC_CONS_FLOW(1U << ASYNCB_CONS_FLOW)
 #define ASYNC_INTERNAL_FLAGS   (~((1U << ASYNCB_FIRST_KERNEL) - 1))
+#endif
 
 #endif
-- 
2.8.1



[PATCH 0/8] Replace kernel-defined ASYNC_ bits

2016-04-09 Thread Peter Hurley
As outlined in my January email ("RFC: out-of-tree tty driver breakage"),
the tty/serial core uses 5 bits in the tty_port.flags field to manage
state. They are:

ASYNCB_INITIALIZED
ASYNCB_SUSPENDED
ASYNCB_NORMAL_ACTIVE
ASYNCB_CTS_FLOW
ASYNCB_CHECK_CD

(NB: ASYNC_CLOSING was recently removed)

However, updates to this field (tty_port.flags) can be and often are
non-atomic. Additionally, the field is visible to/modifiable by userspace
(the state bits above are not modifiable by userspace though).

This series moves these state bits into a different tty_port field
(iflags) and abstracts state tests and changes with trivial helpers.

The last patch of the series purposefully breaks out-of-tree driver
builds to ensure they update state test/change methods to the helpers
instead.

REQUIRES: "tty: Replace TTY_IO_ERROR bit tests with tty_io_error()"
  "tty: Replace TTY_THROTTLED bit tests with tty_throttled()"

Regards,

Peter Hurley (8):
  tty: Define ASYNC_ replacement bits
  tty: Replace ASYNC_CTS_FLOW bit and update atomically
  tty: Replace ASYNC_NORMAL_ACTIVE bit and update atomically
  tty: Replace ASYNC_CHECK_CD and update atomically
  tty: Replace ASYNC_SUSPENDED bit and update atomically
  tty: Replace ASYNC_INITIALIZED bit and update atomically
  tty: mxser: Remove unused ASYNC_SHARE_IRQ flag
  tty: core: Undefine ASYNC_* flags superceded by TTY_PORT* flags

 drivers/char/pcmcia/synclink_cs.c   | 25 +---
 drivers/ipack/devices/ipoctal.c |  5 +--
 drivers/isdn/i4l/isdn_tty.c | 38 --
 drivers/s390/char/con3215.c | 22 +--
 drivers/tty/amiserial.c | 31 +++
 drivers/tty/cyclades.c  | 38 --
 drivers/tty/isicom.c| 19 -
 drivers/tty/moxa.c  | 10 ++---
 drivers/tty/mxser.c | 28 +-
 drivers/tty/n_gsm.c |  8 ++--
 drivers/tty/rocket.c| 13 ---
 drivers/tty/serial/crisv10.c| 25 ++--
 drivers/tty/serial/serial_core.c| 40 ++-
 drivers/tty/synclink.c  | 65 ++-
 drivers/tty/synclink_gt.c   | 35 +++--
 drivers/tty/synclinkmp.c| 35 +++--
 drivers/tty/tty_port.c  | 25 ++--
 drivers/usb/class/cdc-acm.c |  4 +-
 drivers/usb/serial/console.c|  4 +-
 drivers/usb/serial/generic.c|  6 +--
 drivers/usb/serial/mxuport.c|  6 +--
 drivers/usb/serial/sierra.c |  4 +-
 drivers/usb/serial/usb-serial.c |  2 +-
 drivers/usb/serial/usb_wwan.c   |  4 +-
 include/linux/tty.h | 77 -
 include/uapi/linux/tty_flags.h  | 13 ++-
 net/irda/ircomm/ircomm_tty.c| 29 +++---
 net/irda/ircomm/ircomm_tty_attach.c |  2 +-
 net/irda/ircomm/ircomm_tty_ioctl.c  | 10 ++---
 29 files changed, 320 insertions(+), 303 deletions(-)

-- 
2.8.1



[PATCH 6/8] tty: Replace ASYNC_INITIALIZED bit and update atomically

2016-04-09 Thread Peter Hurley
Replace ASYNC_INITIALIZED bit in the tty_port::flags field with
TTY_PORT_INITIALIZED bit in the tty_port::iflags field. Introduce helpers
tty_port_set_initialized() and tty_port_initialized() to abstract
atomic bit ops.

Note: the transforms for test_and_set_bit() and test_and_clear_bit()
are unnecessary as the state transitions are already mutually exclusive;
the tty lock prevents concurrent open/close/hangup.

Signed-off-by: Peter Hurley 
---
 drivers/char/pcmcia/synclink_cs.c  | 12 +-
 drivers/ipack/devices/ipoctal.c|  5 ++---
 drivers/isdn/i4l/isdn_tty.c| 10 -
 drivers/s390/char/con3215.c| 12 +-
 drivers/tty/amiserial.c| 14 ++--
 drivers/tty/cyclades.c | 14 ++--
 drivers/tty/isicom.c   |  6 ++---
 drivers/tty/moxa.c | 10 -
 drivers/tty/mxser.c| 14 +---
 drivers/tty/n_gsm.c|  8 +++
 drivers/tty/rocket.c   | 10 -
 drivers/tty/serial/crisv10.c   | 17 +++---
 drivers/tty/serial/serial_core.c   | 24 +++-
 drivers/tty/synclink.c | 46 ++
 drivers/tty/synclink_gt.c  | 16 ++---
 drivers/tty/synclinkmp.c   | 16 ++---
 drivers/tty/tty_port.c | 13 ++-
 drivers/usb/class/cdc-acm.c|  4 ++--
 drivers/usb/serial/console.c   |  4 ++--
 drivers/usb/serial/generic.c   |  6 ++---
 drivers/usb/serial/mxuport.c   |  6 ++---
 drivers/usb/serial/sierra.c|  4 ++--
 drivers/usb/serial/usb-serial.c|  2 +-
 drivers/usb/serial/usb_wwan.c  |  4 ++--
 include/linux/tty.h| 13 +++
 net/irda/ircomm/ircomm_tty.c   | 15 +++--
 net/irda/ircomm/ircomm_tty_ioctl.c |  2 +-
 27 files changed, 157 insertions(+), 150 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index bf54f4e..345ca7c 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -1272,7 +1272,7 @@ static int startup(MGSLPC_INFO * info, struct tty_struct 
*tty)
if (debug_level >= DEBUG_LEVEL_INFO)
printk("%s(%d):startup(%s)\n", __FILE__, __LINE__, 
info->device_name);
 
-   if (info->port.flags & ASYNC_INITIALIZED)
+   if (tty_port_initialized(&info->port))
return 0;
 
if (!info->tx_buf) {
@@ -1311,7 +1311,7 @@ static int startup(MGSLPC_INFO * info, struct tty_struct 
*tty)
if (tty)
clear_bit(TTY_IO_ERROR, &tty->flags);
 
-   info->port.flags |= ASYNC_INITIALIZED;
+   tty_port_set_initialized(&info->port, 1);
 
return 0;
 }
@@ -1322,7 +1322,7 @@ static void shutdown(MGSLPC_INFO * info, struct 
tty_struct *tty)
 {
unsigned long flags;
 
-   if (!(info->port.flags & ASYNC_INITIALIZED))
+   if (!tty_port_initialized(&info->port))
return;
 
if (debug_level >= DEBUG_LEVEL_INFO)
@@ -1361,7 +1361,7 @@ static void shutdown(MGSLPC_INFO * info, struct 
tty_struct *tty)
if (tty)
set_bit(TTY_IO_ERROR, &tty->flags);
 
-   info->port.flags &= ~ASYNC_INITIALIZED;
+   tty_port_set_initialized(&info->port, 0);
 }
 
 static void mgslpc_program_hw(MGSLPC_INFO *info, struct tty_struct *tty)
@@ -2338,7 +2338,7 @@ static void mgslpc_close(struct tty_struct *tty, struct 
file * filp)
if (tty_port_close_start(port, tty, filp) == 0)
goto cleanup;
 
-   if (port->flags & ASYNC_INITIALIZED)
+   if (tty_port_initialized(port))
mgslpc_wait_until_sent(tty, info->timeout);
 
mgslpc_flush_buffer(tty);
@@ -2371,7 +2371,7 @@ static void mgslpc_wait_until_sent(struct tty_struct 
*tty, int timeout)
if (mgslpc_paranoia_check(info, tty->name, "mgslpc_wait_until_sent"))
return;
 
-   if (!(info->port.flags & ASYNC_INITIALIZED))
+   if (!tty_port_initialized(&info->port))
goto exit;
 
orig_jiffies = jiffies;
diff --git a/drivers/ipack/devices/ipoctal.c b/drivers/ipack/devices/ipoctal.c
index 035d544..75dd15d 100644
--- a/drivers/ipack/devices/ipoctal.c
+++ b/drivers/ipack/devices/ipoctal.c
@@ -629,8 +629,7 @@ static void ipoctal_hangup(struct tty_struct *tty)
tty_port_hangup(&channel->tty_port);
 
ipoctal_reset_channel(channel);
-
-   clear_bit(ASYNCB_INITIALIZED, &channel->tty_port.flags);
+   tty_port_set_initialized(&channel->tty_port, 0);
wake_up_interruptible(&channel->tty_port.open_wait);
 }
 
@@ -642,7 +641,7 @@ static void ipoctal_shutdown(struct tty_struct *tty)
return;
 
ipoctal_reset_channel(channel);
-   clear_bit(ASYNCB_INITIALIZED, &channel->tty_port.flags

[PATCH 5/8] tty: Replace ASYNC_SUSPENDED bit and update atomically

2016-04-09 Thread Peter Hurley
Replace ASYNC_SUSPENDED bit in the tty_port::flags field with
TTY_PORT_SUSPENDED bit in the tty_port::iflags field. Introduce helpers
tty_port_set_suspended() and tty_port_suspended() to abstract
atomic bit ops.

Signed-off-by: Peter Hurley 
---
 drivers/s390/char/con3215.c  | 12 ++--
 drivers/tty/serial/serial_core.c |  8 
 include/linux/tty.h  | 13 +
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c
index e7e078b..114fe28 100644
--- a/drivers/s390/char/con3215.c
+++ b/drivers/s390/char/con3215.c
@@ -289,7 +289,7 @@ static void raw3215_timeout(unsigned long __data)
 
spin_lock_irqsave(get_ccwdev_lock(raw->cdev), flags);
raw->flags &= ~RAW3215_TIMER_RUNS;
-   if (!(raw->port.flags & ASYNC_SUSPENDED)) {
+   if (!tty_port_suspended(&raw->port)) {
raw3215_mk_write_req(raw);
raw3215_start_io(raw);
if ((raw->queued_read || raw->queued_write) &&
@@ -312,7 +312,7 @@ static void raw3215_timeout(unsigned long __data)
 static inline void raw3215_try_io(struct raw3215_info *raw)
 {
if (!(raw->port.flags & ASYNC_INITIALIZED) ||
-   (raw->port.flags & ASYNC_SUSPENDED))
+   tty_port_suspended(&raw->port))
return;
if (raw->queued_read != NULL)
raw3215_start_io(raw);
@@ -494,7 +494,7 @@ static void raw3215_make_room(struct raw3215_info *raw, 
unsigned int length)
/* While console is frozen for suspend we have no other
 * choice but to drop message from the buffer to make
 * room for even more messages. */
-   if (raw->port.flags & ASYNC_SUSPENDED) {
+   if (tty_port_suspended(&raw->port)) {
raw3215_drop_line(raw);
continue;
}
@@ -773,7 +773,7 @@ static int raw3215_pm_stop(struct ccw_device *cdev)
raw = dev_get_drvdata(&cdev->dev);
spin_lock_irqsave(get_ccwdev_lock(raw->cdev), flags);
raw3215_make_room(raw, RAW3215_BUFFER_SIZE);
-   raw->port.flags |= ASYNC_SUSPENDED;
+   tty_port_set_suspended(&raw->port, 1);
spin_unlock_irqrestore(get_ccwdev_lock(raw->cdev), flags);
return 0;
 }
@@ -786,7 +786,7 @@ static int raw3215_pm_start(struct ccw_device *cdev)
/* Allow I/O again and flush output buffer. */
raw = dev_get_drvdata(&cdev->dev);
spin_lock_irqsave(get_ccwdev_lock(raw->cdev), flags);
-   raw->port.flags &= ~ASYNC_SUSPENDED;
+   tty_port_set_suspended(&raw->port, 0);
raw->flags |= RAW3215_FLUSHING;
raw3215_try_io(raw);
raw->flags &= ~RAW3215_FLUSHING;
@@ -859,7 +859,7 @@ static void con3215_flush(void)
unsigned long flags;
 
raw = raw3215[0];  /* console 3215 is the first one */
-   if (raw->port.flags & ASYNC_SUSPENDED)
+   if (tty_port_suspended(&raw->port))
/* The console is still frozen for suspend. */
if (ccw_device_force_console(raw->cdev))
/* Forcing didn't work, no panic message .. */
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 2471380..9336067 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -249,7 +249,7 @@ static void uart_shutdown(struct tty_struct *tty, struct 
uart_state *state)
 * a DCD drop (hangup) at just the right time.  Clear suspended bit so
 * we don't try to resume a port that has been shutdown.
 */
-   clear_bit(ASYNCB_SUSPENDED, &port->flags);
+   tty_port_set_suspended(port, 0);
 
/*
 * Free the transmit buffer page.
@@ -2007,7 +2007,7 @@ int uart_suspend_port(struct uart_driver *drv, struct 
uart_port *uport)
const struct uart_ops *ops = uport->ops;
int tries;
 
-   set_bit(ASYNCB_SUSPENDED, &port->flags);
+   tty_port_set_suspended(port, 1);
clear_bit(ASYNCB_INITIALIZED, &port->flags);
 
spin_lock_irq(&uport->lock);
@@ -2088,7 +2088,7 @@ int uart_resume_port(struct uart_driver *drv, struct 
uart_port *uport)
console_start(uport->cons);
}
 
-   if (port->flags & ASYNC_SUSPENDED) {
+   if (tty_port_suspended(port)) {
const struct uart_ops *ops = uport->ops;
int ret;
 
@@ -2118,7 +2118,7 @@ int uart_resume_port(struct uart_driver *drv, struct 
uart_port *uport)
}
}
 
-   clear_bit(ASYNCB_SUSPENDED, &port->flags);
+   tty_port_set_suspended(port, 0);
}
 
 

[PATCH 2/8] tty: Replace ASYNC_CTS_FLOW bit and update atomically

2016-04-09 Thread Peter Hurley
Replace ASYNC_CTS_FLOW bit in the tty_port::flags field with
TTY_PORT_CTS_FLOW bit in the tty_port::iflags field. Add
tty_port_set_cts_flow() helper to abstract the atomic bit ops.

Signed-off-by: Peter Hurley 
---
 drivers/char/pcmcia/synclink_cs.c  |  5 +
 drivers/tty/amiserial.c|  6 ++
 drivers/tty/cyclades.c | 10 --
 drivers/tty/isicom.c   |  6 ++
 drivers/tty/mxser.c|  4 +---
 drivers/tty/synclink.c |  7 ++-
 drivers/tty/synclink_gt.c  |  5 +
 drivers/tty/synclinkmp.c   |  5 +
 include/linux/tty.h| 12 ++--
 net/irda/ircomm/ircomm_tty_ioctl.c |  3 +--
 10 files changed, 25 insertions(+), 38 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index bcae5bb..bdf41ac 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -1466,10 +1466,7 @@ static void mgslpc_change_params(MGSLPC_INFO *info, 
struct tty_struct *tty)
}
info->timeout += HZ/50; /* Add .02 seconds of slop */
 
-   if (cflag & CRTSCTS)
-   info->port.flags |= ASYNC_CTS_FLOW;
-   else
-   info->port.flags &= ~ASYNC_CTS_FLOW;
+   tty_port_set_cts_flow(&info->port, cflag & CRTSCTS);
 
if (cflag & CLOCAL)
info->port.flags &= ~ASYNC_CHECK_CD;
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index e68208e..92717b0 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -727,11 +727,9 @@ static void change_speed(struct tty_struct *tty, struct 
serial_state *info,
info->IER &= ~UART_IER_MSI;
if (port->flags & ASYNC_HARDPPS_CD)
info->IER |= UART_IER_MSI;
-   if (cflag & CRTSCTS) {
-   port->flags |= ASYNC_CTS_FLOW;
+   tty_port_set_cts_flow(port, cflag & CRTSCTS);
+   if (cflag & CRTSCTS)
info->IER |= UART_IER_MSI;
-   } else
-   port->flags &= ~ASYNC_CTS_FLOW;
if (cflag & CLOCAL)
port->flags &= ~ASYNC_CHECK_CD;
else {
diff --git a/drivers/tty/cyclades.c b/drivers/tty/cyclades.c
index d67e542..1a12776 100644
--- a/drivers/tty/cyclades.c
+++ b/drivers/tty/cyclades.c
@@ -2083,13 +2083,11 @@ static void cy_set_line_char(struct cyclades_port 
*info, struct tty_struct *tty)
info->cor1 |= CyPARITY_NONE;
 
/* CTS flow control flag */
-   if (cflag & CRTSCTS) {
-   info->port.flags |= ASYNC_CTS_FLOW;
+   tty_port_set_cts_flow(&info->port, cflag & CRTSCTS);
+   if (cflag & CRTSCTS)
info->cor2 |= CyCtsAE;
-   } else {
-   info->port.flags &= ~ASYNC_CTS_FLOW;
+   else
info->cor2 &= ~CyCtsAE;
-   }
if (cflag & CLOCAL)
info->port.flags &= ~ASYNC_CHECK_CD;
else
@@ -2234,7 +2232,7 @@ static void cy_set_line_char(struct cyclades_port *info, 
struct tty_struct *tty)
}
/* As the HW flow control is done in firmware, the driver
   doesn't need to care about it */
-   info->port.flags &= ~ASYNC_CTS_FLOW;
+   tty_port_set_cts_flow(&info->port, 0);
 
/* XON/XOFF/XANY flow control flags */
sw_flow = 0;
diff --git a/drivers/tty/isicom.c b/drivers/tty/isicom.c
index 8bf6763..c5f06b5 100644
--- a/drivers/tty/isicom.c
+++ b/drivers/tty/isicom.c
@@ -765,11 +765,9 @@ static void isicom_config_port(struct tty_struct *tty)
 
/* flow control settings ...*/
flow_ctrl = 0;
-   port->port.flags &= ~ASYNC_CTS_FLOW;
-   if (C_CRTSCTS(tty)) {
-   port->port.flags |= ASYNC_CTS_FLOW;
+   tty_port_set_cts_flow(&port->port, C_CRTSCTS(tty));
+   if (C_CRTSCTS(tty))
flow_ctrl |= ISICOM_CTSRTS;
-   }
if (I_IXON(tty))
flow_ctrl |= ISICOM_RESPOND_XONXOFF;
if (I_IXOFF(tty))
diff --git a/drivers/tty/mxser.c b/drivers/tty/mxser.c
index f23c2a1..8f3fdad 100644
--- a/drivers/tty/mxser.c
+++ b/drivers/tty/mxser.c
@@ -711,8 +711,8 @@ static int mxser_change_speed(struct tty_struct *tty,
/* CTS flow control flag and modem status interrupts */
info->IER &= ~UART_IER_MSI;
info->MCR &= ~UART_MCR_AFE;
+   tty_port_set_cts_flow(&info->port, cflag & CRTSCTS);
if (cflag & CRTSCTS) {
-   info->port.flags |= ASYNC_CTS_FLOW;
info->IER |= UART_IER_MSI;
if ((info->type == PORT_16550A) || (info->board->chip_flag)) {
   

[PATCH] tty: Replace TTY_THROTTLED bit tests with tty_throttled()

2016-04-09 Thread Peter Hurley
Abstract TTY_THROTTLED bit tests with tty_throttled().

Signed-off-by: Peter Hurley 
---
 drivers/char/pcmcia/synclink_cs.c  | 2 +-
 drivers/mmc/card/sdio_uart.c   | 2 +-
 drivers/net/usb/hso.c  | 2 +-
 drivers/staging/fwserial/fwserial.c| 2 +-
 drivers/staging/speakup/selection.c| 2 +-
 drivers/tty/amiserial.c| 2 +-
 drivers/tty/hvc/hvc_console.c  | 2 +-
 drivers/tty/hvc/hvcs.c | 2 +-
 drivers/tty/hvc/hvsi.c | 2 +-
 drivers/tty/moxa.c | 2 +-
 drivers/tty/nozomi.c   | 2 +-
 drivers/tty/serial/serial_core.c   | 2 +-
 drivers/tty/synclink.c | 2 +-
 drivers/tty/synclink_gt.c  | 2 +-
 drivers/tty/synclinkmp.c   | 2 +-
 drivers/tty/tty_ioctl.c| 4 ++--
 drivers/tty/vt/selection.c | 2 +-
 drivers/usb/gadget/function/u_serial.c | 4 ++--
 drivers/usb/serial/digi_acceleport.c   | 3 +--
 include/linux/tty.h| 5 +
 net/irda/ircomm/ircomm_tty_ioctl.c | 2 +-
 21 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index 825db42..bcae5bb 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -2316,7 +2316,7 @@ static void mgslpc_set_termios(struct tty_struct *tty, 
struct ktermios *old_term
/* Handle transition away from B0 status */
if (!(old_termios->c_cflag & CBAUD) && C_BAUD(tty)) {
info->serial_signals |= SerialSignal_DTR;
-   if (!C_CRTSCTS(tty) || !test_bit(TTY_THROTTLED, &tty->flags))
+   if (!C_CRTSCTS(tty) || !tty_throttled(tty))
info->serial_signals |= SerialSignal_RTS;
spin_lock_irqsave(&info->lock, flags);
set_signals(info);
diff --git a/drivers/mmc/card/sdio_uart.c b/drivers/mmc/card/sdio_uart.c
index 5415056..5af6fb9 100644
--- a/drivers/mmc/card/sdio_uart.c
+++ b/drivers/mmc/card/sdio_uart.c
@@ -895,7 +895,7 @@ static void sdio_uart_set_termios(struct tty_struct *tty,
/* Handle transition away from B0 status */
if (!(old_termios->c_cflag & CBAUD) && (cflag & CBAUD)) {
unsigned int mask = TIOCM_DTR;
-   if (!(cflag & CRTSCTS) || !test_bit(TTY_THROTTLED, &tty->flags))
+   if (!(cflag & CRTSCTS) || !tty_throttled(tty))
mask |= TIOCM_RTS;
sdio_uart_set_mctrl(port, mask);
}
diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
index 111d907..4b44586 100644
--- a/drivers/net/usb/hso.c
+++ b/drivers/net/usb/hso.c
@@ -2029,7 +2029,7 @@ static int put_rxbuf_data(struct urb *urb, struct 
hso_serial *serial)
 
tty = tty_port_tty_get(&serial->port);
 
-   if (tty && test_bit(TTY_THROTTLED, &tty->flags)) {
+   if (tty && tty_throttled(tty)) {
tty_kref_put(tty);
return -1;
}
diff --git a/drivers/staging/fwserial/fwserial.c 
b/drivers/staging/fwserial/fwserial.c
index 9b23b5c..1f9389d 100644
--- a/drivers/staging/fwserial/fwserial.c
+++ b/drivers/staging/fwserial/fwserial.c
@@ -1305,7 +1305,7 @@ static void fwtty_set_termios(struct tty_struct *tty, 
struct ktermios *old)
if ((baud == 0) && (old->c_cflag & CBAUD)) {
port->mctrl &= ~(TIOCM_DTR | TIOCM_RTS);
} else if ((baud != 0) && !(old->c_cflag & CBAUD)) {
-   if (C_CRTSCTS(tty) || !test_bit(TTY_THROTTLED, &tty->flags))
+   if (C_CRTSCTS(tty) || !tty_throttled(tty))
port->mctrl |= TIOCM_DTR | TIOCM_RTS;
else
port->mctrl |= TIOCM_DTR;
diff --git a/drivers/staging/speakup/selection.c 
b/drivers/staging/speakup/selection.c
index 41ef099..0149edc 100644
--- a/drivers/staging/speakup/selection.c
+++ b/drivers/staging/speakup/selection.c
@@ -150,7 +150,7 @@ static void __speakup_paste_selection(struct work_struct 
*work)
add_wait_queue(&vc->paste_wait, &wait);
while (sel_buffer && sel_buffer_lth > pasted) {
set_current_state(TASK_INTERRUPTIBLE);
-   if (test_bit(TTY_THROTTLED, &tty->flags)) {
+   if (tty_throttled(tty)) {
schedule();
continue;
}
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index 183e98e..e68208e 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -1342,7 +1342,7 @@ static void rs_set_termios(struct tty_struct *tty, struct 
ktermios *old_termios)
/* Handle transition away from B0 status */
if (!(old_termios->c_cflag & CBAUD) && (cflag & CBAUD)) {
info-&g

[PATCH] tty: Replace TTY_IO_ERROR bit tests with tty_io_error()

2016-04-09 Thread Peter Hurley
Abstract TTY_IO_ERROR status test treewide with tty_io_error().
NB: tty->flags uses atomic bit ops; replace non-atomic bit test
with test_bit().

Signed-off-by: Peter Hurley 
---

v3: redo of an earlier patch titled "tty: Use test_bit() with tty->flags"
v2: rebase

 arch/ia64/hp/sim/simserial.c   | 2 +-
 drivers/char/pcmcia/synclink_cs.c  | 2 +-
 drivers/isdn/i4l/isdn_tty.c| 6 +++---
 drivers/s390/char/tty3270.c| 4 ++--
 drivers/staging/dgnc/dgnc_tty.c| 2 +-
 drivers/tty/amiserial.c| 6 +++---
 drivers/tty/mxser.c| 7 +++
 drivers/tty/pty.c  | 2 +-
 drivers/tty/serial/crisv10.c   | 5 ++---
 drivers/tty/serial/serial_core.c   | 8 
 drivers/tty/synclink.c | 4 ++--
 drivers/tty/synclink_gt.c  | 4 ++--
 drivers/tty/synclinkmp.c   | 4 ++--
 drivers/tty/tty_io.c   | 5 ++---
 drivers/tty/tty_port.c | 2 +-
 include/linux/tty.h| 5 +
 net/irda/ircomm/ircomm_tty.c   | 2 +-
 net/irda/ircomm/ircomm_tty_ioctl.c | 6 +++---
 18 files changed, 39 insertions(+), 37 deletions(-)

diff --git a/arch/ia64/hp/sim/simserial.c b/arch/ia64/hp/sim/simserial.c
index e70cade..21fd50d 100644
--- a/arch/ia64/hp/sim/simserial.c
+++ b/arch/ia64/hp/sim/simserial.c
@@ -300,7 +300,7 @@ static int rs_ioctl(struct tty_struct *tty, unsigned int 
cmd, unsigned long arg)
if ((cmd != TIOCGSERIAL) && (cmd != TIOCSSERIAL) &&
(cmd != TIOCSERCONFIG) && (cmd != TIOCSERGSTRUCT) &&
(cmd != TIOCMIWAIT)) {
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
}
 
diff --git a/drivers/char/pcmcia/synclink_cs.c 
b/drivers/char/pcmcia/synclink_cs.c
index 22c2765..825db42 100644
--- a/drivers/char/pcmcia/synclink_cs.c
+++ b/drivers/char/pcmcia/synclink_cs.c
@@ -2246,7 +2246,7 @@ static int mgslpc_ioctl(struct tty_struct *tty,
 
if ((cmd != TIOCGSERIAL) && (cmd != TIOCSSERIAL) &&
(cmd != TIOCMIWAIT)) {
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
}
 
diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index 947d5c9..f1edc08 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1351,7 +1351,7 @@ isdn_tty_tiocmget(struct tty_struct *tty)
 
if (isdn_tty_paranoia_check(info, tty->name, __func__))
return -ENODEV;
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
 
mutex_lock(&modem_info_mutex);
@@ -1378,7 +1378,7 @@ isdn_tty_tiocmset(struct tty_struct *tty,
 
if (isdn_tty_paranoia_check(info, tty->name, __func__))
return -ENODEV;
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
 
 #ifdef ISDN_DEBUG_MODEM_IOCTL
@@ -1419,7 +1419,7 @@ isdn_tty_ioctl(struct tty_struct *tty, uint cmd, ulong 
arg)
 
if (isdn_tty_paranoia_check(info, tty->name, "isdn_tty_ioctl"))
return -ENODEV;
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
switch (cmd) {
case TCSBRK:   /* SVID version: non-zero arg --> no break */
diff --git a/drivers/s390/char/tty3270.c b/drivers/s390/char/tty3270.c
index e96fc7f..080a987 100644
--- a/drivers/s390/char/tty3270.c
+++ b/drivers/s390/char/tty3270.c
@@ -1804,7 +1804,7 @@ static int tty3270_ioctl(struct tty_struct *tty, unsigned 
int cmd,
tp = tty->driver_data;
if (!tp)
return -ENODEV;
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
return kbd_ioctl(tp->kbd, cmd, arg);
 }
@@ -1818,7 +1818,7 @@ static long tty3270_compat_ioctl(struct tty_struct *tty,
tp = tty->driver_data;
if (!tp)
return -ENODEV;
-   if (tty->flags & (1 << TTY_IO_ERROR))
+   if (tty_io_error(tty))
return -EIO;
return kbd_ioctl(tp->kbd, cmd, (unsigned long)compat_ptr(arg));
 }
diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c
index bcd2bdf..5c22159 100644
--- a/drivers/staging/dgnc/dgnc_tty.c
+++ b/drivers/staging/dgnc/dgnc_tty.c
@@ -1255,7 +1255,7 @@ static int dgnc_block_til_ready(struct tty_struct *tty,
if (file->f_flags & O_NONBLOCK)
break;
 
-   if (tty->flags & (1 << TTY_IO_ERROR)) {
+   if (tty_io_error(tty)) {
retval = -EIO;
break;
 

Re: [PATCH v1 06/12] serial: 8250_dma: stop ongoing RX DMA on exception

2016-04-08 Thread Peter Hurley
On 04/08/2016 01:07 AM, Andy Shevchenko wrote:
> On Fri, Apr 8, 2016 at 2:54 AM, Peter Hurley  wrote:
>> On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
>>> If we get an exeption interrupt. i.e. UART_IIR_RLSI, stop any ongoing RX DMA
>>> transfer otherwise it might generates more spurious interrupts and make port
>>> unavailable anymore.
>>
>> Then how to know which rx byte the error is for if dma continues anyway?
>> What if there are multiple error bytes?
> 
> And how should it work?
> We get an interrupt during DMA, if we don't stop DMA it will be racy
> with direct readings.

It makes sense to me that the ongoing DMA needs paused, flushed & terminated,
but the UART should have already aborted the DMA at the first error byte,
so it doesn't make sense to me that the DMA hardware went sideways.

Have you verified that the actual byte in error is reported as the frame/parity
byte and that error-free data is unmangled? Like with a data pattern and a logic
analyzer?


>>
>>
>>> As has been seen on Intel Broxton system:
>>
>> This system shouldn't be setup for UART DMA imo.
> 
> Same approach is done in 8250_omap.

Well, omap8250 has totally different (and possibly unnecessary) rx dma flow.

During the development of the omap8250 driver, it was discovered that the
normal 8250 rx dma flow didn't work reliably on OMAP; ie., the rx dma wouldn't
start once rx uart interrupt had already happened.

*So omap8250 sets up rx dma before any data has been received*
That's the dma that is cancelled when an RLSI interrupt is received;
on OMAP the residue is always 0.

Well, it turns out that the omap8250 rx dma flow *may* be limited to only
1 specific design, the am335x, which has a bunch of other dma issues, with
both tx and rx dma. So all that omap8250 dma handling might be going
away anyway.

IOW, omap8250 is a terrible dma model; do not use.
[Granted the current model needs some work as well; eg., using ping-pong
dma buffers to weather dmaengine descriptor completion latency).

Regards,
Peter Hurley


Re: [PATCH v1 09/12] serial: 8250_lpss: split LPSS driver to separate module

2016-04-08 Thread Peter Hurley
On 04/08/2016 01:17 AM, Andy Shevchenko wrote:
> On Fri, Apr 8, 2016 at 4:42 AM, Peter Hurley  wrote:
>> On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
>>> Intes SoCs, such as Braswell, have DesignWare UART. Split out to separate
>>> module which also will be used for Intel Quark later.
>>
>> What's the rationale?
> 
> 1. Not poison 8250_pci with too many quirks.
> 2. They all use same DMA engine, otherwise we might end up in all
> possible DMA engines included in one file.
> 3. All of them are actually DesignWare, so, in the future we might
> share code between 8250_dw and 8250_lpss.

Just my opinion, but I like to see the rationale in the changelog.


>> And this really isn't a split; this patch introduces a number of significant
>> changes from the pci version.
> 
> Some style changes, yes, but "significant"?
> For example?

I'm just pointing out the changelog doesn't really match the
commit. I'm not suggesting necessarily to redo the series, but just more
adequately reflect the change. See below.


>>
>>
>>> Signed-off-by: Andy Shevchenko 
>>> ---
>>>  drivers/tty/serial/8250/8250_lpss.c | 279 
>>> 
>>>  drivers/tty/serial/8250/8250_pci.c  | 227 ++---
>>>  drivers/tty/serial/8250/Kconfig |  14 +-
>>>  drivers/tty/serial/8250/Makefile|   1 +
>>>  4 files changed, 301 insertions(+), 220 deletions(-)
>>>  create mode 100644 drivers/tty/serial/8250/8250_lpss.c
>>>
>>> diff --git a/drivers/tty/serial/8250/8250_lpss.c 
>>> b/drivers/tty/serial/8250/8250_lpss.c
>>> new file mode 100644
>>> index 000..bca4adb
>>> --- /dev/null
>>> +++ b/drivers/tty/serial/8250/8250_lpss.c
>>> @@ -0,0 +1,279 @@
>>> +/*
>>> + * 8250_lpss.c - Driver for UART on Intel Braswell and various other Intel 
>>> SoCs
>>> + *
>>> + * Copyright (C) 2016 Intel Corporation
>>> + * Author: Andy Shevchenko 
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include 
>>> +#include 
>>> +
>>> +#include "8250.h"
>>> +
>>> +#define PCI_DEVICE_ID_INTEL_BYT_UART10x0f0a
>>> +#define PCI_DEVICE_ID_INTEL_BYT_UART20x0f0c
>>> +
>>> +#define PCI_DEVICE_ID_INTEL_BSW_UART10x228a
>>> +#define PCI_DEVICE_ID_INTEL_BSW_UART20x228c
>>> +
>>> +#define PCI_DEVICE_ID_INTEL_BDW_UART10x9ce3
>>> +#define PCI_DEVICE_ID_INTEL_BDW_UART20x9ce4
>>> +
>>> +/* Intel LPSS specific registers */
>>> +
>>> +#define BYT_PRV_CLK  0x800
>>> +#define BYT_PRV_CLK_EN   BIT(0)
>>> +#define BYT_PRV_CLK_M_VAL_SHIFT  1
>>> +#define BYT_PRV_CLK_N_VAL_SHIFT  16
>>> +#define BYT_PRV_CLK_UPDATE   BIT(31)
>>> +
>>> +#define BYT_TX_OVF_INT   0x820
>>> +#define BYT_TX_OVF_INT_MASK  BIT(1)
>>> +
>>> +struct lpss8250;
>>> +
>>> +struct lpss8250_board {
>>> + unsigned long freq;
>>> + unsigned int base_baud;
>>> + int (*setup)(struct lpss8250 *, struct uart_port *p);
>>> +};
>>
>> New concept.
>>
>>> +
>>> +struct lpss8250 {
>>> + int line;
>>> + struct lpss8250_board *board;
>>> +
>>> + /* DMA parameters */
>>> + struct uart_8250_dma dma;
>>> + struct dw_dma_slave dma_param;
>>> + u8 dma_maxburst;
>>> +};
>>> +
>>> +/*/
>>
>> Please remove.
>>
>>> +
>>> +static void lpss8250_set_termios(struct uart_port *p,
>>> +  struct ktermios *termios,
>>> +  struct ktermios *old)
>>> +{
>>> + unsigned int baud = tty_termios_baud_rate(termios);
>>> + struct lpss8250 *lpss = p->private_data;
>>> + unsigned long fref = lpss->board->freq, fuart = baud * 16;
>>>

Re: [PATCH v1 09/12] serial: 8250_lpss: split LPSS driver to separate module

2016-04-07 Thread Peter Hurley
On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
> Intes SoCs, such as Braswell, have DesignWare UART. Split out to separate
> module which also will be used for Intel Quark later.

What's the rationale?

And this really isn't a split; this patch introduces a number of significant
changes from the pci version.


> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/tty/serial/8250/8250_lpss.c | 279 
> 
>  drivers/tty/serial/8250/8250_pci.c  | 227 ++---
>  drivers/tty/serial/8250/Kconfig |  14 +-
>  drivers/tty/serial/8250/Makefile|   1 +
>  4 files changed, 301 insertions(+), 220 deletions(-)
>  create mode 100644 drivers/tty/serial/8250/8250_lpss.c
> 
> diff --git a/drivers/tty/serial/8250/8250_lpss.c 
> b/drivers/tty/serial/8250/8250_lpss.c
> new file mode 100644
> index 000..bca4adb
> --- /dev/null
> +++ b/drivers/tty/serial/8250/8250_lpss.c
> @@ -0,0 +1,279 @@
> +/*
> + * 8250_lpss.c - Driver for UART on Intel Braswell and various other Intel 
> SoCs
> + *
> + * Copyright (C) 2016 Intel Corporation
> + * Author: Andy Shevchenko 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include "8250.h"
> +
> +#define PCI_DEVICE_ID_INTEL_BYT_UART10x0f0a
> +#define PCI_DEVICE_ID_INTEL_BYT_UART20x0f0c
> +
> +#define PCI_DEVICE_ID_INTEL_BSW_UART10x228a
> +#define PCI_DEVICE_ID_INTEL_BSW_UART20x228c
> +
> +#define PCI_DEVICE_ID_INTEL_BDW_UART10x9ce3
> +#define PCI_DEVICE_ID_INTEL_BDW_UART20x9ce4
> +
> +/* Intel LPSS specific registers */
> +
> +#define BYT_PRV_CLK  0x800
> +#define BYT_PRV_CLK_EN   BIT(0)
> +#define BYT_PRV_CLK_M_VAL_SHIFT  1
> +#define BYT_PRV_CLK_N_VAL_SHIFT  16
> +#define BYT_PRV_CLK_UPDATE   BIT(31)
> +
> +#define BYT_TX_OVF_INT   0x820
> +#define BYT_TX_OVF_INT_MASK  BIT(1)
> +
> +struct lpss8250;
> +
> +struct lpss8250_board {
> + unsigned long freq;
> + unsigned int base_baud;
> + int (*setup)(struct lpss8250 *, struct uart_port *p);
> +};

New concept.

> +
> +struct lpss8250 {
> + int line;
> + struct lpss8250_board *board;
> +
> + /* DMA parameters */
> + struct uart_8250_dma dma;
> + struct dw_dma_slave dma_param;
> + u8 dma_maxburst;
> +};
> +
> +/*/

Please remove.

> +
> +static void lpss8250_set_termios(struct uart_port *p,
> +  struct ktermios *termios,
> +  struct ktermios *old)
> +{
> + unsigned int baud = tty_termios_baud_rate(termios);
> + struct lpss8250 *lpss = p->private_data;
> + unsigned long fref = lpss->board->freq, fuart = baud * 16;
> + unsigned long w = BIT(15) - 1;
> + unsigned long m, n;
> + u32 reg;
> +
> + /* Get Fuart closer to Fref */
> + fuart *= rounddown_pow_of_two(fref / fuart);
> +
> + /*
> +  * For baud rates 0.5M, 1M, 1.5M, 2M, 2.5M, 3M, 3.5M and 4M the
> +  * dividers must be adjusted.
> +  *
> +  * uartclk = (m / n) * 100 MHz, where m <= n
> +  */
> + rational_best_approximation(fuart, fref, w, w, &m, &n);
> + p->uartclk = fuart;
> +
> + /* Reset the clock */
> + reg = (m << BYT_PRV_CLK_M_VAL_SHIFT) | (n << BYT_PRV_CLK_N_VAL_SHIFT);
> + writel(reg, p->membase + BYT_PRV_CLK);
> + reg |= BYT_PRV_CLK_EN | BYT_PRV_CLK_UPDATE;
> + writel(reg, p->membase + BYT_PRV_CLK);
> +
> + p->status &= ~UPSTAT_AUTOCTS;
> + if (termios->c_cflag & CRTSCTS)
> + p->status |= UPSTAT_AUTOCTS;
> +
> + serial8250_do_set_termios(p, termios, old);
> +}
> +
> +/*/

Please remove.

> +
> +static int byt_serial_setup(struct lpss8250 *lpss, struct uart_port *port)

This would have been much easier to review if you had simply moved it here
and done the rework in a follow-on patch.


> +{
> + struct dw_dma_slave *param = &lpss->dma_param;
> + struct uart_8250_port *up = up_to_u8250p(port);
> + struct pci_dev *pdev = to_pci_dev(port->dev);
> + struct pci_dev *dma_dev = pci_get_slot(pdev->bus, 
> PCI_DEVFN(PCI_SLOT(pdev->devfn), 0));
> +
> + switch (pdev->device) {
> + case PCI_DEVICE_ID_INTEL_BYT_UART1:
> + case PCI_DEVICE_ID_INTEL_BSW_UART1:
> + case PCI_DEVICE_ID_INTEL_BDW_UART1:
> + param->src_id = 3;
> + param->dst_id = 2;
> + break;
> + case PCI_DEVICE_ID_INTEL_BYT_UART2:
> + case PCI_DEVICE_ID_INTEL_BSW_UART2:
> + case PCI_DEVICE_ID_INTEL_BDW_UART2:
> + param->src_id = 5;
> + para

Re: [PATCH v1 07/12] serial: 8250_dma: adjust DMA address of the UART

2016-04-07 Thread Peter Hurley
On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
> Some UARTs, e.g. one is used in Intel Quark, have a different address base for
> DMA operations. Introduce an additional field in struct uart_8250_dma to cover
> those cases.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/tty/serial/8250/8250.h | 2 ++
>  drivers/tty/serial/8250/8250_dma.c | 5 +++--
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
> index 047a7ba..040deb2 100644
> --- a/drivers/tty/serial/8250/8250.h
> +++ b/drivers/tty/serial/8250/8250.h
> @@ -31,6 +31,8 @@ struct uart_8250_dma {
>   struct dma_chan *rxchan;
>   struct dma_chan *txchan;
>  
> + phys_addr_t dma_addr;
> +

Let's add separate rx and tx device addrs.


>   dma_addr_t  rx_addr;
>   dma_addr_t  tx_addr;
>  
> diff --git a/drivers/tty/serial/8250/8250_dma.c 
> b/drivers/tty/serial/8250/8250_dma.c
> index b134bec..a314492 100644
> --- a/drivers/tty/serial/8250/8250_dma.c
> +++ b/drivers/tty/serial/8250/8250_dma.c
> @@ -164,16 +164,17 @@ int serial8250_rx_dma(struct uart_8250_port *p, 
> unsigned int iir)
>  int serial8250_request_dma(struct uart_8250_port *p)
>  {
>   struct uart_8250_dma*dma = p->dma;
> + phys_addr_t dma_addr = dma->dma_addr ? dma->dma_addr : p->port.mapbase;
>   dma_cap_mask_t  mask;
>  
>   /* Default slave configuration parameters */
>   dma->rxconf.direction   = DMA_DEV_TO_MEM;
>   dma->rxconf.src_addr_width  = DMA_SLAVE_BUSWIDTH_1_BYTE;
> - dma->rxconf.src_addr= p->port.mapbase + UART_RX;
> + dma->rxconf.src_addr= dma_addr + UART_RX;
>  
>   dma->txconf.direction   = DMA_MEM_TO_DEV;
>   dma->txconf.dst_addr_width  = DMA_SLAVE_BUSWIDTH_1_BYTE;
> - dma->txconf.dst_addr= p->port.mapbase + UART_TX;
> + dma->txconf.dst_addr= dma_addr + UART_TX;
>  
>   dma_cap_zero(mask);
>   dma_cap_set(DMA_SLAVE, mask);
> 



Re: [PATCH v1 05/12] serial: 8250_dma: switch to new dmaengine_terminate_* API

2016-04-07 Thread Peter Hurley
On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
> Convert dmaengine_terminate_all() calls to synchronous and asynchronous
> versions where appropriate.

Reviewed-by: Peter Hurley 



Re: [PATCH v1 06/12] serial: 8250_dma: stop ongoing RX DMA on exception

2016-04-07 Thread Peter Hurley
On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
> If we get an exeption interrupt. i.e. UART_IIR_RLSI, stop any ongoing RX DMA
> transfer otherwise it might generates more spurious interrupts and make port
> unavailable anymore.

Then how to know which rx byte the error is for if dma continues anyway?
What if there are multiple error bytes?


> As has been seen on Intel Broxton system:

This system shouldn't be setup for UART DMA imo.


> ...
> [  168.526281] serial8250: too much work for irq5
> [  168.535908] serial8250: too much work for irq5
> [  173.449464] serial8250_interrupt: 4439 callbacks suppressed
> [  173.455694] serial8250: too much work for irq5
> ...
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/tty/serial/8250/8250_dma.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_dma.c 
> b/drivers/tty/serial/8250/8250_dma.c
> index 9d80bb1..b134bec 100644
> --- a/drivers/tty/serial/8250/8250_dma.c
> +++ b/drivers/tty/serial/8250/8250_dma.c
> @@ -110,6 +110,16 @@ err:
>   return ret;
>  }
>  
> +static void __dma_rx_stop(struct uart_8250_port *p, struct uart_8250_dma 
> *dma)
> +{
> + if (!dma->rx_running)
> + return;
> +
> + dmaengine_pause(dma->rxchan);
> + __dma_rx_complete(p);
> + dmaengine_terminate_async(dma->rxchan);
> +}
> +
>  int serial8250_rx_dma(struct uart_8250_port *p, unsigned int iir)
>  {
>   struct uart_8250_dma*dma = p->dma;
> @@ -118,17 +128,14 @@ int serial8250_rx_dma(struct uart_8250_port *p, 
> unsigned int iir)
>   switch (iir & 0x3f) {
>   case UART_IIR_RLSI:
>   /* 8250_core handles errors and break interrupts */
> + __dma_rx_stop(p, dma);
>   return -EIO;
>   case UART_IIR_RX_TIMEOUT:
>   /*
>* If RCVR FIFO trigger level was not reached, complete the
>* transfer and let 8250_core copy the remaining data.
>*/
> - if (dma->rx_running) {
> - dmaengine_pause(dma->rxchan);
> - __dma_rx_complete(p);
> - dmaengine_terminate_async(dma->rxchan);
> - }
> + __dma_rx_stop(p, dma);
>   return -ETIMEDOUT;
>   default:
>   break;
> 



Re: [PATCH v1 08/12] serial: 8250: enable AFE on ports where FIFO is 16 bytes

2016-04-07 Thread Peter Hurley
On 04/07/2016 01:37 PM, Andy Shevchenko wrote:
> Intel Quark has 16550A compatible UART with autoflow feature enabled. It has
> only 16 bytes of FIFO. Currently serial8250_do_set_termios() prevents to 
> enable
> autoflow since the minimum requirement of 32 bytes of FIFO size.
> 
> Decrease a FIFO size limitation to 16 bytes to allow autoflow control be
> enabled on such UARTs.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/tty/serial/8250/8250_port.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_port.c 
> b/drivers/tty/serial/8250/8250_port.c
> index e213da0..3f8121e 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2522,9 +2522,9 @@ serial8250_do_set_termios(struct uart_port *port, 
> struct ktermios *termios,
>* the trigger, or the MCR RTS bit is cleared.  In the case where
>* the remote UART is not using CTS auto flow control, we must
>* have sufficient FIFO entries for the latency of the remote
> -  * UART to respond.  IOW, at least 32 bytes of FIFO.
> +  * UART to respond.  IOW, at least 16 bytes of FIFO.
>*/
> - if (up->capabilities & UART_CAP_AFE && port->fifosize >= 32) {
> + if (up->capabilities & UART_CAP_AFE && port->fifosize >= 16) {

Let's just remove the fifosize test and rely on UART_CAP_AFE to enable
AFE.  Please remove comment from "In the case where ..."

Also, I think the PORT_A7 port type should have UART_CAP_AFE commented out,
especially since/if the trigger level is 1 byte.

Regards,
Peter Hurley

>   up->mcr &= ~UART_MCR_AFE;
>   if (termios->c_cflag & CRTSCTS)
>   up->mcr |= UART_MCR_AFE;
> 



Re: [PATCH v7 0/5] ACPI: parse the SPCR table

2016-04-06 Thread Peter Hurley
On 04/06/2016 07:18 AM, Mark Salter wrote:
> On Wed, 2016-04-06 at 11:52 +0100, Graeme Gregory wrote:
>> On Wed, Apr 06, 2016 at 01:24:12PM +0300, Aleksey Makarov wrote:
>>>
>>>
>>>
>>> On 04/05/2016 07:27 PM, Mark Salter wrote:
>>>>
>>>> Could you CC me on future postings of this series, please?
>>>>
>>>> So v3 fixed a problem where platforms using 8250 didn't work.
>>>> Then in v5, that fix was lost so 8250 no longer works.
>>> That fix was dropped because now this is implemented differently.  
>>> Now the code uses the existing match() member of struct console.
>>> It has already been implemented for 8250.  
>>> Probably SPCR table specifies incorrect data.
>>> It was reported that D02 board has similar issue.
>>>
>> The D02 issue referred to is the 8250 implementation they used has 32bit
>> only MMIO access which there is no manner to represent in SPCR. It also
>> doesn't use the "standard" 16550 clock rate so supplying a baud rate
>> makes it change to a random baud rate.
> 
> So, the problem with Mustang and m400 was the mmio vs mmio32.

Bummer. Maybe that firmware can use a different DBG2 port designation to
indicate it's a DesignWare 8250 (or generically 32-bit mmio).


> Right now,
> the kernel only supports mmio32 in the 8250_dw.c driver when probed with
> ACPI.

While that's true specifically for DesignWare 8250 + ACPI, that's not
true for _all_ 8250 + ACPI.


> Other x86 machines just use legacy 8250 ports which don't need
> entries in the DSDT and use legacy io.

I find that statement hard to believe. I know for a fact ia64 servers didn't,
so the assertion that SPCR + 16550 + MMIO automatically means 32-bit data width
seems unlikely.

In fact, I suspect the opposite is true; that before these arm64 designs,
SPCR + 16550 + MMIO automatically meant 8-bit data width, which is why it's
unspecified in the SPCR table.

Regards,
Peter Hurley


> The 8250_dw driver does support
> non-standard clocks specified in the DSDT. That works fine for Mustang
> and m400.
> 
> So with this quick hack, v7 is working with Mustang/m400:
> 
> diff --git a/drivers/acpi/spcr.c b/drivers/acpi/spcr.c
> index 68ffc33..dade9de 100644
> --- a/drivers/acpi/spcr.c
> +++ b/drivers/acpi/spcr.c
> @@ -37,6 +37,7 @@ int __init parse_spcr(bool earlycon)
>   acpi_status status;
>   char *uart;
>   char *iotype;
> + int mmio;
>   int baud_rate;
>   int err;
>  
> @@ -56,8 +57,8 @@ int __init parse_spcr(bool earlycon)
>   goto done;
>   }
>  
> - iotype = table->serial_port.space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY ?
> - "mmio" : "io";
> + mmio = table->serial_port.space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY;
> + iotype = mmio ? "mmio" : "io";
>  
>   switch (table->interface_type) {
>   case ACPI_DBG2_ARM_SBSA_32BIT:
> @@ -71,6 +72,8 @@ int __init parse_spcr(bool earlycon)
>   case ACPI_DBG2_16550_COMPATIBLE:
>   case ACPI_DBG2_16550_SUBSET:
>   uart = "uart";
> + if (mmio)
> + iotype = "mmio32";
>   break;
>   default:
>   err = -ENOENT;
> 
>>
>> We may need two 8250 subset entries, one just means no clock given, and
>> one no clock given + mmio32 only.
>>
>> Graeme
>>
>>
>>>
>>> Thank you
>>> Aleksey Makarov
>>>
>>>>
>>>>
>>>> --Mark
>>>>
>>>>
>>>> On Thu, 2016-03-31 at 16:40 +0300, Aleksey Makarov wrote:
>>>>>
>>>>> 'ARM Server Base Boot Requirements' [1] mentions SPCR (Serial Port
>>>>> Console Redirection Table) [2] as a mandatory ACPI table that
>>>>> specifies the configuration of serial console.
>>>>>
>>>>> Move "earlycon" early_param handling to earlycon.c to parse this option 
>>>>> once
>>>>>
>>>>> *** PATCH  "ACPICA: Headers: Add new constants for the DBG2 ACPI table"
>>>>> *** IS JUST TO ENABLE BUILDING THE PATCHSET ON linux-next
>>>>>
>>>>> Patch "ACPICA: Headers: Add new constants for the DBG2 ACPI table" is 
>>>>> required
>>>>> for the next patch.  It is taken from ACPICA series [3], but it has not 
>>>>> appeared
>>>>> in linux-next yet.
>>>>>
>>>>> Parse SPCR table, set

Re: [bisect] Merge tag 'mmc-v4.6' of git://git.linaro.org/people/ulf.hansson/mmc (was [GIT PULL] MMC for v.4.6)

2016-04-05 Thread Peter Hurley
On 04/05/2016 01:59 AM, Ulf Hansson wrote:
> Although, what puzzles me around this particular issue, is how an SoC
> configuration can rely on this fragile behaviour.
> All you have to do to break the assumption of fixed mmcblk ids, is to
> boot with an SD card inserted and then without. Perhaps these SoCs
> just doesn't support this use case!?

Both configurations boot reliably; without the uSD inserted, the
boot and root partitions on the eMMC are booted instead.

Without a uSD inserted, the only mmc block device is the eMMC which is
/dev/mmcblk0, and the root partition is still /dev/mmcblk0p2.

Note though that this particular bootscript can load add'l bootscripts
from the boot partition; in this particular case, the eMMC root
partition is set as a fixed UUID in the bootscript from the
boot partition.

Regards,
Peter Hurley


Re: [PATCH] tty: serial: 8250_omap: do not defer termios changes

2016-04-04 Thread Peter Hurley
On 03/31/2016 01:41 AM, John Ogness wrote:
> From: Sebastian Andrzej Siewior 
> 
> It has been observed that the TX-DMA can stall

Does this happen on any other OMAP part besides am335x?
I looked back over the LKML history of this and didn't see any other
design implicated in this problem.

> if termios changes
> occur while a TX-DMA operation is in progress. Previously this was
> handled by allowing set_termios to return immediately and deferring
> the change until the operation is completed. Now set_termios will
> block until the operation is completed, proceed with the changes,
> then schedule any pending next operation.

I ask because I wonder if this is related to the tx dma trigger
problem (worked around by writing a byte to the tx fifo on am335x)?

If so, then
1) It should use the OMAP_DMA_TX_KICK bug flag
2) It could pause dma, complete set_termios(), then resume dma which would
keep everything nice and linear.

Or even just drop DMA for the am335x and only use the normal 8250_dma
support for newer OMAP designs.

Regards,
Peter Hurley


> Commit message and forward port by John Ogness.
> 
> Tested-by: John Ogness 
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
>  Patch against next-20160331.
> 
>  drivers/tty/serial/8250/8250_omap.c |   57 --
>  1 file changed, 37 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_omap.c 
> b/drivers/tty/serial/8250/8250_omap.c
> index 6f76051..9459b4d 100644
> --- a/drivers/tty/serial/8250/8250_omap.c
> +++ b/drivers/tty/serial/8250/8250_omap.c
> @@ -100,9 +100,9 @@ struct omap8250_priv {
>   u8 wer;
>   u8 xon;
>   u8 xoff;
> - u8 delayed_restore;
>   u16 quot;
>  
> + wait_queue_head_t termios_wait;
>   bool is_suspending;
>   int wakeirq;
>   int wakeups_enabled;
> @@ -257,18 +257,6 @@ static void omap8250_update_mdr1(struct uart_8250_port 
> *up,
>  static void omap8250_restore_regs(struct uart_8250_port *up)
>  {
>   struct omap8250_priv *priv = up->port.private_data;
> - struct uart_8250_dma*dma = up->dma;
> -
> - if (dma && dma->tx_running) {
> - /*
> -  * TCSANOW requests the change to occur immediately however if
> -  * we have a TX-DMA operation in progress then it has been
> -  * observed that it might stall and never complete. Therefore we
> -  * delay DMA completes to prevent this hang from happen.
> -  */
> - priv->delayed_restore = 1;
> - return;
> - }
>  
>   serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
>   serial_out(up, UART_EFR, UART_EFR_ECB);
> @@ -310,6 +298,7 @@ static void omap8250_restore_regs(struct uart_8250_port 
> *up)
>   up->port.ops->set_mctrl(&up->port, up->port.mctrl);
>  }
>  
> +static void omap_8250_dma_tx_complete(void *param);
>  /*
>   * OMAP can use "CLK / (16 or 13) / div" for baud rate. And then we have have
>   * some differences in how we want to handle flow control.
> @@ -320,6 +309,8 @@ static void omap_8250_set_termios(struct uart_port *port,
>  {
>   struct uart_8250_port *up = up_to_u8250p(port);
>   struct omap8250_priv *priv = up->port.private_data;
> + struct uart_8250_dma *dma = up->dma;
> + unsigned int complete_dma = 0;
>   unsigned char cval = 0;
>   unsigned int baud;
>  
> @@ -430,7 +421,7 @@ static void omap_8250_set_termios(struct uart_port *port,
>   priv->scr = OMAP_UART_SCR_RX_TRIG_GRANU1_MASK | OMAP_UART_SCR_TX_EMPTY |
>   OMAP_UART_SCR_TX_TRIG_GRANU1_MASK;
>  
> - if (up->dma)
> + if (dma)
>   priv->scr |= OMAP_UART_SCR_DMAMODE_1 |
>   OMAP_UART_SCR_DMAMODE_CTL;
>  
> @@ -460,6 +451,24 @@ static void omap_8250_set_termios(struct uart_port *port,
>   priv->efr |= OMAP_UART_SW_TX;
>   }
>   }
> +
> + if (dma && dma->tx_running) {
> + /*
> +  * TCSANOW requests the change to occur immediately, however
> +  * if we have a TX-DMA operation in progress then it has been
> +  * observed that it might stall and never complete. Therefore
> +  * we wait until DMA completes to prevent this hang from
> +  * happening.
> +  */
> +
> + dma->tx_running = 2;
> +
> + spin_unlock_irq(&up->port.lock);
> + wait_event_interruptible(priv->termios_wait,
> +  dma->tx_running == 3);
> + spin_lock_irq(&up->port.lock);
> + 

Re: [PATCH v3] Fix OpenSSH pty regression on close

2016-04-04 Thread Peter Hurley
On 03/06/2016 01:16 PM, Brian Bloniarz wrote:
> OpenSSH expects the (non-blocking) read() of pty master to return
> EAGAIN only if it has received all of the slave-side output after
> it has received SIGCHLD. This used to work on pre-3.12 kernels.
> 
> This fix effectively forces non-blocking read() and poll() to
> block for parallel i/o to complete for all ttys. It also unwinds
> these changes:
> 
> 1) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>tty: Fix pty master read() after slave closes
> 
> 2) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>pty, n_tty: Simplify input processing on final close
> 
> 3) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>    pty: Fix input race when closing

Reviewed-by: Peter Hurley 


Re: [PATCH v7 3/5] ACPI: parse SPCR and enable matching console

2016-04-04 Thread Peter Hurley
On 03/31/2016 06:40 AM, Aleksey Makarov wrote:
> 'ARM Server Base Boot Requiremets' [1] mentions SPCR (Serial Port
> Console Redirection Table) [2] as a mandatory ACPI table that
> specifies the configuration of serial console.
> 
> Defer initialization of DT earlycon until ACPI/DT decision is made.
> 
> Parse the ACPI SPCR table, setup earlycon if required,
> enable specified console.
> 
> Thanks to Peter Hurley for explaining how this should work.
> 
> [1] 
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0044a/index.html
> [2] 
> https://msdn.microsoft.com/en-us/library/windows/hardware/dn639132(v=vs.85).aspx

So for the code:

Reviewed-by: Peter Hurley 



However, I still have concerns regarding the license of this code
as GPL, when the Microsoft patent notice in [2] above only provides
two license options, neither of which are referred to in the
drivers/acpi/spcr.c file header and neither of which (I believe) are
GPL compatible.



> Signed-off-by: Aleksey Makarov 
> ---
>  drivers/acpi/Kconfig  |   3 ++
>  drivers/acpi/Makefile |   1 +
>  drivers/acpi/spcr.c   | 111 
> ++
>  drivers/tty/serial/earlycon.c |  19 +++-
>  include/linux/acpi.h  |   6 +++
>  include/linux/serial_core.h   |   6 +++
>  6 files changed, 144 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/acpi/spcr.c
> 
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 65fb483..5611eb6 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -77,6 +77,9 @@ config ACPI_DEBUGGER_USER
>  
>  endif
>  
> +config ACPI_SPCR_TABLE
> + bool
> +
>  config ACPI_SLEEP
>   bool
>   depends on SUSPEND || HIBERNATION
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index 7395928..f70ae14 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -82,6 +82,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)   += ec_sys.o
>  obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o
>  obj-$(CONFIG_ACPI_BGRT)  += bgrt.o
>  obj-$(CONFIG_ACPI_CPPC_LIB)  += cppc_acpi.o
> +obj-$(CONFIG_ACPI_SPCR_TABLE)+= spcr.o
>  obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
>  
>  # processor has its own "processor." module_param namespace
> diff --git a/drivers/acpi/spcr.c b/drivers/acpi/spcr.c
> new file mode 100644
> index 000..68ffc33
> --- /dev/null
> +++ b/drivers/acpi/spcr.c
> @@ -0,0 +1,111 @@
> +/*
> + * Copyright (c) 2012, Intel Corporation
> + * Copyright (c) 2015, Red Hat, Inc.
> + * Copyright (c) 2015, 2016 Linaro Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + */
> +
> +#define pr_fmt(fmt) "ACPI: SPCR: " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/**
> + * parse_spcr() - parse ACPI SPCR table and add preferred console
> + *
> + * @earlycon: set up earlycon for the console specified by the table
> + *
> + * For the architectures with support for ACPI, CONFIG_ACPI_SPCR_TABLE may be
> + * defined to parse ACPI SPCR table.  As a result of the parsing preferred
> + * console is registered and if @earlycon is true, earlycon is set up.
> + *
> + * When CONFIG_ACPI_SPCR_TABLE is defined, this function should should be 
> called
> + * from arch inintialization code as soon as the DT/ACPI decision is made.
> + *
> + */
> +int __init parse_spcr(bool earlycon)
> +{
> + static char opts[64];
> + struct acpi_table_spcr *table;
> + acpi_size table_size;
> + acpi_status status;
> + char *uart;
> + char *iotype;
> + int baud_rate;
> + int err;
> +
> + if (acpi_disabled)
> + return -ENODEV;
> +
> + status = acpi_get_table_with_size(ACPI_SIG_SPCR, 0,
> +   (struct acpi_table_header **)&table,
> +   &table_size);
> +
> + if (ACPI_FAILURE(status))
> + return -ENOENT;
> +
> + if (table->header.revision < 2) {
> + err = -ENOENT;
> + pr_err("wrong table version\n");
> + goto done;
> + }
> +
> + iotype = table->serial_port.space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY ?
> + "mmio" : "io";
> +
> + switch (table->interface_type) {
> + case ACPI_DBG2_ARM_SBSA_32BIT:
> + iotype = "mmio32";
> + /* fall through */
> + case ACPI_DBG2

Re: [PATCH v7 5/5] serial: pl011: add console matching function

2016-04-04 Thread Peter Hurley
On 03/31/2016 06:40 AM, Aleksey Makarov wrote:
> This patch adds function pl011_console_match() that implements
> method match of struct console.  It allows to match consoles against
> data specified in a string, for example taken from command line or
> compiled by ACPI SPCR table handler.

Reviewed-by: Peter Hurley 


Re: [PATCH v7 1/5] of/serial: move earlycon early_param handling to serial

2016-04-04 Thread Peter Hurley
On 03/31/2016 06:40 AM, Aleksey Makarov wrote:
> From: Leif Lindholm 
> 
> We have multiple "earlycon" early_param handlers - merge the DT one into
> the main earlycon one.  It's a cleanup that also will be useful
> to defer setting up DT console until ACPI/DT decision is made.
> 
> Rename the exported function to avoid clashing with the function from
> arch/microblaze/kernel/prom.c

Reviewed-by: Peter Hurley 


Re: [bisect] Merge tag 'mmc-v4.6' of git://git.linaro.org/people/ulf.hansson/mmc (was [GIT PULL] MMC for v.4.6)

2016-04-04 Thread Peter Hurley
On 04/04/2016 12:49 PM, Linus Torvalds wrote:
> On Mon, Apr 4, 2016 at 12:29 PM, Peter Hurley  
> wrote:
>>
>> Yeah, a straight revert of 520bd7a8b415 resumes normal service:
> 
> Ok, so we have that as an option.
> 
>> Should I send a proper revert?
> 
> Let's see if somebody can come up with a better serialization model.
> We're only at rc2, and we now have a fallback fix, let's wait a week
> or two to see if the mmc people can come up with something.
> 
> But maybe you can send me a revert patch around rc4 or so if the
> problem still remains, ok?

Will do.

Regards,
Peter Hurley



Re: tty: memory leak in tty_open

2016-04-04 Thread Peter Hurley
Hi Dmitry,

On 04/04/2016 01:56 AM, Dmitry Vyukov wrote:
> Hello,
> 
> The following program causes a memory leak:

Thanks for the report. This is the same problem fixed in 4.5
There was a merge snafu; Greg has the patch in his queue that
fixes this for 4.6

Regards,
Peter Hurley


> unreferenced object 0x8800643c3168 (size 2048):
>   comm "syz-executor", pid 29328, jiffies 4295358628 (age 32.380s)
>   hex dump (first 32 bytes):
> 01 54 00 00 0c 00 00 00 88 27 e1 67 00 88 ff ff  .T...'.g
> 88 1f 54 68 00 88 ff ff 40 b1 f0 86 ff ff ff ff  ..Th@...
>   backtrace:
> [] kmemleak_alloc+0x63/0xa0 mm/kmemleak.c:915
> [< inline >] kmemleak_alloc_recursive include/linux/kmemleak.h:47
> [< inline >] slab_post_alloc_hook mm/slab.h:406
> [< inline >] slab_alloc_node mm/slub.c:2602
> [< inline >] slab_alloc mm/slub.c:2610
> [] kmem_cache_alloc_trace+0x160/0x3d0 mm/slub.c:2627
> [< inline >] kmalloc include/linux/slab.h:478
> [< inline >] kzalloc include/linux/slab.h:622
> [] alloc_tty_struct+0x98/0x840 drivers/tty/tty_io.c:3156
> [] tty_init_dev+0x78/0x4b0 drivers/tty/tty_io.c:1532
> [< inline >] tty_open_by_driver drivers/tty/tty_io.c:2066
> [] tty_open+0xde1/0x1140 drivers/tty/tty_io.c:2114
> [] chrdev_open+0x22a/0x4c0 fs/char_dev.c:388
> [] do_dentry_open+0x6a2/0xcb0 fs/open.c:736
> [] vfs_open+0x17b/0x1f0 fs/open.c:853
> [< inline >] do_last fs/namei.c:3238
> [] path_openat+0x51bb/0x5ce0 fs/namei.c:3374
> [] do_filp_open+0x18e/0x250 fs/namei.c:3409
> [] do_sys_open+0x1fc/0x420 fs/open.c:1020
> [< inline >] SYSC_open fs/open.c:1038
> [] SyS_open+0x2d/0x40 fs/open.c:1033
> 
> 
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include 
> #include 
> #include 
> 
> void* thr(void* arg)
> {
>   syscall(SYS_open, "/dev/console", 0, 0, 0, 0);
>   return 0;
> }
> 
> int main()
> {
>   int i;
>   pthread_t th[4];
> 
>   for (i = 0; i < 4; i++)
> pthread_create(&th[i], 0, thr, 0);
>   usleep(100);
>   return 0;
> }
> 
> 
> On commit 05cf8077e54b20dddb756eaa26f3aeb5c38dd3cf (Apr 1).
> 



Re: [bisect] Merge tag 'mmc-v4.6' of git://git.linaro.org/people/ulf.hansson/mmc (was [GIT PULL] MMC for v.4.6)

2016-04-04 Thread Peter Hurley
On 04/04/2016 11:59 AM, Linus Torvalds wrote:
> On Mon, Apr 4, 2016 at 4:29 AM, Ulf Hansson  wrote:
>>
>> The commit that's likely to cause the regression is:
>> 520bd7a8b415 ("mmc: core: Optimize boot time by detecting cards
>> simultaneously").
> 
> Peter, mind testing if you can revert that and get the old behavior
> back? It seems to still revert cleanly, although I didn't check if the
> revert actually then builds..

Yeah, a straight revert of 520bd7a8b415 resumes normal service:

[2.710232] mmc0: host does not support reading read-only switch, assuming 
write-enable
[2.718437] mmc0: new high speed SDHC card at address e624
[2.724801] mmcblk0: mmc0:e624 SU08G 7.40 GiB
[2.730314]  mmcblk0: p1 p2
...
[2.808938] mmc1: new high speed MMC card at address 0001
[2.816352] mmcblk1: mmc1:0001 MMC04G 3.60 GiB
[2.822075] mmcblk1boot0: mmc1:0001 MMC04G partition 1 2.00 MiB
[2.829014] mmcblk1boot1: mmc1:0001 MMC04G partition 2 2.00 MiB
[2.842600]  mmcblk1: p1 p2

Should I send a proper revert?


>> This commit further enables asynchronous detection of (e)MMC/SD/SDIO
>> cards, by converting from an *ordered* work-queue to a *non-ordered*
>> work-queue for card detection.
>>
>> Although, one should know that there have *never* been any guarantees
>> to get a fixed mmcblk id for a card. I expect that's what has been
>> assumed here.
> 
> So quite frankly, for the whole "no regressions" issue, "documented
> behavior" simply isn't an issue. It doesn't matter one whit or not if
> something has been documented: if it has worked and people have
> depended on it, it's what we in the industry call "reality".
> 
> And reality trumps documentation. Every time.
> 
> So it sounds like either that just needs to be reverted, or some other
> way to get reliable device naming needs to happen.
> 
> So the *simple* model is to just scan the devices minimally serially,
> and allocate the names at that point (so the names are reliable
> between boots for the same hardware configuration). And then do the
> more expensive device setup asynchronously (ie querying device
> information, spinning up disks, whatever - things that can take
> anything from milliseonds to several seconds, because they are doing
> actual IO). So you'd do some very basic (and _often_ fairly quick)
> operations serially, but then try to do the expensive parts
> concurrently.
> 
> The SCSI layer actually goes a bit further than that: it has a fairly
> asynchronous scanning thing, but it does allocate the actual host
> device nodes serially, and then it even has an ordered list of
> "scanning_hosts" that is used to complete the scanning in-order, so
> that the sysfs devices show up in the right order even if things
> actually got scanned out-of-order. So scans that finished early will
> wait for other scans that are for "earlier" devices, and you end up
> with what *looks* ordered to the outside, even if internally it was
> all done out-of-order.
> 
> So there are multiple approaches to handling this, while still
> allowing fairly asynchronous IO.
> 
>  Linus
> 



  1   2   3   4   5   6   7   8   9   10   >