from:"Joakim Tjernlund"

Re: [RFC PATCH 0/4] Remove some e300/MPC83xx evaluation platforms

2023-02-28 Thread Joakim Tjernlund

On Mon, 2023-02-27 at 14:48 -0600, Li Yang wrote:
> On Sat, Feb 25, 2023 at 10:52 AM Paul Gortmaker
>  wrote:
> > 
> > [RE: [RFC PATCH 0/4] Remove some e300/MPC83xx evaluation platforms] On 
> > 24/02/2023 (Fri 21:16) Leo Li wrote:
> > 
> > > 
> > > 
> > > > -Original Message-
> > > > From: Paul Gortmaker 
> > > > Sent: Monday, February 20, 2023 5:59 AM
> > > > To: linuxppc-dev@lists.ozlabs.org
> > > > Cc: Leo Li ; Claudiu Manoil 
> > > > ;
> > > > Paul Gortmaker ; Scott Wood
> > > > ; Michael Ellerman ; Benjamin
> > > > Herrenschmidt ; Paul Mackerras
> > > > 
> > > > Subject: [RFC PATCH 0/4] Remove some e300/MPC83xx evaluation platforms
> > > > 
> > > > [This RFC is proposed for v6.4 and hence is based off linux-next.]
> > > > 
> > > > This series removes support for four e300 (MPC83xx) Freescale processor
> > > > family evaluation boards that were added to the kernel in the 2006 era.
> > > 
> > > Hi Paul,
> > > 
> > > I talked with our marketing team on this.  Although we do not recommend 
> > > any new design with these SoCs, they are still being shipped in large 
> > > amount to customers now.  Plus it is possible for the bigger amount of 
> > > existing devices to be updating their software that includes a new 
> > > kernel.  So we should definitely keep all the common SoC code that might 
> > > be needed to support their own design.
> > 
> > Thanks for confirming with your marketing team that they "do not
> > recommend any new design with these SoCs" -- it also confirms the
> > information I read on the web pages for the platforms.
> > 
> > As those of us immersed in this world all know from the 101 basics of
> > Product Life Cycle lessons, it doesn't matter if it is a phone or a
> > set-top-box/PVR or whatever else kind of non-PC consumer device --
> > kernel uprevs never happen in that product space.
> 
> One thing is that the QorIQ platforms are not for the consumer
> devices.  They are mostly used in networking or communication
> equipment.  I think their product life cycle would be more like the
> server or data center scenario.
> 
> Regards,
> Leo
> > 
> > So with the best interests of the mainline kernel in mind, I think we
> > are good to proceed with this for summer 2023.  And of course as I've
> > said many times before - the kernel is in git, so really you can't
> > delete anything anyway - it remains in history forever.
> > 
> > Thanks,
> > Paul.
> > --
> > 
> > > 
> > > > 
> > > > These boards were all of a very similar form factor, a largish PCI or 
> > > > PCI-X card
> > > > that could also be used standalone with an external power brick, and all
> > > > shared the Modular Development System (MDS) designation.
> > > > 
> > > > These platforms were made in limited quantity and were generally 
> > > > designed
> > > > to get early silicon into the hands of OEMs who would later develop 
> > > > their
> > > > own boards/platforms.  As such, availability was limited to those who 
> > > > would
> > > > be working on boards and/or BSP support.
> > > > 
> > > > Many early revision MDS platforms used a mechanical clamping system to
> > > > hold the BGA CPU in place to facilitate CPU updates -- something not
> > > > normally possible for a soldered down BGA in a COTS system.
> > > > 
> > > > The point of these details is to give context that reflects that these 
> > > > four
> > > > boards were made in limited quantities, were not in a form factor that 
> > > > is
> > > > really "hobbyist" friendly and hence make sense for removal 17 years 
> > > > later.
> > > 
> > > We would agree with you that the MDS platforms are only used by a limited 
> > > number of customers for evaluation purpose, so it should be fine to be 
> > > removed.  So for this series:
> > > 
> > > Acked-by: Li Yang 
> > > 
> > > > 
> > > > Here, we remove the MPC8548E-MDS[1], the MPC8360E-MDS[2], the
> > > > MPC837xE-MDS[3], and the MPC832x-MDS[4] board support from the kernel.
> > > > 
> > > > There will still exist several e300 Freescale Reference Design System 
> > > > (RDS)
> > > > boards[5] and mini-ITX boards[6] with support in the kernel.  While 
> > > > these
> > > > were more of a COTS "ready to deploy" design more suited to hobbyists, 
> > > > it
> > > > probably makes sense to consider removing these as well, based on age.
> > > 
> > > These boards are mass market boards that sold in larger amount and are 
> > > much more likely to still be used.  We would suggest we keep them for now.

Agreed, the RDS design is still used here.

> >

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2022-02-18 Thread Joakim Tjernlund

I was happy with commit msgs and I don't know what the criticism was.

From: gre...@linuxfoundation.org 
Sent: 18 February 2022 11:39
To: Joakim Tjernlund
Cc: Thorsten Leemhuis; Leo Li; eugene_bordenkirc...@selinc.com; 
linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; ba...@kernel.org
Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to 
unrecoverable loop.

On Fri, Feb 18, 2022 at 10:21:12AM +, Joakim Tjernlund wrote:
> I think you could apply them as is, only criticism was the commit msgs.

That is always a good reason to reject a change.  Please resubmit them
with the commit message cleaned up and I will be glad to review it
again.

thanks,

greg k-h

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2022-02-18 Thread Joakim Tjernlund

I think you could apply them as is, only criticism was the commit msgs.

 Jocke


From: Thorsten Leemhuis 
Sent: 18 February 2022 08:11
To: Leo Li; Joakim Tjernlund; eugene_bordenkirc...@selinc.com; 
linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org
Cc: gre...@linuxfoundation.org; ba...@kernel.org
Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to 
unrecoverable loop.

Hi, this is your Linux kernel regression tracker speaking. Top-posting
for once, to make this easy accessible to everyone.

Sadly it looks to me like nobody is going to address this (quite old)
regression (that afaic only very few people will hit), despite the rough
patch to fix it that was already posted and tested in this thread.

Well, guess that's how it is sometimes. Marking it as "on back burner"
in regzbot to reduce the noise there:

#regzbot backburner: Tested patch available, but things nevertheless got
stuck

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

#regzbot poke



On 20.01.22 13:54, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker speaking.
>
> On 04.12.21 01:40, Leo Li wrote:
>>> -Original Message-
>>> From: Joakim Tjernlund 
>>> Sent: Thursday, December 2, 2021 4:45 PM
>>> To: regressi...@leemhuis.info; Leo Li ;
>>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org; linuxppc-
>>> d...@lists.ozlabs.org
>>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to
>>> unrecoverable loop.
>>>
>>> On Thu, 2021-12-02 at 20:35 +, Leo Li wrote:
>>>>
>>>>> -Original Message-
>>>>> From: Joakim Tjernlund 
>>>>> Sent: Wednesday, December 1, 2021 8:19 AM
>>>>> To: regressi...@leemhuis.info; Leo Li ;
>>>>> eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org;
>>>>> linuxppc- d...@lists.ozlabs.org
>>>>> Cc: gre...@linuxfoundation.org; ba...@kernel.org
>>>>> Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list
>>>>> leads to unrecoverable loop.
>>>>>
>>>>> On Tue, 2021-11-30 at 12:56 +0100, Joakim Tjernlund wrote:
>>>>>> On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
>>>>>>> Agreed,
>>>>>>>
>>>>>>> We are happy pick up the torch on this, but I'd like to try and
>>>>>>> hear from
>>>>> Joakim first before we do.  The patch set is his, so I'd like to
>>>>> give him the opportunity.  I think he's the only one that can add a
>>>>> truly proper description as well because he mentioned that this
>>>>> includes a "few more fixes" than just the one we ran into.  I'd
>>>>> rather hear from him than try to reverse engineer what was being
>>> addressed.
>>>>>>>
>>>>>>> Joakim, if you are still watching the thread, would you like to
>>>>>>> take a stab
>>>>> at it?  If I don't hear from you in a couple days, we'll pick up the
>>>>> torch and do what we can.
>
> Did anything happen? Sure, it's a old regression from the v3.4-rc4 days,
> but there iirc was already a tested proto-patch in that thread that
> fixes the issue. Or was progress made and I just missed it?
>
> Ciao, Thorsten
>
> P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
> on my table. I can only look briefly into most of them. Unfortunately
> therefore I sometimes will get things wrong or miss something important.
> I hope that's not the case here; if you think it is, don't hesitate to
> tell me about it in a public reply, that's in everyone's interest.
>
> BTW, I have no personal interest in this issue, which is tracked using
> regzbot, my Linux kernel regression tracking bot
> (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinux-regtracking.leemhuis.info%2Fregzbot%2Fdata=04%7C01%7Cjoakim.tjernlund%40infinera.com%7C8784242cb55d4627e61608d9f2adec23%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637807651100768999%7CUnknown%7CTWFpbGZsb3d8ey

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-12-02 Thread Joakim Tjernlund

On Thu, 2021-12-02 at 20:35 +, Leo Li wrote:
> 
> > -Original Message-
> > From: Joakim Tjernlund 
> > Sent: Wednesday, December 1, 2021 8:19 AM
> > To: regressi...@leemhuis.info; Leo Li ;
> > eugene_bordenkirc...@selinc.com; linux-...@vger.kernel.org; linuxppc-
> > d...@lists.ozlabs.org
> > Cc: gre...@linuxfoundation.org; ba...@kernel.org
> > Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to
> > unrecoverable loop.
> > 
> > On Tue, 2021-11-30 at 12:56 +0100, Joakim Tjernlund wrote:
> > > On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
> > > > Agreed,
> > > > 
> > > > We are happy pick up the torch on this, but I'd like to try and hear 
> > > > from
> > Joakim first before we do.  The patch set is his, so I'd like to give him 
> > the
> > opportunity.  I think he's the only one that can add a truly proper 
> > description
> > as well because he mentioned that this includes a "few more fixes" than just
> > the one we ran into.  I'd rather hear from him than try to reverse engineer
> > what was being addressed.
> > > > 
> > > > Joakim, if you are still watching the thread, would you like to take a 
> > > > stab
> > at it?  If I don't hear from you in a couple days, we'll pick up the torch 
> > and do
> > what we can.
> > > > 
> > > 
> > > I am far away from this now and still on 4.19. I don't mind if you tweak
> > tweak the patches for better "upstreamability"
> > 
> > Even better would be to migrate to the chipidea driver, I am told just a few
> > tweaks are needed but this is probably something NXP should do as they
> > have access to other SOC's using chipidea.
> 
> I agree with this direction but the problem was with bandwidth.  As this 
> controller was only used on legacy platforms, it is harder to justify new 
> effort on it now.
> 

Legacy? All PPC is legacy and not supported now? 

  Jocke

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-12-01 Thread Joakim Tjernlund

On Tue, 2021-11-30 at 12:56 +0100, Joakim Tjernlund wrote:
> On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
> > Agreed,
> > 
> > We are happy pick up the torch on this, but I'd like to try and hear from 
> > Joakim first before we do.  The patch set is his, so I'd like to give him 
> > the opportunity.  I think he's the only one that can add a truly proper 
> > description as well because he mentioned that this includes a "few more 
> > fixes" than just the one we ran into.  I'd rather hear from him than try to 
> > reverse engineer what was being addressed.  
> > 
> > Joakim, if you are still watching the thread, would you like to take a stab 
> > at it?  If I don't hear from you in a couple days, we'll pick up the torch 
> > and do what we can.
> > 
> 
> I am far away from this now and still on 4.19. I don't mind if you tweak 
> tweak the patches for better "upstreamability" 

Even better would be to migrate to the chipidea driver, I am told just a few 
tweaks are needed but this is probably
something NXP should do as they have access to other SOC's using chipidea.

Leo ?

 Joakim


> 
>   Regards
>Joakim
> 
> > Eugene T. Bordenkircher

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-11-30 Thread Joakim Tjernlund

On Mon, 2021-11-29 at 23:48 +, Eugene Bordenkircher wrote:
> Agreed,
> 
> We are happy pick up the torch on this, but I'd like to try and hear from 
> Joakim first before we do.  The patch set is his, so I'd like to give him the 
> opportunity.  I think he's the only one that can add a truly proper 
> description as well because he mentioned that this includes a "few more 
> fixes" than just the one we ran into.  I'd rather hear from him than try to 
> reverse engineer what was being addressed.  
> 
> Joakim, if you are still watching the thread, would you like to take a stab 
> at it?  If I don't hear from you in a couple days, we'll pick up the torch 
> and do what we can.
> 

I am far away from this now and still on 4.19. I don't mind if you tweak tweak 
the patches for better "upstreamability" 

  Regards
   Joakim

> Eugene T. Bordenkircher
> 
> -Original Message-
> From: Leo Li  
> Sent: Monday, November 29, 2021 3:37 PM
> To: Eugene Bordenkircher ; Thorsten Leemhuis 
> ; jo...@infinera.com 
> ; linuxppc-dev@lists.ozlabs.org; 
> linux-...@vger.kernel.org
> Cc: gre...@linuxfoundation.org; ba...@kernel.org
> Subject: RE: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to 
> unrecoverable loop.
> 
> [Caution - External]
> 
> > -Original Message-
> > From: Eugene Bordenkircher 
> > Sent: Monday, November 29, 2021 11:25 AM
> > To: Thorsten Leemhuis ; jo...@infinera.com 
> > ; linuxppc-dev@lists.ozlabs.org; linux- 
> > u...@vger.kernel.org
> > Cc: Leo Li ; gre...@linuxfoundation.org; 
> > ba...@kernel.org
> > Subject: RE: bug: usb: gadget: FSL_UDC_CORE Corrupted request list 
> > leads to unrecoverable loop.
> > 
> > The final result of our testing is that the patch set posted seems to 
> > address all known defects in the Linux kernel.  The mentioned 
> > additional problems are entirely caused by the antivirus solution on 
> > the windows box.  The antivirus solution blocks the disconnect 
> > messages from reaching the RNDIS driver so it has no idea the USB 
> > device went away.  There is nothing we can do to address this in the Linux 
> > kernel.
> 
> Thanks for the confirmation.
> 
> > 
> > I propose we move forward with the patchset.
> 
> I think that we should proceed to merge the patchset but it seems to need 
> some cleanup for coding style issues and better description before submitted 
> formally.
> 
> > 
> > Eugene T. Bordenkircher
> > 
> > -Original Message-
> > From: Thorsten Leemhuis 
> > Sent: Thursday, November 25, 2021 5:59 AM
> > To: Eugene Bordenkircher ; Thorsten 
> > Leemhuis ; Joakim Tjernlund 
> > ; linuxppc-dev@lists.ozlabs.org; linux- 
> > u...@vger.kernel.org
> > Cc: leoyang...@nxp.com; gre...@linuxfoundation.org; ba...@kernel.org
> > Subject: Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list 
> > leads to unrecoverable loop.
> > 
> > Hi, this is your Linux kernel regression tracker speaking.
> > 
> > Top-posting for once, to make this easy to process for everyone:
> > 
> > Li Yang and Felipe Balbi: how to move on with this? It's quite an old 
> > regression, but nevertheless it is one and thus should be fixed. Part 
> > of my position is to make that happen and thus remind developers and 
> > maintainers about this until the regression is resolved.
> > 
> > Ciao, Thorsten
> > 
> > On 16.11.21 20:11, Eugene Bordenkircher wrote:
> > > On 02.11.21 22:15, Joakim Tjernlund wrote:
> > > > On Sat, 2021-10-30 at 14:20 +, Joakim Tjernlund wrote:
> > > > > On Fri, 2021-10-29 at 17:14 +, Eugene Bordenkircher wrote:
> > > > 
> > > > > > We've discovered a situation where the FSL udc driver
> > (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating 
> > over the request queue, but the queue has been corrupted at some point 
> > so it loops infinitely.  I believe we have narrowed into the offending 
> > code, but we are in need of assistance trying to find an appropriate 
> > fix for the problem.  The identified code appears to be in all 
> > versions of the Linux kernel the driver exists in.
> > > > > > 
> > > > > > The problem appears to be when handling a USB_REQ_GET_STATUS
> > request.  The driver gets this request and then calls the 
> > ch9getstatus() function.  In this function, it starts a request by 
> > "borrowing" the per device status_req, filling it in, and then queuing 
> > it with a call to list_add_tail() to add

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-11-02 Thread Joakim Tjernlund

On Sat, 2021-10-30 at 14:20 +, Joakim Tjernlund wrote:
> On Fri, 2021-10-29 at 17:14 +, Eugene Bordenkircher wrote:
> > Hello all,
> > 
> > We've discovered a situation where the FSL udc driver 
> > (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over 
> > the request queue, but the queue has been corrupted at some point so it 
> > loops infinitely.  I believe we have narrowed into the offending code, but 
> > we are in need of assistance trying to find an appropriate fix for the 
> > problem.  The identified code appears to be in all versions of the Linux 
> > kernel the driver exists in.
> > 
> > The problem appears to be when handling a USB_REQ_GET_STATUS request.  The 
> > driver gets this request and then calls the ch9getstatus() function.  In 
> > this function, it starts a request by "borrowing" the per device 
> > status_req, filling it in, and then queuing it with a call to 
> > list_add_tail() to add the request to the endpoint queue.  Right before it 
> > exits the function however, it's calling ep0_prime_status(), which is 
> > filling out that same status_req structure and then queuing it with another 
> > call to list_add_tail() to add the request to the endpoint queue.  This 
> > adds two instances of the exact same LIST_HEAD to the endpoint queue, which 
> > breaks the list since the prev and next pointers end up pointing to the 
> > wrong things.  This ends up causing a hard loop the next time nuke() gets 
> > called, which happens on the next setup IRQ.
> > 
> > I'm not sure what the appropriate fix to this problem is, mostly due to my 
> > lack of expertise in USB and this driver stack.  The code has been this way 
> > in the kernel for a very long time, which suggests that it has been 
> > working, unless USB_REQ_GET_STATUS requests are never made.  This further 
> > suggests that there is something else going on that I don't understand.  
> > Deleting the call to ep0_prime_status() and the following ep0stall() call 
> > appears, on the surface, to get the device working again, but may have side 
> > effects that I'm not seeing.
> > 
> > I'm hopeful someone in the community can help provide some information on 
> > what I may be missing or help come up with a solution to the problem.  A 
> > big thank you to anyone who would like to help out.
> > 
> > Eugene
> 
> Run into this to a while ago. Found the bug and a few more fixes.
> This is against 4.19 so you may have to tweak them a bit.
> Feel free to upstream them.
> 
>  Jocke 

Curious, did my patches help? Good to known once we upgrade as well.

 Jocke

Re: bug: usb: gadget: FSL_UDC_CORE Corrupted request list leads to unrecoverable loop.

2021-10-30 Thread Joakim Tjernlund

On Fri, 2021-10-29 at 17:14 +, Eugene Bordenkircher wrote:
> Hello all,
> 
> We've discovered a situation where the FSL udc driver 
> (drivers/usb/gadget/udc/fsl_udc_core.c) will enter a loop iterating over the 
> request queue, but the queue has been corrupted at some point so it loops 
> infinitely.  I believe we have narrowed into the offending code, but we are 
> in need of assistance trying to find an appropriate fix for the problem.  The 
> identified code appears to be in all versions of the Linux kernel the driver 
> exists in.
> 
> The problem appears to be when handling a USB_REQ_GET_STATUS request.  The 
> driver gets this request and then calls the ch9getstatus() function.  In this 
> function, it starts a request by "borrowing" the per device status_req, 
> filling it in, and then queuing it with a call to list_add_tail() to add the 
> request to the endpoint queue.  Right before it exits the function however, 
> it's calling ep0_prime_status(), which is filling out that same status_req 
> structure and then queuing it with another call to list_add_tail() to add the 
> request to the endpoint queue.  This adds two instances of the exact same 
> LIST_HEAD to the endpoint queue, which breaks the list since the prev and 
> next pointers end up pointing to the wrong things.  This ends up causing a 
> hard loop the next time nuke() gets called, which happens on the next setup 
> IRQ.
> 
> I'm not sure what the appropriate fix to this problem is, mostly due to my 
> lack of expertise in USB and this driver stack.  The code has been this way 
> in the kernel for a very long time, which suggests that it has been working, 
> unless USB_REQ_GET_STATUS requests are never made.  This further suggests 
> that there is something else going on that I don't understand.  Deleting the 
> call to ep0_prime_status() and the following ep0stall() call appears, on the 
> surface, to get the device working again, but may have side effects that I'm 
> not seeing.
> 
> I'm hopeful someone in the community can help provide some information on 
> what I may be missing or help come up with a solution to the problem.  A big 
> thank you to anyone who would like to help out.
> 
> Eugene

Run into this to a while ago. Found the bug and a few more fixes.
This is against 4.19 so you may have to tweak them a bit.
Feel free to upstream them.

 Jocke 
From a7ed9cffbfc90371b570ebef698d96c39adbaf77 Mon Sep 17 00:00:00 2001
From: Joakim Tjernlund 
Date: Mon, 11 May 2020 11:18:14 +0200
Subject: [PATCH 5/5] fsl_udc_core: Init max_pipes for reset_queues()

Signed-off-by: Joakim Tjernlund 
---
 drivers/usb/gadget/udc/fsl_udc_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c
index bd3825d9f1d2..92136dff8373 100644
--- a/drivers/usb/gadget/udc/fsl_udc_core.c
+++ b/drivers/usb/gadget/udc/fsl_udc_core.c
@@ -2441,6 +2441,7 @@ static int fsl_udc_probe(struct platform_device *pdev)
 	/* Get max device endpoints */
 	/* DEN is bidirectional ep number, max_ep doubles the number */
 	udc_controller->max_ep = (dccparams & DCCPARAMS_DEN_MASK) * 2;
+	udc_controller->max_pipes = udc_controller->max_ep;
 
 	udc_controller->irq = platform_get_irq(pdev, 0);
 	if (!udc_controller->irq) {
-- 
2.32.0

From b98fa0dd384f17fee0c1283b91f855b97d1976f4 Mon Sep 17 00:00:00 2001
From: Joakim Tjernlund 
Date: Mon, 11 May 2020 10:38:07 +0200
Subject: [PATCH 4/5] fsl_udc_stop: Use list_for_each_entry_safe() when
 deleting

Signed-off-by: Joakim Tjernlund 
---
 drivers/usb/gadget/udc/fsl_udc_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c
index 4f835332af45..bd3825d9f1d2 100644
--- a/drivers/usb/gadget/udc/fsl_udc_core.c
+++ b/drivers/usb/gadget/udc/fsl_udc_core.c
@@ -1984,7 +1984,7 @@ static int fsl_udc_start(struct usb_gadget *g,
 /* Disconnect from gadget driver */
 static int fsl_udc_stop(struct usb_gadget *g)
 {
-	struct fsl_ep *loop_ep;
+	struct fsl_ep *loop_ep, *tmp_loop;
 	unsigned long flags;
 
 	if (!IS_ERR_OR_NULL(udc_controller->transceiver))
@@ -2002,8 +2002,8 @@ static int fsl_udc_stop(struct usb_gadget *g)
 	spin_lock_irqsave(_controller->lock, flags);
 	udc_controller->gadget.speed = USB_SPEED_UNKNOWN;
 	nuke(_controller->eps[0], -ESHUTDOWN);
-	list_for_each_entry(loop_ep, _controller->gadget.ep_list,
-			ep.ep_list)
+	list_for_each_entry_safe(loop_ep, tmp_loop, _controller->gadget.ep_list,
+ ep.ep_list)
 		nuke(loop_ep, -ESHUTDOWN);
 	spin_unlock_irqrestore(_controller->lock, flags);
 
-- 
2.32.0

From a90a89d06bd008f606404ec613b4f2343b9dda1a Mon Sep 17 00:00:00 2001
From: Joakim Tjernlund 
Date: Thu, 7 May 2020 22:35:14 +0200
Subject: [PATCH 3/5] fsl_ep_dequeue

Signed-off-by: Joakim Tjern

Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Joakim Tjernlund

On Wed, 2021-05-19 at 10:22 -0500, Segher Boessenkool wrote:
> On Wed, May 19, 2021 at 03:06:49PM +0000, Joakim Tjernlund wrote:
> > On Wed, 2021-05-19 at 09:38 -0500, Segher Boessenkool wrote:
> > > On Wed, May 19, 2021 at 06:42:40PM +1000, Nicholas Piggin wrote:
> > > > Excerpts from Joakim Tjernlund's message of May 19, 2021 6:08 pm:
> > > > > I always figured the ppc way was superior. It begs the question if 
> > > > > not the other archs should
> > > > > change instead?
> > > > 
> > > > It is superior in some ways, not enough to be worth being different.
> > > 
> > > The PowerPC syscall ABI *requires* using cr0.3 for indicating errors,
> > > you will have to do that whether you conflate the concepts of return
> > > code and error indicator or not!
> > > 
> > > > Other archs are unlikely to change because it would be painful for
> > > > not much benefit.
> > > 
> > > Other archs cannot easily change for much the same reason :-)
> > 
> > Really? I figured you could just add extra error indication in kernel 
> > syscall I/F.
> > Eventually user space could migrate to the new indication.
> 
> You seem to assume all user space uses glibc, or *any* libc even?  This
> is false.  Some programs do system calls directly.  Do not break the
> kernel ABI :-)

Adding an extra indicator does not break ABI, does it ?
W.r.t breaking ABI, isn't that what PowerPC is trying to do with the new 
syscall I/F? 

 Jocke

Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Joakim Tjernlund

On Wed, 2021-05-19 at 09:38 -0500, Segher Boessenkool wrote:
> On Wed, May 19, 2021 at 06:42:40PM +1000, Nicholas Piggin wrote:
> > Excerpts from Joakim Tjernlund's message of May 19, 2021 6:08 pm:
> > > I always figured the ppc way was superior. It begs the question if not 
> > > the other archs should
> > > change instead?
> > 
> > It is superior in some ways, not enough to be worth being different.
> 
> The PowerPC syscall ABI *requires* using cr0.3 for indicating errors,
> you will have to do that whether you conflate the concepts of return
> code and error indicator or not!
> 
> > Other archs are unlikely to change because it would be painful for
> > not much benefit.
> 
> Other archs cannot easily change for much the same reason :-)

Really? I figured you could just add extra error indication in kernel syscall 
I/F.
Eventually user space could migrate to the new indication.

 Jocke

Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Joakim Tjernlund

On Wed, 2021-05-19 at 17:55 +1000, Nicholas Piggin wrote:
> Excerpts from Joakim Tjernlund's message of May 19, 2021 5:33 pm:
> > On Wed, 2021-05-19 at 02:13 +0300, Dmitry V. Levin wrote:
> > > Hi,
> > > 
> > > On Thu, Jun 11, 2020 at 06:12:01PM +1000, Nicholas Piggin wrote:
> > > [...]
> > > > - Error handling: The consensus among kernel, glibc, and musl is to 
> > > > move to
> > > >   using negative return values in r3 rather than CR0[SO]=1 to indicate 
> > > > error,
> > > >   which matches most other architectures, and is closer to a function 
> > > > call.
> > 
> > What about syscalls like times(2) which can return -1 without it being an 
> > error?
> 
> They do become errors / indistinguishable and have to be dealt with by 
> libc or userspace. Which does follow what most architectures do (all 
> except ia64, mips, sparc, and powerpc actually).
> 
> Interesting question though, it should have been noted.
> 
> Thanks,
> Nick

I always figured the ppc way was superior. It begs the question if not the 
other archs should
change instead?

 Jocke

Re: Linux powerpc new system call instruction and ABI

2021-05-19 Thread Joakim Tjernlund

On Wed, 2021-05-19 at 02:13 +0300, Dmitry V. Levin wrote:
> Hi,
> 
> On Thu, Jun 11, 2020 at 06:12:01PM +1000, Nicholas Piggin wrote:
> [...]
> > - Error handling: The consensus among kernel, glibc, and musl is to move to
> >   using negative return values in r3 rather than CR0[SO]=1 to indicate 
> > error,
> >   which matches most other architectures, and is closer to a function call.

What about syscalls like times(2) which can return -1 without it being an error?

 Jocke

Re: [PATCH v3 0/4] P2040/P2041 i2c recovery erratum

2021-05-12 Thread Joakim Tjernlund

On Wed, 2021-05-12 at 01:48 +, Chris Packham wrote:
> On 12/05/21 10:10 am, Joakim Tjernlund wrote:
> > On Wed, 2021-05-12 at 09:20 +1200, Chris Packham wrote:
> > > The P2040/P2041 has an erratum where the i2c recovery scheme
> > > documented in the reference manual (and currently implemented
> > > in the i2c-mpc.c driver) does not work. The errata document
> > > provides an alternative that does work. This series implements
> > > that alternative and uses a property in the devicetree to
> > > decide when the alternative mechanism is needed.
> > > 
> > > Chris Packham (4):
> > >    dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
> > >    powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
> > >  controllers
> > >    powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c
> > >  controllers
> > >    i2c: mpc: implement erratum A-004447 workaround
> > > 
> > >   .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
> > >   arch/powerpc/boot/dts/fsl/p1010si-post.dtsi   |  8 ++
> > >   arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
> > >   drivers/i2c/busses/i2c-mpc.c  | 81 ++-
> > >   4 files changed, 110 insertions(+), 2 deletions(-)
> > > 
> > This now reminds me about the current I2C reset procedure, it didn't work 
> > for us and I came up with this one:
> >    
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Flinux-i2c%2Fmsg29490.htmldata=04%7C01%7CJoakim.Tjernlund%40infinera.com%7Cb85a6e9c3c8b469572da08d914e816b5%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637563809322419998%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=4cwujNmAVlBa08Tt79hLYGJfJtn7wdz1Kgz0eW2VX9U%3Dreserved=0
> > it never got in but we are still using it.
> 
> For those reading along the v2 mentioned in that thread was posted as 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-i2c%2F20170511122033.22471-1-joakim.tjernlund%40infinera.com%2Fdata=04%7C01%7CJoakim.Tjernlund%40infinera.com%7Cb85a6e9c3c8b469572da08d914e816b5%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637563809322419998%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=YDTX5L6J2ocHep5XutVN46jUpvJj7h1aDbHHwMqlrAs%3Dreserved=0
>  
> there was a bit of discussion but it seemed to die out without reaching 
> a conclusion.
> 
> The i2c-mpc driver is now using the generic recovery mechanism so that 
> addresses one bit of feedback from the original thread.
> 
> I do wonder if the reason the recovery wasn't working for your case was 
> because of the erratum. Do you happen to remember which SoC your issue 
> was on?

It could only be P2010 or MPC8321, I think it was MPC8321, you could try my 
solution on your
CPU if you want to make sure.

> 
> I've been doing my recent work with a P2040 and prior to that I did test 
> out the recovery on a T2081 (which isn't documented to have this 
> erratum) when I was re-working the driver. The "new" recovery actually 
> seems better but I don't have a reliably faulty i2c device so that's 
> only based on me writing some code to manually trigger the recovery 
> (using the snippet below) and observing it with an oscilloscope.

You don't need a faulty device, just an aborted I2C read/write op.
You could force one such I2C op. by py pulling down the clock/SDA in the middle 
of a byte transfer.

 Jocke

Re: [PATCH v3 0/4] P2040/P2041 i2c recovery erratum

2021-05-11 Thread Joakim Tjernlund

On Wed, 2021-05-12 at 09:20 +1200, Chris Packham wrote:
> The P2040/P2041 has an erratum where the i2c recovery scheme
> documented in the reference manual (and currently implemented
> in the i2c-mpc.c driver) does not work. The errata document
> provides an alternative that does work. This series implements
> that alternative and uses a property in the devicetree to
> decide when the alternative mechanism is needed.
> 
> Chris Packham (4):
>   dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
>   powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
> controllers
>   powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c
> controllers
>   i2c: mpc: implement erratum A-004447 workaround
> 
>  .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
>  arch/powerpc/boot/dts/fsl/p1010si-post.dtsi   |  8 ++
>  arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
>  drivers/i2c/busses/i2c-mpc.c  | 81 ++-
>  4 files changed, 110 insertions(+), 2 deletions(-)
> 

This now reminds me about the current I2C reset procedure, it didn't work for 
us and I came up with this one:
  https://www.spinics.net/lists/linux-i2c/msg29490.html
it never got in but we are still using it.

  Jocke

Re: [PATCH v2 3/3] i2c: mpc: implement erratum A-004447 workaround

2021-05-07 Thread Joakim Tjernlund

On Fri, 2021-05-07 at 14:46 +0300, Andy Shevchenko wrote:
> On Fri, May 7, 2021 at 3:40 AM Chris Packham
>  wrote:
> > 
> > The P2040/P2041 has an erratum where the normal i2c recovery mechanism
> > does not work. Implement the alternative recovery mechanism documented
> > in the P2040 Chip Errata Rev Q.
> 
> Thanks.
> 
> > +static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
> > +{
> > +   int ret;
> > +   u8 val;
> > +
> > +   ret = readb_poll_timeout(i2c->base + MPC_I2C_SR, val,
> > +val & mask, 0, 100);
> > +
> > +   return ret;
> > +}
> 
> So, now you may shrink it even further, i.e.
> 
>    void __iomem *sr = i2c->base + MPC_I2C_SR;
>    u8 val;
> 
>    return readb_poll_timeout(sr, val, val & mask, 0, 100);
> 

val looks uninitialised before use?

Re: [PATCH v2 2/3] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-07 Thread Joakim Tjernlund

On Fri, 2021-05-07 at 10:04 +0200, Joakim Tjernlund wrote:
> On Fri, 2021-05-07 at 12:40 +1200, Chris Packham wrote:
> > The i2c controllers on the P2040/P2041 have an erratum where the
> > documented scheme for i2c bus recovery will not work (A-004447). A
> > different mechanism is needed which is documented in the P2040 Chip
> > Errata Rev Q (latest available at the time of writing).
> 
> From what I can tell this Erratum also applies to P1010
> 
>  Jocke

Reference: https://media.digikey.com/pdf/PCNs/Freescale/P1010CE_RevL.pdf

Also, I think this series should go to stable.

 Jocke

Re: [PATCH v2 2/3] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-07 Thread Joakim Tjernlund

On Fri, 2021-05-07 at 12:40 +1200, Chris Packham wrote:
> The i2c controllers on the P2040/P2041 have an erratum where the
> documented scheme for i2c bus recovery will not work (A-004447). A
> different mechanism is needed which is documented in the P2040 Chip
> Errata Rev Q (latest available at the time of writing).

From what I can tell this Erratum also applies to P1010

 Jocke

Re: [PATCH 01/20] ethernet: ucc_geth: set dev->max_mtu to 1518

2021-01-05 Thread Joakim Tjernlund

On Tue, 2021-01-05 at 15:33 +0100, Andrew Lunn wrote:
> On Tue, Jan 05, 2021 at 02:17:42PM +0000, Joakim Tjernlund wrote:
> > On Thu, 2020-12-10 at 02:25 +0100, Andrew Lunn wrote:
> > > On Sat, Dec 05, 2020 at 08:17:24PM +0100, Rasmus Villemoes wrote:
> > > > All the buffers and registers are already set up appropriately for an
> > > > MTU slightly above 1500, so we just need to expose this to the
> > > > networking stack. AFAICT, there's no need to implement .ndo_change_mtu
> > > > when the receive buffers are always set up to support the max_mtu.
> > > > 
> > > > This fixes several warnings during boot on our mpc8309-board with an
> > > > embedded mv88e6250 switch:
> > > > 
> > > > mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port > > > > 0
> > > > ...
> > > > mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port 
> > > > 4
> > > > ucc_geth e0102000.ethernet eth1: error -22 setting MTU to 1504 to 
> > > > include DSA overhead
> > > > 
> > > > The last line explains what the DSA stack tries to do: achieving an MTU
> > > > of 1500 on-the-wire requires that the master netdevice connected to
> > > > the CPU port supports an MTU of 1500+the tagging overhead.
> > > > 
> > > > Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
> > > > Cc: Vladimir Oltean 
> > > > Signed-off-by: Rasmus Villemoes 
> > > 
> > > Reviewed-by: Andrew Lunn 
> > > 
> > > Andrew
> > 
> > I don't see this in any kernel, seems stuck? Maybe because the series as a 
> > whole is not approved?
> 
> Hi Joakim
> 
> When was it posted? If it was while netdev was closed during the merge
> window, you need to repost.
> 
> You should be able to see the status in the netdev patchwork instance
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fproject%2Fnetdevbpf%2Flist%2Fdata=04%7C01%7CJoakim.Tjernlund%40infinera.com%7C25615f4c00a44959810208d8b186e496%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637454540217112252%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=00l%2FBYyxnAoshH1aZMwEznVFQZXwaYGe3pTo6R3RG3Q%3Dreserved=0
> 
>   Andrew

Hi Andrew

I found here: 
https://patchwork.kernel.org/project/netdevbpf/patch/20201218105538.30563-2-rasmus.villem...@prevas.dk/

Seem to be underway but stable isn't included, I think it should be.

 Jocke

Re: [PATCH 01/20] ethernet: ucc_geth: set dev->max_mtu to 1518

2021-01-05 Thread Joakim Tjernlund

On Thu, 2020-12-10 at 02:25 +0100, Andrew Lunn wrote:
> On Sat, Dec 05, 2020 at 08:17:24PM +0100, Rasmus Villemoes wrote:
> > All the buffers and registers are already set up appropriately for an
> > MTU slightly above 1500, so we just need to expose this to the
> > networking stack. AFAICT, there's no need to implement .ndo_change_mtu
> > when the receive buffers are always set up to support the max_mtu.
> > 
> > This fixes several warnings during boot on our mpc8309-board with an
> > embedded mv88e6250 switch:
> > 
> > mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port 0
> > ...
> > mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU 1500 on port 4
> > ucc_geth e0102000.ethernet eth1: error -22 setting MTU to 1504 to include 
> > DSA overhead
> > 
> > The last line explains what the DSA stack tries to do: achieving an MTU
> > of 1500 on-the-wire requires that the master netdevice connected to
> > the CPU port supports an MTU of 1500+the tagging overhead.
> > 
> > Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
> > Cc: Vladimir Oltean 
> > Signed-off-by: Rasmus Villemoes 
> 
> Reviewed-by: Andrew Lunn 
> 
> Andrew

I don't see this in any kernel, seems stuck? Maybe because the series as a 
whole is not approved?

 Jocke

Re: [PATCH] powerpc: Send SIGBUS from machine_check

2020-10-23 Thread Joakim Tjernlund

On Fri, 2020-10-23 at 11:57 +1100, Michael Ellerman wrote:
> 
> 
> Joakim Tjernlund  writes:
> > Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Yeah, but it's not clear that it's applicable in all cases.
> 
> At least I need some reasoning for why it's safe in all cases below to
> just send a SIGBUS and take no other action.

For me this came from an User SDK accessing a PCI device(also using PCI IRQs) 
and this
SDK did some strange stuff during shutdown which disabled the device before SW 
was done.
This caused PCI accesses, both from User Space and kernel PCI IRQs access) to 
the device
which caused an Machine Check(PCI transfer failed). Without this patch, the 
kernel
would just OOPS and hang/do strange things even for an access made by User 
space.
Now the User app just gets a SIGBUS and the kernel still works as it should.

Perhaps a SIGBUS and recover isn't right in all cases but without it there will 
be a
system break down.


> Is there a particular CPU you're working on? Can we start with that and
> look at all the machine check causes and which can be safely handled.

This was a T1042(e5500) but we have e500 and mpc832x as well.

> 
> Some comments below ...
> 
> 
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 0381242920d9..12715d24141c 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
> 
> At the beginning of the function we have:
> 
> printk("Machine check in kernel mode.\n");
> 
> Which should be updated.

Sure, just remove the "in kernel mode" perhaps?

> 
> >  reason & MCSR_MEA ? "Effective" : "Physical", addr);
> >   }
> > 
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + recoverable = 1;
> > + }
> 
> For most of the error causes we take no action and set recoverable = 0.
> 
> Then you just declare that it is recoverable because it hit in
> userspace. Depending on the cause that might be OK, but it's not
> obviously correct in all cases.

Not so familiar with PPC that I can make out what is OK or not.
I do think you stand a better chance now that before though.  

> 
> 
> > +
> >  silent_out:
> >   mtspr(SPRN_MCSR, mcsr);
> >   return mfspr(SPRN_MCSR) == 0 && recoverable;
> > @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
> 
> Same comment about the printk().
> 
> >   if (reason & MCSR_BUS_RPERR)
> >   printk("Bus - Read Parity Error\n");
> > 
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
> 
> And same comment more or less.
> 
> Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
> function does nothing to clear the cause of the machine check.
> 
> >   return 0;
> >  }
> > 
> > @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
> >   if (reason & MCSR_BUS_WRERR)
> >   printk("Bus - Write Bus Error on buffered store or cache line 
> > push\n");
> > 
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
> 
> Same.
> 
> >   return 0;
> >  }
> >  #elif defined(CONFIG_PPC32)
> > @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
> >   default:
> >   printk("Unknown values in msr\n");
> >   }
> > + if ((user_mode(regs))) {
> > + _exception(SIGBUS, regs, reason, regs->nip);
> > + return 1;
> > + }
> 
> Same.
> 
> >   return 0;
> >  }
> >  #endif /* everything else */
> > --
> > 2.26.2
> 
> 
> cheers

Re: [PATCH] powerpc: Send SIGBUS from machine_check

2020-10-22 Thread Joakim Tjernlund

ping

Also Cc: sta...@vger.kernel.org

On Thu, 2020-10-01 at 19:05 +0200, Joakim Tjernlund wrote:
> Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Signed-off-by: Joakim Tjernlund 
> ---
>  arch/powerpc/kernel/traps.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
>  reason & MCSR_MEA ? "Effective" : "Physical", addr);
>   }
>  
> 
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + recoverable = 1;
> + }
> +
>  silent_out:
>   mtspr(SPRN_MCSR, mcsr);
>   return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
>   if (reason & MCSR_BUS_RPERR)
>   printk("Bus - Read Parity Error\n");
>  
> 
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
>   return 0;
>  }
>  
> 
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>   if (reason & MCSR_BUS_WRERR)
>   printk("Bus - Write Bus Error on buffered store or cache line 
> push\n");
>  
> 
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
>   return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>   default:
>   printk("Unknown values in msr\n");
>   }
> + if ((user_mode(regs))) {
> + _exception(SIGBUS, regs, reason, regs->nip);
> + return 1;
> + }
>   return 0;
>  }
>  #endif /* everything else */

[PATCH] powerpc: Send SIGBUS from machine_check

2020-10-01 Thread Joakim Tjernlund

Embedded PPC CPU should send SIGBUS to user space when applicable.

Signed-off-by: Joakim Tjernlund 
---
 arch/powerpc/kernel/traps.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0381242920d9..12715d24141c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
   reason & MCSR_MEA ? "Effective" : "Physical", addr);
}
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   recoverable = 1;
+   }
+
 silent_out:
mtspr(SPRN_MCSR, mcsr);
return mfspr(SPRN_MCSR) == 0 && recoverable;
@@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
if (reason & MCSR_BUS_RPERR)
printk("Bus - Read Parity Error\n");
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   return 1;
+   }
return 0;
 }
 
@@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
if (reason & MCSR_BUS_WRERR)
printk("Bus - Write Bus Error on buffered store or cache line 
push\n");
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   return 1;
+   }
return 0;
 }
 #elif defined(CONFIG_PPC32)
@@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
default:
printk("Unknown values in msr\n");
}
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   return 1;
+   }
return 0;
 }
 #endif /* everything else */
-- 
2.26.2

Re: [PATCH] i2c: cpm: Fix i2c_ram structure

2020-09-22 Thread Joakim Tjernlund

On Tue, 2020-09-22 at 11:04 +0200, nico.vi...@gmail.com wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> From: Nicolas VINCENT 
> 
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
> 
> Signed-off-by: Nicolas VINCENT 
> ---
>  drivers/i2c/busses/i2c-cpm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..c5700addbf65 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -64,7 +64,8 @@ struct i2c_ram {
> uinttxtmp;  /* Internal */
> charres1[4];/* Reserved */
> ushort  rpbase; /* Relocation pointer */
> -   charres2[2];/* Reserved */
> +   charres2[6];/* Reserved */
> +   uintsdmatmp;/* Internal */
>  };

Not sure if this will fit on 8xx CPUs, Leroy ?

 Jocke

Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-14 Thread Joakim Tjernlund

On Mon, 2020-09-14 at 12:28 +1000, Nicholas Piggin wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Excerpts from Chris Packham's message of September 14, 2020 8:03 am:
> > Hi All,
> > 
> > On 4/09/20 12:28 pm, Chris Packham wrote:
> > > The SPIE register contains counts for the TX FIFO so any time the irq
> > > handler was invoked we would attempt to process the RX/TX fifos. Use the
> > > SPIM value to mask the events so that we only process interrupts that
> > > were expected.
> > > 
> > > This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
> > > Implement soft interrupt replay in C").
> > > 
> > > Signed-off-by: Chris Packham 
> > > Cc: sta...@vger.kernel.org
> > > ---
> > ping?
> 
> I don't know the code/hardware but thanks for tracking this down.
> 
> Was there anything more to be done with Jocke's observations, or would
> that be a follow-up patch if anything?

Patch is good IMHO, there may be more to fix w.r.t clearing the IRQs

> 
> If this patch fixes your problem it should probably go in, unless there
> are any objections.

It should go in I think.

 Jocke

> 
> Thanks,
> Nick
> 
> > > 
> > > Notes:
> > >  I've tested this on a T2080RDB and a custom board using the T2081 
> > > SoC. With
> > >  this change I don't see any spurious instances of the "Transfer done 
> > > but
> > >  SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't 
> > > empty!" messages
> > >  and the updates to spi flash are successful.
> > > 
> > >  I think this should go into the stable trees that contain 
> > > 3282a3da25bd but I
> > >  haven't added a Fixes: tag because I think 3282a3da25bd exposed the 
> > > issue as
> > >  opposed to causing it.
> > > 
> > >   drivers/spi/spi-fsl-espi.c | 5 +++--
> > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
> > > index 7e7c92cafdbb..cb120b68c0e2 100644
> > > --- a/drivers/spi/spi-fsl-espi.c
> > > +++ b/drivers/spi/spi-fsl-espi.c
> > > @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, 
> > > u32 events)
> > >   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
> > >   {
> > >  struct fsl_espi *espi = context_data;
> > > -u32 events;
> > > +u32 events, mask;
> > > 
> > >  spin_lock(>lock);
> > > 
> > >  /* Get interrupt events(tx/rx) */
> > >  events = fsl_espi_read_reg(espi, ESPI_SPIE);
> > > -if (!events) {
> > > +mask = fsl_espi_read_reg(espi, ESPI_SPIM);
> > > +if (!(events & mask)) {
> > >  spin_unlock(>lock);
> > >  return IRQ_NONE;
> > >  }

Re: fsl_espi errors on v5.7.15

2020-09-07 Thread Joakim Tjernlund

[SNIP]
> > 

> > > Would you be able to ftrace the interrupt handler function and see if you
> > > can see a difference in number or timing of interrupts? I'm at a bit of
> > > a loss.
> > 
> > I tried ftrace but I really wasn't sure what I was looking for.
> > Capturing a "bad" case was pretty tricky. But I think I've identified a
> > fix (I'll send it as a proper patch shortly). The gist is
> > 
> > diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
> > index 7e7c92cafdbb..cb120b68c0e2 100644
> > --- a/drivers/spi/spi-fsl-espi.c
> > +++ b/drivers/spi/spi-fsl-espi.c
> > @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi
> > *espi, u32 events)
> >   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
> >   {
> >  struct fsl_espi *espi = context_data;
> > -   u32 events;
> > +   u32 events, mask;
> > 
> >  spin_lock(>lock);
> > 
> >  /* Get interrupt events(tx/rx) */
> >  events = fsl_espi_read_reg(espi, ESPI_SPIE);
> > -   if (!events) {
> > +   mask = fsl_espi_read_reg(espi, ESPI_SPIM);
> > +   if (!(events & mask)) {
> >  spin_unlock(>lock);
> >  return IRQ_NONE;
> >  }
> > 
> > The SPIE register contains the TXCNT so events is pretty much always
> > going to have something set. By checking events against what we've
> > actually requested interrupts for we don't see any spurious events.
> > 
> > I've tested this on the T2080RDB and on our custom hardware and it seems
> > to resolve the problem.
> > 
> 
> I looked at the fsl_espi_irq() too and noticed that clearing of the IRQ events
> are after processing TX/RX. That looks a bit odd to me.

I should have been more specific. I think you can loose IRQs as fsl_espi_irq() 
works now.
Consider this:
1) You get TX IRQ and enter fsl_espi_irq()
2) Enter fsl_espi_fill_tx_fifo() to process any chars until done.
3) Now you get one more TX IRQ
4) fsl_espi_irq() clear events -> IRQ from 3) is lost.

 Jocke

Re: fsl_espi errors on v5.7.15

2020-09-07 Thread Joakim Tjernlund

On Thu, 2020-09-03 at 23:58 +, Chris Packham wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> On 1/09/20 6:14 pm, Nicholas Piggin wrote:
> > Excerpts from Chris Packham's message of September 1, 2020 11:25 am:
> > > On 1/09/20 12:33 am, Heiner Kallweit wrote:
> > > > On 30.08.2020 23:59, Chris Packham wrote:
> > > > > On 31/08/20 9:41 am, Heiner Kallweit wrote:
> > > > > > On 30.08.2020 23:00, Chris Packham wrote:
> > > > > > > On 31/08/20 12:30 am, Nicholas Piggin wrote:
> > > > > > > > Excerpts from Chris Packham's message of August 28, 2020 8:07 
> > > > > > > > am:
> > > > > > > 
> > > > > > > 
> > > > > > > > > > > > > I've also now seen the RX FIFO not empty error on the 
> > > > > > > > > > > > > T2080RDB
> > > > > > > > > > > > > 
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but rx/tx 
> > > > > > > > > > > > > fifo's aren't empty!
> > > > > > > > > > > > > fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 
> > > > > > > > > > > > > 32
> > > > > > > > > > > > > 
> > > > > > > > > > > > > With my current workaround of emptying the RX FIFO. 
> > > > > > > > > > > > > It seems
> > > > > > > > > > > > > survivable. Interestingly it only ever seems to be 1 
> > > > > > > > > > > > > extra byte in the
> > > > > > > > > > > > > RX FIFO and it seems to be after either a READ_SR or 
> > > > > > > > > > > > > a READ_FSR.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > fsl_espi ffe11.spi: tx 70
> > > > > > > > > > > > > fsl_espi ffe11.spi: rx 03
> > > > > > > > > > > > > fsl_espi ffe11.spi: Extra RX 00
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but rx/tx 
> > > > > > > > > > > > > fifo's aren't empty!
> > > > > > > > > > > > > fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 
> > > > > > > > > > > > > 32
> > > > > > > > > > > > > fsl_espi ffe11.spi: tx 05
> > > > > > > > > > > > > fsl_espi ffe11.spi: rx 00
> > > > > > > > > > > > > fsl_espi ffe11.spi: Extra RX 03
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but SPIE_DON 
> > > > > > > > > > > > > isn't set!
> > > > > > > > > > > > > fsl_espi ffe11.spi: Transfer done but rx/tx 
> > > > > > > > > > > > > fifo's aren't empty!
> > > > > > > > > > > > > fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 
> > > > > > > > > > > > > 32
> > > > > > > > > > > > > fsl_espi ffe11.spi: tx 05
> > > > > > > > > > > > > fsl_espi ffe11.spi: rx 00
> > > > > > > > > > > > > fsl_espi ffe11.spi: Extra RX 03
> > > > > > > > > > > > > 
> > > > > > > > > > > > >  From all the Micron SPI-NOR datasheets I've got 
> > > > > > > > > > > > > access to it is
> > > > > > > > > > > > > possible to continually read the SR/FSR. But I've no 
> > > > > > > > > > > > > idea why it
> > > > > > > > > > > > > happens some times and not others.
> > > > > > > > > > > > So I think I've got a reproduction and I think I've 
> > > > > > > > > > > > bisected the problem
> > > > > > > > > > > > to commit 3282a3da25bd ("powerpc/64: Implement soft 
> > > > > > > > > > > > interrupt replay in
> > > > > > > > > > > > C"). My day is just finishing now so I haven't applied 
> > > > > > > > > > > > too much scrutiny
> > > > > > > > > > > > to this result. Given the various rabbit holes I've 
> > > > > > > > > > > > been down on this
> > > > > > > > > > > > issue already I'd take this information with a good 
> > > > > > > > > > > > degree of skepticism.
> > > > > > > > > > > > 
> > > > > > > > > > > OK, so an easy test should be to re-test with a 5.4 
> > > > > > > > > > > kernel.
> > > > > > > > > > > It doesn't have yet the change you're referring to, and 
> > > > > > > > > > > the fsl-espi driver
> > > > > > > > > > > is basically the same as in 5.7 (just two small changes 
> > > > > > > > > > > in 5.7).
> > > > > > > > > > There's 6cc0c16d82f88 and maybe also other interrupt 
> > > > > > > > > > related patches
> > > > > > > > > > around this time that could affect book E, so it's good if 
> > > > > > > > > > that exact
> > > > > > > > > > patch is confirmed.
> > > > > > > > > My confirmation is basically that I can induce the issue in a 
> > > > > > > > > 5.4 kernel
> > > > > > > > > by cherry-picking 3282a3da25bd. I'm also able to "fix" the 
> > > > > > > > > issue in
> > > > > > > > >

Re: [PATCH] usb: gadget: fsl: Fix unsigned expression compared with zero in fsl_udc_probe

2020-08-25 Thread Joakim Tjernlund

On Tue, 2020-08-25 at 11:53 +0300, Felipe Balbi wrote:
Joakim Tjernlund 
mailto:joakim.tjernl...@infinera.com>> writes:

> On Mon, 2020-08-24 at 16:58 +0300, Felipe Balbi wrote:
>> Joakim Tjernlund 
>> mailto:joakim.tjernl...@infinera.com>> writes:
>>
>> > On Mon, 2020-08-24 at 10:21 +0200, Greg KH wrote:
>> > >
>> > > On Mon, Aug 24, 2020 at 04:04:37PM +0800, Ye Bin wrote:
>> > > > Signed-off-by: Ye Bin mailto:yebi...@huawei.com>>
>> > >
>> > > I can't take patches without any changelog text, sorry.
>> >
>> > Still taking patches for fsl_udc_core.c ?
>> > I figured this driver was obsolete and should be moved to one of the 
>> > Chipidea drivers.
>>
>> Nobody sent any patches to switch over the users of this driver to
>> chipidea. I would love to delete this driver :-)
>
> Me too, I got a few local patches here as the driver is quite buggy.
> Got to little USB knowledge to switch it over though :(

this wouldn't require USB knowledge. It only requires some minor DTS
knowledge and HW for testing.

hmm, OK. If it is that simple I may take a crack at it(but then why hasn't NXP 
already done that ?)
I would need some guidance as to what the involved files are?

Jocke

Re: [PATCH] usb: gadget: fsl: Fix unsigned expression compared with zero in fsl_udc_probe

2020-08-24 Thread Joakim Tjernlund

On Mon, 2020-08-24 at 16:58 +0300, Felipe Balbi wrote:
> Joakim Tjernlund  writes:
> 
> > On Mon, 2020-08-24 at 10:21 +0200, Greg KH wrote:
> > > 
> > > On Mon, Aug 24, 2020 at 04:04:37PM +0800, Ye Bin wrote:
> > > > Signed-off-by: Ye Bin 
> > > 
> > > I can't take patches without any changelog text, sorry.
> > 
> > Still taking patches for fsl_udc_core.c ?
> > I figured this driver was obsolete and should be moved to one of the 
> > Chipidea drivers.
> 
> Nobody sent any patches to switch over the users of this driver to
> chipidea. I would love to delete this driver :-)

Me too, I got a few local patches here as the driver is quite buggy.
Got to little USB knowledge to switch it over though :(

 Jocke

Re: [PATCH] usb: gadget: fsl: Fix unsigned expression compared with zero in fsl_udc_probe

2020-08-24 Thread Joakim Tjernlund

On Mon, 2020-08-24 at 10:21 +0200, Greg KH wrote:
> 
> On Mon, Aug 24, 2020 at 04:04:37PM +0800, Ye Bin wrote:
> > Signed-off-by: Ye Bin 
> 
> I can't take patches without any changelog text, sorry.

Still taking patches for fsl_udc_core.c ?
I figured this driver was obsolete and should be moved to one of the Chipidea 
drivers.

 Jocke

Memory: 880608K/983040K .... 36896K reserved ?

2020-07-01 Thread Joakim Tjernlund

I cannot figure out how the xxxK reserved item works in:
 Memory: 880608K/983040K available (9532K kernel code, 1104K rwdata, 3348K 
rodata, 1088K init, 1201K bss, 36896K reserved ...

Is there a way to tune(lower it) this memory?

 Jocke

Re: hardcoded SIGSEGV in __die() ?

2020-03-30 Thread Joakim Tjernlund

On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Joakim Tjernlund  writes:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > Le 23/03/2020 à 15:43, Christophe Leroy a écrit :
> > > > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit :
> > > > > In __die(), see below, there is this call to notify_send() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal(in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > > 
> > > > As far as I can see, comes from
> > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=LBzRMxHWJzNEztnnG0UzJb7PHvaDGVswQD%2B8WpY9YX8%3Dreserved=0
> > > > 
> > > 
> > > And
> > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2addata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=Dh%2BUTRgG85oVSgC3SCR1B7izQH4HofT4ppOMiy9xvDA%3Dreserved=0
> > > shows it is (was?) similar on x86.
> > > 
> > 
> > I tried to follow that chain thinking it would end up sending a signal to 
> > user space but I cannot see
> > that happens. Seems to be related to debugging.
> > 
> > In short, I cannot see any signal being delivered to user space. If so that 
> > would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
> 
> It's platform specific. What platform are you on?
> 
> See the ppc_md & cur_cpu_spec calls here:
> 
> void machine_check_exception(struct pt_regs *regs)
> {
> int recover = 0;
> bool nested = in_nmi();
> if (!nested)
> nmi_enter();
> 
> __this_cpu_inc(irq_stat.mce_exceptions);
> 
> add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
> 
> /* See if any machine dependent calls. In theory, we would want
>  * to call the CPU first, and call the ppc_md. one if the CPU
>  * one returns a positive number. However there is existing code
>  * that assumes the board gets a first chance, so let's keep it
>  * that way for now and fix things later. --BenH.
>  */
> if (ppc_md.machine_check_exception)
> recover = ppc_md.machine_check_exception(regs);
> else if (cur_cpu_spec->machine_check)
> recover = cur_cpu_spec->machine_check(regs);
> 
> if (recover > 0)
> goto bail;
> 
> 
> Either the ppc_md or cpu_spec handlers can send a signal, but after a
> bit of grepping I think only the pseries and powernv ones do.
> 
> If you get into die() then it's an oops, which is not the same as a
> normal signal.

I had a look at opal_machine_check and friends and came up with:

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0381242920d9..12715d24141c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
   reason & MCSR_MEA ? "Effective" : "Physical", addr);
}
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   recoverable = 1;
+   }
+
 silent_out:
mtspr(SPRN_MCSR, mcsr);
return mfspr(SPRN_MCSR) == 0 && recoverable;
@@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
if (reason & MCSR_BUS_RPERR)
printk("Bus - Read Parity Error\n");
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip);
+   return 1;
+   }
return 0;
 }
 
@@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
if (reason & MCSR_BUS_WRERR)
printk("Bus - Write Bus Error on buffered store or cache line 
push\n");
 
+   if ((user_mode(regs))) {
+   _exception(SIGBUS, regs, reason, regs->nip)

Re: hardcoded SIGSEGV in __die() ?

2020-03-27 Thread Joakim Tjernlund

On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Joakim Tjernlund  writes:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > Le 23/03/2020 à 15:43, Christophe Leroy a écrit :
> > > > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit :
> > > > > In __die(), see below, there is this call to notify_send() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal(in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > > 
> > > > As far as I can see, comes from
> > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=LBzRMxHWJzNEztnnG0UzJb7PHvaDGVswQD%2B8WpY9YX8%3Dreserved=0
> > > > 
> > > 
> > > And
> > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2addata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=Dh%2BUTRgG85oVSgC3SCR1B7izQH4HofT4ppOMiy9xvDA%3Dreserved=0
> > > shows it is (was?) similar on x86.
> > > 
> > 
> > I tried to follow that chain thinking it would end up sending a signal to 
> > user space but I cannot see
> > that happens. Seems to be related to debugging.
> > 
> > In short, I cannot see any signal being delivered to user space. If so that 
> > would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
> 
> It's platform specific. What platform are you on?

I am on e500, e5500(e500mc) and 83xx :)


> 
> See the ppc_md & cur_cpu_spec calls here:
> 
> void machine_check_exception(struct pt_regs *regs)
> {
> int recover = 0;
> bool nested = in_nmi();
> if (!nested)
> nmi_enter();
> 
> __this_cpu_inc(irq_stat.mce_exceptions);
> 
> add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
> 
> /* See if any machine dependent calls. In theory, we would want
>  * to call the CPU first, and call the ppc_md. one if the CPU
>  * one returns a positive number. However there is existing code
>  * that assumes the board gets a first chance, so let's keep it
>  * that way for now and fix things later. --BenH.
>  */
> if (ppc_md.machine_check_exception)
> recover = ppc_md.machine_check_exception(regs);
> else if (cur_cpu_spec->machine_check)
> recover = cur_cpu_spec->machine_check(regs);
> 
> if (recover > 0)
> goto bail;
> 
> 
> Either the ppc_md or cpu_spec handlers can send a signal, but after a
> bit of grepping I think only the pseries and powernv ones do.

Seems so

> 
> If you get into die() then it's an oops, which is not the same as a
> normal signal.

Exactly, and the die/OOPS does not seem work as intended either. The system 
tries to limp along
and generates more similar OOPses and may even hang.

> 
> cheers

Re: hardcoded SIGSEGV in __die() ?

2020-03-25 Thread Joakim Tjernlund

On Wed, 2020-03-25 at 17:02 +, David Laight wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you recognize the sender
> and know the content is safe.
> 
> 
> From: Joakim Tjernlund
> > Sent: 23 March 2020 15:45
> ...
> > > > I tried to follow that chain thinking it would end up sending a
> > > > signal to user space but I cannot
> > see
> > > > that happens. Seems to be related to debugging.
> > > > 
> > > > In short, I cannot see any signal being delivered to user
> > > > space. If so that would explain why
> > > > our user space process never dies.
> > > > Is there a signal hidden in machine_check handler for SIGBUS I
> > > > cannot see?
> > > > 
> > > 
> > > Isn't it done in do_exit(), called from oops_end() ?
> > 
> > hmm, so it seems. The odd thing though is that do_exit takes an
> > exit code, not signal number.
> > Also, feels a bit odd to force an exit(that we haven't seen
> > happening) rather than just a signal.
> 
> Isn't there something 'magic' that converts EFAULT into SIGSEGV?

I have tried to find out and I cannot see a signal beeing sent.
Also, SEGV is wrong, this is a SIGBUS fault.

> 
> David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
> MK1 1PT, UK
> Registration No: 1397386 (Wales)

Re: hardcoded SIGSEGV in __die() ?

2020-03-23 Thread Joakim Tjernlund

On Mon, 2020-03-23 at 16:31 +0100, Christophe Leroy wrote:
> 
> Le 23/03/2020 à 16:08, Joakim Tjernlund a écrit :
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > CAUTION: This email originated from outside of the organization. Do not 
> > > click links or open attachments unless you recognize the sender and know 
> > > the content is safe.
> > > 
> > > 
> > > Le 23/03/2020 à 15:43, Christophe Leroy a écrit :
> > > > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit :
> > > > > In __die(), see below, there is this call to notify_send() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal(in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > > 
> > > > As far as I can see, comes from
> > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Cefe6d37a85e1494658ec08d7cf3f513f%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637205743206770599sdata=k8%2Bs7ifiCyuNzXuOhykjXUEtWzD62q3HGIIiavqE6%2FA%3Dreserved=0
> > > > 
> > > 
> > > And
> > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2addata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Cefe6d37a85e1494658ec08d7cf3f513f%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637205743206770599sdata=oCU%2FMelrWDOCjmGOfVuNp2tM%2BwQ%2BRD25jzRWoGbHAew%3Dreserved=0
> > > shows it is (was?) similar on x86.
> > > 
> > 
> > I tried to follow that chain thinking it would end up sending a signal to 
> > user space but I cannot see
> > that happens. Seems to be related to debugging.
> > 
> > In short, I cannot see any signal being delivered to user space. If so that 
> > would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
> > 
> 
> Isn't it done in do_exit(), called from oops_end() ?

hmm, so it seems. The odd thing though is that do_exit takes an exit code, not 
signal number.
Also, feels a bit odd to force an exit(that we haven't seen happening) rather 
than just a signal.

 Jocke

Re: hardcoded SIGSEGV in __die() ?

2020-03-23 Thread Joakim Tjernlund

On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Le 23/03/2020 à 15:43, Christophe Leroy a écrit :
> > 
> > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit :
> > > In __die(), see below, there is this call to notify_send() with
> > > SIGSEGV hardcoded, this seems odd
> > > to me as the variable "err" holds the true signal(in my case SIGBUS)
> > > Should not SIGSEGV be replaced with the true signal no.?
> > 
> > As far as I can see, comes from
> > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7C4291ac1b501e4296869a08d7cf38cdb4%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637205715189366995sdata=Z2bFsmDlD2MKhLACQvayk9ejz0dqgMEOlBTlocAmtTg%3Dreserved=0
> > 
> 
> And
> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2addata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7C4291ac1b501e4296869a08d7cf38cdb4%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637205715189366995sdata=97kyz3Ur88BhDUUYzya5t%2FFQVhXYu6qiHoW8hsEg81s%3Dreserved=0
> shows it is (was?) similar on x86.
> 

I tried to follow that chain thinking it would end up sending a signal to user 
space but I cannot see
that happens. Seems to be related to debugging.

In short, I cannot see any signal being delivered to user space. If so that 
would explain why
our user space process never dies.
Is there a signal hidden in machine_check handler for SIGBUS I cannot see?

 Jocke

hardcoded SIGSEGV in __die() ?

2020-03-23 Thread Joakim Tjernlund

In __die(), see below, there is this call to notify_send() with SIGSEGV 
hardcoded, this seems odd
to me as the variable "err" holds the true signal(in my case SIGBUS)
Should not SIGSEGV be replaced with the true signal no.?

  Jocke

static int __die(const char *str, struct pt_regs *regs, long err)
{
printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);

if (IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
printk("LE ");
else
printk("BE ");

if (IS_ENABLED(CONFIG_PREEMPT))
pr_cont("PREEMPT ");

if (IS_ENABLED(CONFIG_SMP))
pr_cont("SMP NR_CPUS=%d ", NR_CPUS);

if (debug_pagealloc_enabled())
pr_cont("DEBUG_PAGEALLOC ");

if (IS_ENABLED(CONFIG_NUMA))
pr_cont("NUMA ");

pr_cont("%s\n", ppc_md.name ? ppc_md.name : "");

if (notify_die(DIE_OOPS, str, regs, err, 255, SIGSEGV) == NOTIFY_STOP)
return 1;

print_modules();
show_regs(regs);

return 0;
}

CDC ethernet gadget: complete system freeze

2020-03-10 Thread Joakim Tjernlund

We have an embedded T1042 NXP CDC ethernet gadget which seems to completely 
freeze when an
usb0 I/F is established and one do 1 of two things:
 1) reboot the connected Linux laptop -> CDC gadget appears to enter complete 
system freeze.
 2) on laptop, ifconfig usb0 down; rmmod cdc_ether -> CDC gaget freezes

I cannot finns and OOPS or other trace in console or syslog. I could really use 
som ideas here.
Linux kernel 4.19.108 (4.19.76 has the same issue)

 Jocke

Re: [PATCH] powerpc/32s: Slenderize _tlbia() for powerpc 603/603e

2020-02-03 Thread Joakim Tjernlund

On Mon, 2020-02-03 at 16:47 +, Christophe Leroy wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> _tlbia() is a function used only on 603/603e core, ie on CPUs which
> don't have a hash table.
> 
> _tlbia() uses the tlbia macro which implements a loop of 1024 tlbie.
> 
> On the 603/603e core, flushing the entire TLB requires no more than
> 32 tlbie.
> 
> Replace tlbia by a loop of 32 tlbie.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/mm/book3s32/hash_low.S | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
> b/arch/powerpc/mm/book3s32/hash_low.S
> index c11b0a005196..a5039ad10429 100644
> --- a/arch/powerpc/mm/book3s32/hash_low.S
> +++ b/arch/powerpc/mm/book3s32/hash_low.S
> @@ -696,18 +696,21 @@ _GLOBAL(_tlbia)
> bne-10b
> stwcx.  r8,0,r9
> bne-10b
> +#endif /* CONFIG_SMP */
> +   li  r5, 32
> +   lis r4, KERNELBASE@h
> +   mtctr   r5
> sync
> -   tlbia
> +0: tlbie   r4
> +   addir4, r4, 0x1000

Is page size always 4096 here or does it not matter ?

> +   bdnz0b
> sync
> +#ifdef CONFIG_SMP
> TLBSYNC
> li  r0,0
> stw r0,0(r9)/* clear mmu_hash_lock */
> mtmsr   r10
> SYNC_601
> isync
> -#else /* CONFIG_SMP */
> -   sync
> -   tlbia
> -   sync
>  #endif /* CONFIG_SMP */
> blr
> --
> 2.25.0
>

Re: [PATCH] net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info

2020-01-15 Thread Joakim Tjernlund

On Tue, 2020-01-14 at 14:54 +, Colin King wrote:
> 
> From: Colin Ian King 
> 
> Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements
> however up to UCC_MAX_NUM (8) elements are potentially being written
> to it.  Currently we have an array out-of-bounds write error on the
> last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in
> size.
> 
> Addresses-Coverity: ("Out-of-bounds write")
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Colin Ian King 

This should be sent to stable as well
Cc:  # 4.19.x+

> ---
>  drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> index 94e870f48e21..9edd94679283 100644
> --- a/drivers/net/wan/fsl_ucc_hdlc.c
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -73,7 +73,7 @@ static struct ucc_tdm_info utdm_primary_info = {
> },
>  };
> 
> -static struct ucc_tdm_info utdm_info[MAX_HDLC_NUM];
> +static struct ucc_tdm_info utdm_info[UCC_MAX_NUM];
> 
>  static int uhdlc_init(struct ucc_hdlc_private *priv)
>  {
> --
> 2.24.0
>

Re: [PATCH v3] spi: fsl: simplify error path in of_fsl_spi_probe()

2020-01-14 Thread Joakim Tjernlund

On Tue, 2020-01-14 at 16:02 +, Christophe Leroy wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> No need to 'goto err;' for just doing a return.
> return directly from where the error happens.
> 
> Signed-off-by: Christophe Leroy 
> ---
> v3: rebase on today's spi/for-next and using PTR_ERR_OR_ZERO() in one place.
> ---
>  drivers/spi/spi-fsl-spi.c | 27 ---
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/spi/spi-fsl-spi.c b/drivers/spi/spi-fsl-spi.c
> index fb4159ad6bf6..3b81772fea0d 100644
> --- a/drivers/spi/spi-fsl-spi.c
> +++ b/drivers/spi/spi-fsl-spi.c
> @@ -706,8 +706,8 @@ static int of_fsl_spi_probe(struct platform_device *ofdev)
> struct device_node *np = ofdev->dev.of_node;
> struct spi_master *master;
> struct resource mem;
> -   int irq = 0, type;
> -   int ret = -ENOMEM;
> +   int irq, type;
> +   int ret;
> 
> ret = of_mpc8xxx_spi_probe(ofdev);
> if (ret)
> @@ -722,10 +722,8 @@ static int of_fsl_spi_probe(struct platform_device 
> *ofdev)
> 
> if (spisel_boot) {
> pinfo->immr_spi_cs = ioremap(get_immrbase() + 
> IMMR_SPI_CS_OFFSET, 4);
> -   if (!pinfo->immr_spi_cs) {
> -   ret = -ENOMEM;
> -   goto err;
> -   }
> +   if (!pinfo->immr_spi_cs)
> +   return -ENOMEM;
> }
>  #endif
> /*
> @@ -744,24 +742,15 @@ static int of_fsl_spi_probe(struct platform_device 
> *ofdev)
> 
> ret = of_address_to_resource(np, 0, );
> if (ret)
> -   goto err;
> +   return ret;
> 
> irq = platform_get_irq(ofdev, 0);
> -   if (irq < 0) {
> -   ret = irq;
> -   goto err;
> -   }
> +   if (irq < 0)
> +   return irq;
> 
> master = fsl_spi_probe(dev, , irq);
> -   if (IS_ERR(master)) {
> -   ret = PTR_ERR(master);
> -   goto err;
> -   }
> -

Don't you need to "undo" ioremap, irq etc. in case of later errors?

> -   return 0;
> 
> -err:
> -   return ret;
> +   return PTR_ERR_OR_ZERO(master);
>  }
> 
>  static int of_fsl_spi_remove(struct platform_device *ofdev)
> --
> 2.13.3
>

Re: [PATCH] powerpc/8xx: drop unused self-modifying code alternative to FixupDAR.

2019-08-22 Thread Joakim Tjernlund

On Wed, 2019-08-21 at 20:00 +, Christophe Leroy wrote:
> 
> The code which fixups the DAR on TLB errors for dbcX instructions
> has a self-modifying code alternative that has never been used.
> 
> Drop it.

Argh, my master piece from way back :)
But it is time for it to go.

Reviewed-by: Joakim Tjernlund 

 Jocke
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/head_8xx.S | 24 
>  1 file changed, 24 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index b8ca5b43e587..19f583e18402 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -575,8 +575,6 @@ InstructionBreakpoint:
>   * by decoding the registers used by the dcbx instruction and adding them.
>   * DAR is set to the calculated address.
>   */
> - /* define if you don't want to use self modifying code */
> -#define NO_SELF_MODIFYING_CODE
>  FixupDAR:/* Entry point for dcbx workaround. */
> mtspr   SPRN_M_TW, r10
> /* fetch instruction from memory. */
> @@ -640,27 +638,6 @@ FixupDAR:/* Entry point for dcbx workaround. */
> rlwinm  r10, r10,0,7,5  /* Clear store bit for buggy dcbst insn */
> mtspr   SPRN_DSISR, r10
>  142:   /* continue, it was a dcbx, dcbi instruction. */
> -#ifndef NO_SELF_MODIFYING_CODE
> -   andis.  r10,r11,0x1f/* test if reg RA is r0 */
> -   li  r10,modified_instr@l
> -   dcbtst  r0,r10  /* touch for store */
> -   rlwinm  r11,r11,0,0,20  /* Zero lower 10 bits */
> -   orisr11,r11,640 /* Transform instr. to a "add r10,RA,RB" */
> -   ori r11,r11,532
> -   stw r11,0(r10)  /* store add/and instruction */
> -   dcbf0,r10   /* flush new instr. to memory. */
> -   icbi0,r10   /* invalidate instr. cache line */
> -   mfspr   r11, SPRN_SPRG_SCRATCH1 /* restore r11 */
> -   mfspr   r10, SPRN_SPRG_SCRATCH0 /* restore r10 */
> -   isync   /* Wait until new instr is loaded from memory 
> */
> -modified_instr:
> -   .space  4   /* this is where the add instr. is stored */
> -   bne+143f
> -   subfr10,r0,r10  /* r10=r10-r0, only if reg RA is r0 */
> -143:   mtdar   r10 /* store faulting EA in DAR */
> -   mfspr   r10,SPRN_M_TW
> -   b   DARFixed/* Go back to normal TLB handling */
> -#else
> mfctr   r10
> mtdar   r10 /* save ctr reg in DAR */
> rlwinm  r10, r11, 24, 24, 28/* offset into jump table for reg RB 
> */
> @@ -724,7 +701,6 @@ modified_instr:
> add r10, r10, r11   /* add it */
> mfctr   r11 /* restore r11 */
> b   151b
> -#endif
> 
>  /*
>   * This is where the main kernel code starts.
> --
> 2.13.3
>

Re: [PATCH v1 4/8] soc/fsl/qbman: Use index when accessing device tree properties

2019-05-13 Thread Joakim Tjernlund

On Mon, 2019-05-13 at 17:40 +, Roy Pledge wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> On 5/13/2019 12:40 PM, Joakim Tjernlund wrote:
> > On Mon, 2019-05-13 at 16:09 +, Roy Pledge wrote:
> > > The index value should be passed to the of_parse_phandle()
> > > function to ensure the correct property is read.
> > Is this a bug fix? Maybe for stable too?
> > 
> >  Jocke
> Yes this could go to stable.  I will include sta...@vger.kernel.org when
> I send the next version.

I think you need to send this patch separately then. The whole series cannot go 
to stable.

 Jocke

> > > Signed-off-by: Roy Pledge 
> > > ---
> > >  drivers/soc/fsl/qbman/dpaa_sys.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/soc/fsl/qbman/dpaa_sys.c 
> > > b/drivers/soc/fsl/qbman/dpaa_sys.c
> > > index 3e0a7f3..0b901a8 100644
> > > --- a/drivers/soc/fsl/qbman/dpaa_sys.c
> > > +++ b/drivers/soc/fsl/qbman/dpaa_sys.c
> > > @@ -49,7 +49,7 @@ int qbman_init_private_mem(struct device *dev, int idx, 
> > > dma_addr_t *addr,
> > > idx, ret);
> > > return -ENODEV;
> > > }
> > > -   mem_node = of_parse_phandle(dev->of_node, "memory-region", 0);
> > > +   mem_node = of_parse_phandle(dev->of_node, "memory-region", idx);
> > > if (mem_node) {
> > > ret = of_property_read_u64(mem_node, "size", );
> > > if (ret) {
> > > --
> > > 2.7.4
> > >

Re: [PATCH v1 4/8] soc/fsl/qbman: Use index when accessing device tree properties

2019-05-13 Thread Joakim Tjernlund

On Mon, 2019-05-13 at 16:09 +, Roy Pledge wrote:
> 
> The index value should be passed to the of_parse_phandle()
> function to ensure the correct property is read.

Is this a bug fix? Maybe for stable too?

 Jocke

> 
> Signed-off-by: Roy Pledge 
> ---
>  drivers/soc/fsl/qbman/dpaa_sys.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.c 
> b/drivers/soc/fsl/qbman/dpaa_sys.c
> index 3e0a7f3..0b901a8 100644
> --- a/drivers/soc/fsl/qbman/dpaa_sys.c
> +++ b/drivers/soc/fsl/qbman/dpaa_sys.c
> @@ -49,7 +49,7 @@ int qbman_init_private_mem(struct device *dev, int idx, 
> dma_addr_t *addr,
> idx, ret);
> return -ENODEV;
> }
> -   mem_node = of_parse_phandle(dev->of_node, "memory-region", 0);
> +   mem_node = of_parse_phandle(dev->of_node, "memory-region", idx);
> if (mem_node) {
> ret = of_property_read_u64(mem_node, "size", );
> if (ret) {
> --
> 2.7.4
>

Re: [PATCH v3 4/6] dt-bindings: soc/fsl: qe: document new fsl,qe-snums binding

2019-05-13 Thread Joakim Tjernlund

On Mon, 2019-05-13 at 11:14 +, Rasmus Villemoes wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Reading table 4-30, and its footnotes, of the QUICC Engine Block
> Reference Manual shows that the set of snum _values_ is not
> necessarily just a function of the _number_ of snums, as given in the
> fsl,qe-num-snums property.
> 
> As an alternative, to make it easier to add support for other variants
> of the QUICC engine IP, this introduces a new binding fsl,qe-snums,
> which automatically encodes both the number of snums and the actual
> values to use.
> 
> Signed-off-by: Rasmus Villemoes 
> ---
> Rob, thanks for the review of v2. However, since I moved the example
> from the commit log to the binding (per Joakim's request), I didn't

Thanks, looks good now.

> add a Reviewed-by tag for this revision.
> 
>  .../devicetree/bindings/soc/fsl/cpm_qe/qe.txt   | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt 
> b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> index d7afaff5faff..05ec2a838c54 100644
> --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> @@ -18,7 +18,8 @@ Required properties:
>  - reg : offset and length of the device registers.
>  - bus-frequency : the clock frequency for QUICC Engine.
>  - fsl,qe-num-riscs: define how many RISC engines the QE has.
> -- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for 
> the
> +- fsl,qe-snums: This property has to be specified as '/bits/ 8' value,
> +  defining the array of serial number (SNUM) values for the virtual
>threads.
> 
>  Optional properties:
> @@ -34,6 +35,11 @@ Recommended properties
>  - brg-frequency : the internal clock source frequency for baud-rate
>generators in Hz.
> 
> +Deprecated properties
> +- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use
> +  for the threads. Use fsl,qe-snums instead to not only specify the
> +  number of snums, but also their values.
> +
>  Example:
>   qe@e010 {
> #address-cells = <1>;
> @@ -44,6 +50,11 @@ Example:
> reg = ;
> brg-frequency = <0>;
> bus-frequency = <179A7B00>;
> +   fsl,qe-snums = /bits/ 8 <
> +   0x04 0x05 0x0C 0x0D 0x14 0x15 0x1C 0x1D
> +   0x24 0x25 0x2C 0x2D 0x34 0x35 0x88 0x89
> +   0x98 0x99 0xA8 0xA9 0xB8 0xB9 0xC8 0xC9
> +   0xD8 0xD9 0xE8 0xE9>;
>   }
> 
>  * Multi-User RAM (MURAM)
> --
> 2.20.1
>

Re: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Joakim Tjernlund

On Thu, 2019-05-02 at 12:58 +, Laurentiu Tudor wrote:
> 
> > -Original Message-
> > From: Joakim Tjernlund 
> > Sent: Thursday, May 2, 2019 1:37 PM
> > 
> > On Thu, 2019-05-02 at 09:05 +, Laurentiu Tudor wrote:
> > > Hi Joakim,
> > > 
> > > > -Original Message-
> > > > From: Joakim Tjernlund 
> > > > Sent: Saturday, April 27, 2019 8:11 PM
> > > > 
> > > > On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > > > > From: Laurentiu Tudor 
> > > > > 
> > > > > Fix issue with the entry indexing in the sg frame cleanup code being
> > > > > off-by-1. This problem showed up when doing some basic iperf tests
> > and
> > > > > manifested in traffic coming to a halt.
> > > > > 
> > > > > Signed-off-by: Laurentiu Tudor 
> > > > > Acked-by: Madalin Bucur 
> > > > 
> > > > Wasn't this a stable candidate too?
> > > 
> > > Yes, it is. I forgot to add the cc:stable tag, sorry about that.
> > 
> > Then this is a bug fix that should go directly to linus/stable.
> > 
> > I note that
> > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fstable%2Flinux.git%2Flog%2Fdrivers%2Fnet%2Fethernet%2Ffreescale%2Fdpaa%3Fh%3Dlinux-4.19.ydata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Cb88ecc951de649e5a55808d6cefdd286%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C636923986895133037sdata=ueUWI1%2BmNBHtlCoY9%2B1FreOUM8bHGiTYWhISy5nRoJk%3Dreserved=0
> 
> Not sure I understand ... I don't see the patch in the link.

Sorry, I copied the wrong link:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y=0aafea5d4b22fe9403e89d82e02597e4493d5d0f

> 
> > is in 4.19 but not in 4.14 , is it not appropriate for 4.14?
> 
> I think it makes sense to go in both stable trees.
> 
> ---
> Best Regards, Laurentiu
> 
> > > > > ---
> > > > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > index daede7272768..40420edc9ce6 100644
> > > > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > > > @@ -1663,7 +1663,7 @@ static struct sk_buff
> > *dpaa_cleanup_tx_fd(const
> > > > struct dpaa_priv *priv,
> > > > >  qm_sg_entry_get_len([0]),
> > dma_dir);
> > > > > /* remaining pages were mapped with
> > skb_frag_dma_map()
> > > > */
> > > > > -   for (i = 1; i < nr_frags; i++) {
> > > > > +   for (i = 1; i <= nr_frags; i++) {
> > > > > WARN_ON(qm_sg_entry_is_ext([i]));
> > > > > 
> > > > > dma_unmap_page(dev, qm_sg_addr([i]),
> > > > > --
> > > > > 2.17.1
> > > > >

Re: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-05-02 Thread Joakim Tjernlund

On Thu, 2019-05-02 at 09:05 +, Laurentiu Tudor wrote:
> Hi Joakim,
> 
> > -Original Message-
> > From: Joakim Tjernlund 
> > Sent: Saturday, April 27, 2019 8:11 PM
> > 
> > On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> > > From: Laurentiu Tudor 
> > > 
> > > Fix issue with the entry indexing in the sg frame cleanup code being
> > > off-by-1. This problem showed up when doing some basic iperf tests and
> > > manifested in traffic coming to a halt.
> > > 
> > > Signed-off-by: Laurentiu Tudor 
> > > Acked-by: Madalin Bucur 
> > 
> > Wasn't this a stable candidate too?
> 
> Yes, it is. I forgot to add the cc:stable tag, sorry about that.

Then this is a bug fix that should go directly to linus/stable.

I note that 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/ethernet/freescale/dpaa?h=linux-4.19.y
is in 4.19 but not in 4.14 , is it not appropriate for 4.14?

 Jocke

> 
> ---
> Best Regards, Laurentiu
> 
> > > ---
> > >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > index daede7272768..40420edc9ce6 100644
> > > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > > @@ -1663,7 +1663,7 @@ static struct sk_buff *dpaa_cleanup_tx_fd(const
> > struct dpaa_priv *priv,
> > >  qm_sg_entry_get_len([0]), dma_dir);
> > > 
> > > /* remaining pages were mapped with skb_frag_dma_map()
> > */
> > > -   for (i = 1; i < nr_frags; i++) {
> > > +   for (i = 1; i <= nr_frags; i++) {
> > > WARN_ON(qm_sg_entry_is_ext([i]));
> > > 
> > > dma_unmap_page(dev, qm_sg_addr([i]),
> > > --
> > > 2.17.1
> > >

Re: [PATCH v2 4/6] dt-bindings: soc/fsl: qe: document new fsl,qe-snums binding

2019-05-01 Thread Joakim Tjernlund

On Wed, 2019-05-01 at 09:29 +, Rasmus Villemoes wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Reading table 4-30, and its footnotes, of the QUICC Engine Block
> Reference Manual shows that the set of snum _values_ is not
> necessarily just a function of the _number_ of snums, as given in the
> fsl,qe-num-snums property.
> 
> As an alternative, to make it easier to add support for other variants
> of the QUICC engine IP, this introduces a new binding fsl,qe-snums,
> which automatically encodes both the number of snums and the actual
> values to use.
> 
> For example, for the MPC8309, one would specify the property as
> 
>fsl,qe-snums = /bits/ 8 <
>0x88 0x89 0x98 0x99 0xa8 0xa9 0xb8 0xb9
>0xc8 0xc9 0xd8 0xd9 0xe8 0xe9>;

I think you need add this example to the qe.txt doc itselft. BTW, what is 
/bits/ ?
> 
> Signed-off-by: Rasmus Villemoes 
> ---
>  Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt 
> b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> index d7afaff5faff..05f5f485562a 100644
> --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/qe.txt
> @@ -18,7 +18,8 @@ Required properties:
>  - reg : offset and length of the device registers.
>  - bus-frequency : the clock frequency for QUICC Engine.
>  - fsl,qe-num-riscs: define how many RISC engines the QE has.
> -- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for 
> the
> +- fsl,qe-snums: This property has to be specified as '/bits/ 8' value,
> +  defining the array of serial number (SNUM) values for the virtual
>threads.
> 
>  Optional properties:
> @@ -34,6 +35,11 @@ Recommended properties
>  - brg-frequency : the internal clock source frequency for baud-rate
>generators in Hz.
> 
> +Deprecated properties
> +- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use
> +  for the threads. Use fsl,qe-snums instead to not only specify the
> +  number of snums, but also their values.
> +
>  Example:
>   qe@e010 {
> #address-cells = <1>;
> --
> 2.20.1
>

Re: [PATCH v2 9/9] dpaa_eth: fix SG frame cleanup

2019-04-27 Thread Joakim Tjernlund

On Sat, 2019-04-27 at 10:10 +0300, laurentiu.tu...@nxp.com wrote:
> From: Laurentiu Tudor 
> 
> Fix issue with the entry indexing in the sg frame cleanup code being
> off-by-1. This problem showed up when doing some basic iperf tests and
> manifested in traffic coming to a halt.
> 
> Signed-off-by: Laurentiu Tudor 
> Acked-by: Madalin Bucur 

Wasn't this a stable candidate too?

> ---
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
> b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> index daede7272768..40420edc9ce6 100644
> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> @@ -1663,7 +1663,7 @@ static struct sk_buff *dpaa_cleanup_tx_fd(const struct 
> dpaa_priv *priv,
>  qm_sg_entry_get_len([0]), dma_dir);
> 
> /* remaining pages were mapped with skb_frag_dma_map() */
> -   for (i = 1; i < nr_frags; i++) {
> +   for (i = 1; i <= nr_frags; i++) {
> WARN_ON(qm_sg_entry_is_ext([i]));
> 
> dma_unmap_page(dev, qm_sg_addr([i]),
> --
> 2.17.1
>

Re: [PATCH stable v4.14 13/32] powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E

2019-04-02 Thread Joakim Tjernlund

On Wed, 2019-04-03 at 11:53 +1100, Michael Ellerman wrote:
> 
> Joakim Tjernlund  writes:
> > On Tue, 2019-04-02 at 17:19 +1100, Michael Ellerman wrote:
> > > Joakim Tjernlund  writes:
> ...
> > > > Can I compile it away?
> > > 
> > > You can't actually, but you can disable it at runtime with
> > > "nospectre_v1" on the kernel command line.
> > > 
> > > We could make it a user selectable compile time option if you really
> > > want it to be.
> > 
> > I think yes. Considering that these patches are fairly untested and the 
> > impact
> > in the wild unknown. Requiring systems to change their boot config over 
> > night is
> > too fast.
> 
> OK. Just to be clear, you're actually using 4.14 on an NXP board and
> would actually use this option? I don't want to add another option just
> for a theoretical use case.

Correct, we use 4.14 on several custom boards using NXP CPUs and would 
appreciate
if I could control spectre with a build switch.

Thanks a lot!
   Jocke

Re: [PATCH stable v4.14 13/32] powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E

2019-04-02 Thread Joakim Tjernlund

On Tue, 2019-04-02 at 17:19 +1100, Michael Ellerman wrote:
> 
> Joakim Tjernlund  writes:
> > On Fri, 2019-03-29 at 22:26 +1100, Michael Ellerman wrote:
> > > From: Diana Craciun 
> > > 
> > > commit ebcd1bfc33c7a90df941df68a6e5d4018c022fba upstream.
> > > 
> > > Implement the barrier_nospec as a isync;sync instruction sequence.
> > > The implementation uses the infrastructure built for BOOK3S 64.
> > > 
> > > Signed-off-by: Diana Craciun 
> > > [mpe: Split out of larger patch]
> > > Signed-off-by: Michael Ellerman 
> > 
> > What is the performanc impact of these spectre fixes?
> 
> I've not seen any numbers from anyone.

Thanks for getting back to me.

> 
> It will depend on the workload, it's copy to/from user that is most
> likely to show an impact.
> 
> We have a context switch benchmark in
> tools/testing/selftests/powerpc/benchmarks/context_switch.c.
> 
> Running that with "--no-vector --no-altivec --no-fp --test=pipe" shows
> about a 2.3% slow down vs booting with "nospectre_v1".
> 
> > Can I compile it away?
> 
> You can't actually, but you can disable it at runtime with
> "nospectre_v1" on the kernel command line.
> 
> We could make it a user selectable compile time option if you really
> want it to be.

I think yes. Considering that these patches are fairly untested and the impact
in the wild unknown. Requiring systems to change their boot config over night is
too fast.

 Jocke

Re: [PATCH stable v4.14 13/32] powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E

2019-03-29 Thread Joakim Tjernlund

On Fri, 2019-03-29 at 22:26 +1100, Michael Ellerman wrote:
> 
> From: Diana Craciun 
> 
> commit ebcd1bfc33c7a90df941df68a6e5d4018c022fba upstream.
> 
> Implement the barrier_nospec as a isync;sync instruction sequence.
> The implementation uses the infrastructure built for BOOK3S 64.
> 
> Signed-off-by: Diana Craciun 
> [mpe: Split out of larger patch]
> Signed-off-by: Michael Ellerman 

What is the performanc impact of these spectre fixes?
Can I compile it away?

  Jocke

Re: [PATCH 13/13] dpaa_eth: fix SG frame cleanup

2019-03-29 Thread Joakim Tjernlund


Should this one go stable 4.14/4.19 too?

On Fri, 2019-03-29 at 16:00 +0200, laurentiu.tu...@nxp.com wrote:
> 
> From: Laurentiu Tudor 
> 
> Fix issue with the entry indexing in the sg frame cleanup code being
> off-by-1. This problem showed up when doing some basic iperf tests and
> manifested in traffic coming to a halt.
> 
> Signed-off-by: Laurentiu Tudor 
> Acked-by: Madalin Bucur 
> ---
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
> b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> index daede7272768..40420edc9ce6 100644
> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> @@ -1663,7 +1663,7 @@ static struct sk_buff *dpaa_cleanup_tx_fd(const struct 
> dpaa_priv *priv,
>  qm_sg_entry_get_len([0]), dma_dir);
> 
> /* remaining pages were mapped with skb_frag_dma_map() */
> -   for (i = 1; i < nr_frags; i++) {
> +   for (i = 1; i <= nr_frags; i++) {
> WARN_ON(qm_sg_entry_is_ext([i]));
> 
> dma_unmap_page(dev, qm_sg_addr([i]),
> --
> 2.17.1
>

Re: ethernet "bus" number in DTS ?

2018-10-26 Thread Joakim Tjernlund

On Wed, 2018-10-24 at 08:22 +0200, Michal Suchánek wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> On Tue, 23 Oct 2018 11:20:36 -0700
> Florian Fainelli  wrote:
> 
> > On 10/23/18 11:02 AM, Joakim Tjernlund wrote:
> > > On Tue, 2018-10-23 at 10:03 -0700, Florian Fainelli wrote:
> > > I also noted that using status = "disabled" didn't work either to
> > > create a fix name scheme. Even worse, all the eth I/F after gets
> > > renumbered. It seems to me there is value in having stability in
> > > eth I/F naming at boot. Then userspace(udev) can rename if need be.
> > > 
> > > Sure would like to known more about why this feature is not wanted ?
> > > 
> > > I found
> > >   https://patchwork.kernel.org/patch/4122441/
> > > You quote policy as reason but surely it must be better to
> > > have something stable, connected to the hardware name, than
> > > semirandom naming?
> > 
> > If the Device Tree nodes are ordered by ascending base register
> > address, my understanding is that you get the same order as far as
> > platform_device creation goes, this may not be true in the future if
> > Rob decides to randomize that, but AFAICT this is still true. This
> > may not work well with status = disabled properties being inserted
> > here and there, but we have used that here and it has worked for as
> > far as I can remember doing it.
> 
> So this is unstable in several respects. First is changing the
> enabled/disabled status in the deivecetrees provided by the kernel.
> 
> Second is if you have hardware hotplug mechanism either by firmware or
> by loading device overlays.
> 
> Third is the case when the devicetree is not built as part of the
> kernel but is instead provided by firmware that initializes the
> low-level hardware details. Then the ordering by address is not
> guaranteed nor is that the same address will be used to access the same
> interface every time. There might be multiple ways to configure the
> hardware depending on firmware configuration and/or version.
> 
> 
> > Second, you might want to name network devices ethX, but what if I
> > want to name them ethernetX or fooX or barX? Should we be accepting a
> > mechanism in the kernel that would allow someone to name the
> > interfaces the way they want straight from a name being provided in
> > Device Tree?

Just to be clear, I am saying that we don't need to control the full
name of the Ethernet device, just the numerical id so one can tie eth0
to a fixed physical device.

> 
> Clearly if there is text Ethernet1 printed above the Ethernet port we
> should provide a mechanism to name the port Ethernet1 by default.
> 
> > Aliases are fine for providing relative stability within the Device
> > Tree itself and boot programs that might need to modify the Device
> > Tree (e.g: inserting MAC addresses) such that you don't have to
> > encode logic to search for nodes by compatible strings etc. but
> > outside of that use case, it seems to me that you can resolve every
> > naming decision in user-space.
> 
> However, this is pushing platform-specific knowledge to userspace. The
> way the Ethernet interface is attached and hence the device properties
> usable for identifying the device uniquely are platform-specific.
> 
> On the other hand, aliases are universal when provided. If they are
> good enough to assign a MAC address they are good enough to provide a
> stable default name.
> 
> I think this is indeed forcing the userspace to reinvent several wheels
> for no good reason.
> 
> What is the problem with adding the aliases?

Well put above, thanks.

   Jocke

Re: ethernet "bus" number in DTS ?

2018-10-23 Thread Joakim Tjernlund

On Tue, 2018-10-23 at 13:07 -0700, Florian Fainelli wrote:
> 
> On 10/23/18 1:02 PM, Joakim Tjernlund wrote:
> > On Tue, 2018-10-23 at 11:20 -0700, Florian Fainelli wrote:
> > > On 10/23/18 11:02 AM, Joakim Tjernlund wrote:
> > > > On Tue, 2018-10-23 at 10:03 -0700, Florian Fainelli wrote:
> > > > > 
> > > > > On 10/23/18 9:49 AM, Joakim Tjernlund wrote:
> > > > > > SPI (and others) has a way to define bus number in a aliases:
> > > > > >   aliases {
> > > > > >   ethernet4 = 
> > > > > >   ethernet0 = 
> > > > > >   ethernet1 = 
> > > > > >   ethernet2 = 
> > > > > >   ethernet3 = 
> > > > > >   spi0 = 
> > > > > >   };
> > > > > > The 0 in the spi0 alias will translate to bus num 0 so one can 
> > > > > > control the /dev nodes, like /dev/spidev0
> > > > > > I am looking for the same for ethernet devices:
> > > > > >  ethernet4 =   /* should become eth4 */
> > > > > >  ethernet0 =   /* should become eth0 */
> > > > > > but I cannot find something like that for eth devices.
> > > > > > 
> > > > > > Could such functionality be added?
> > > > > 
> > > > > It could, do we want and need to, no. You have the Ethernet alias in
> > > > > /sys/class/net/*/device/uevent already that would allow you to perform
> > > > > that (re)naming in user-space:
> > > > > 
> > > > > # cat /sys/class/net/eth0/device/uevent
> > > > > DRIVER=bcmgenet
> > > > > OF_NAME=ethernet
> > > > > OF_FULLNAME=/rdb/ethernet@f048
> > > > > OF_TYPE=network
> > > > > OF_COMPATIBLE_0=brcm,genet-v5
> > > > > OF_COMPATIBLE_N=1
> > > > > OF_ALIAS_0=eth0 <==
> > > > > MODALIAS=of:NethernetTnetworkCbrcm,genet-v5
> > > > 
> > > > Yes, one can if one uses udev and can find something to identify the hw 
> > > > I/F with, my
> > > > cat /sys/class/net/eth0/device/uevent looks like:
> > > > DRIVER=fsl_dpa
> > > > MODALIAS=platform:dpaa-ethernet
> > > 
> > > Does not dpaa have a notion of Ethernet ports and those should have
> > > proper information? Maybe that is part of your problem here, it should
> > > have the OF_ALIAS information somehow available.
> > 
> > I cannot say ATM, but this lack of standard does not make it easier to 
> > rename I/F's in udev.
> > 
> > > > not sure mdev supports this, does it?
> > > > Our simple installer FS(initramfs) doesn't have either udev or mdev.
> > > 
> > > I don't know, but you could have a simple shell script that looks at
> > > specific network device properties to decide on the naming and call
> > > ifrename.
> > 
> > This reinventing of the wheel is what I am trying to avoid.
> 
> Embedded is all about being a special snowflake and re-inventing the
> wheel, but having some desktop-like distribution user-space would
> certainly allow you to re-invent other parts of the wheel.
> 
> > > > I also noted that using status = "disabled" didn't work either to 
> > > > create a fix name scheme.
> > > > Even worse, all the eth I/F after gets renumbered. It seems to me there
> > > > is value in having stability in eth I/F naming at boot.
> > > > Then userspace(udev) can rename if need be.
> > > > 
> > > > Sure would like to known more about why this feature is not wanted ?
> > > > 
> > > > I found
> > > >   https://patchwork.kernel.org/patch/4122441/
> > > > You quote policy as reason but surely it must be better to
> > > > have something stable, connected to the hardware name, than semirandom 
> > > > naming?
> > > 
> > > If the Device Tree nodes are ordered by ascending base register address,
> > > my understanding is that you get the same order as far as
> > > platform_device creation goes, this may not be true in the future if Rob
> > > decides to randomize that, but AFAICT this is still true. This may not
> > > work well with status = disabled properties being inserted here and
> > > there, but we have used that here and it has worked for as far as I can
> > > remember doing it.
>

Re: ethernet "bus" number in DTS ?

2018-10-23 Thread Joakim Tjernlund

On Tue, 2018-10-23 at 11:20 -0700, Florian Fainelli wrote:
> 
> On 10/23/18 11:02 AM, Joakim Tjernlund wrote:
> > On Tue, 2018-10-23 at 10:03 -0700, Florian Fainelli wrote:
> > > 
> > > 
> > > On 10/23/18 9:49 AM, Joakim Tjernlund wrote:
> > > > SPI (and others) has a way to define bus number in a aliases:
> > > >   aliases {
> > > >   ethernet4 = 
> > > >   ethernet0 = 
> > > >   ethernet1 = 
> > > >   ethernet2 = 
> > > >   ethernet3 = 
> > > >   spi0 = 
> > > >   };
> > > > The 0 in the spi0 alias will translate to bus num 0 so one can control 
> > > > the /dev nodes, like /dev/spidev0
> > > > I am looking for the same for ethernet devices:
> > > >  ethernet4 =   /* should become eth4 */
> > > >  ethernet0 =   /* should become eth0 */
> > > > but I cannot find something like that for eth devices.
> > > > 
> > > > Could such functionality be added?
> > > 
> > > It could, do we want and need to, no. You have the Ethernet alias in
> > > /sys/class/net/*/device/uevent already that would allow you to perform
> > > that (re)naming in user-space:
> > > 
> > > # cat /sys/class/net/eth0/device/uevent
> > > DRIVER=bcmgenet
> > > OF_NAME=ethernet
> > > OF_FULLNAME=/rdb/ethernet@f048
> > > OF_TYPE=network
> > > OF_COMPATIBLE_0=brcm,genet-v5
> > > OF_COMPATIBLE_N=1
> > > OF_ALIAS_0=eth0 <==
> > > MODALIAS=of:NethernetTnetworkCbrcm,genet-v5
> > 
> > Yes, one can if one uses udev and can find something to identify the hw I/F 
> > with, my
> > cat /sys/class/net/eth0/device/uevent looks like:
> > DRIVER=fsl_dpa
> > MODALIAS=platform:dpaa-ethernet
> 
> Does not dpaa have a notion of Ethernet ports and those should have
> proper information? Maybe that is part of your problem here, it should
> have the OF_ALIAS information somehow available.

I cannot say ATM, but this lack of standard does not make it easier to rename 
I/F's in udev.

> 
> > not sure mdev supports this, does it?
> > Our simple installer FS(initramfs) doesn't have either udev or mdev.
> 
> I don't know, but you could have a simple shell script that looks at
> specific network device properties to decide on the naming and call
> ifrename.

This reinventing of the wheel is what I am trying to avoid.

> 
> > I also noted that using status = "disabled" didn't work either to create a 
> > fix name scheme.
> > Even worse, all the eth I/F after gets renumbered. It seems to me there
> > is value in having stability in eth I/F naming at boot.
> > Then userspace(udev) can rename if need be.
> > 
> > Sure would like to known more about why this feature is not wanted ?
> > 
> > I found
> >   https://patchwork.kernel.org/patch/4122441/
> > You quote policy as reason but surely it must be better to
> > have something stable, connected to the hardware name, than semirandom 
> > naming?
> 
> If the Device Tree nodes are ordered by ascending base register address,
> my understanding is that you get the same order as far as
> platform_device creation goes, this may not be true in the future if Rob
> decides to randomize that, but AFAICT this is still true. This may not
> work well with status = disabled properties being inserted here and
> there, but we have used that here and it has worked for as far as I can
> remember doing it.

I recall it is the order in which the eth alias appear that controls the naming,
not 100% sure though.

> 
> Second, you might want to name network devices ethX, but what if I want
> to name them ethernetX or fooX or barX? Should we be accepting a
> mechanism in the kernel that would allow someone to name the interfaces
> the way they want straight from a name being provided in Device Tree?

I just want to have stable boot names, aka ethX, which can defined in
the platforms DT. Then userspace can go from there to whatever it needs,
udev could possibly use these stable boot names to identify the I/F's to rename.

ATM, it is pretty hard to even use udev when /sys/class/net/eth0/device/uevent
can look different even for OF created interfaces.

> 
> Aliases are fine for providing relative stability within the Device Tree
> itself and boot programs that might need to modify the Device Tree (e.g:
> inserting MAC addresses) such that you don't have to encode logic to
> search for nodes by compatible strings etc. but outside of that use
> case, it seems to me that you can resolve every naming decision in
> user-space.

Well, you can resolve MAC address assignment in user space too but most
will agree that is not convenient. I suggest it is also handy to have
some control of I/F enumeration(ethX that is) from platform code like DT.

 Jocke

Re: ethernet "bus" number in DTS ?

2018-10-23 Thread Joakim Tjernlund

On Tue, 2018-10-23 at 10:03 -0700, Florian Fainelli wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> On 10/23/18 9:49 AM, Joakim Tjernlund wrote:
> > SPI (and others) has a way to define bus number in a aliases:
> >   aliases {
> >   ethernet4 = 
> >   ethernet0 = 
> >   ethernet1 = 
> >   ethernet2 = 
> >   ethernet3 = 
> >   spi0 = 
> >   };
> > The 0 in the spi0 alias will translate to bus num 0 so one can control the 
> > /dev nodes, like /dev/spidev0
> > I am looking for the same for ethernet devices:
> >  ethernet4 =   /* should become eth4 */
> >  ethernet0 =   /* should become eth0 */
> > but I cannot find something like that for eth devices.
> > 
> > Could such functionality be added?
> 
> It could, do we want and need to, no. You have the Ethernet alias in
> /sys/class/net/*/device/uevent already that would allow you to perform
> that (re)naming in user-space:
> 
> # cat /sys/class/net/eth0/device/uevent
> DRIVER=bcmgenet
> OF_NAME=ethernet
> OF_FULLNAME=/rdb/ethernet@f048
> OF_TYPE=network
> OF_COMPATIBLE_0=brcm,genet-v5
> OF_COMPATIBLE_N=1
> OF_ALIAS_0=eth0 <==
> MODALIAS=of:NethernetTnetworkCbrcm,genet-v5

Yes, one can if one uses udev and can find something to identify the hw I/F 
with, my
cat /sys/class/net/eth0/device/uevent looks like:
DRIVER=fsl_dpa
MODALIAS=platform:dpaa-ethernet

not sure mdev supports this, does it?
Our simple installer FS(initramfs) doesn't have either udev or mdev.

I also noted that using status = "disabled" didn't work either to create a fix 
name scheme.
Even worse, all the eth I/F after gets renumbered. It seems to me there
is value in having stability in eth I/F naming at boot.
Then userspace(udev) can rename if need be. 

Sure would like to known more about why this feature is not wanted ?

I found
  https://patchwork.kernel.org/patch/4122441/
You quote policy as reason but surely it must be better to
have something stable, connected to the hardware name, than semirandom naming?

 Jocke

ethernet "bus" number in DTS ?

2018-10-23 Thread Joakim Tjernlund

SPI (and others) has a way to define bus number in a aliases:
aliases {
ethernet4 = 
ethernet0 = 
ethernet1 = 
ethernet2 = 
ethernet3 = 
spi0 = 
};
The 0 in the spi0 alias will translate to bus num 0 so one can control the /dev 
nodes, like /dev/spidev0
I am looking for the same for ethernet devices:
 ethernet4 =   /* should become eth4 */
 ethernet0 =   /* should become eth0 */
but I cannot find something like that for eth devices.

Could such functionality be added?

 Jocke

Re: [PATCH] powerpc/io: remove old GCC version implementation

2018-10-16 Thread Joakim Tjernlund

On Tue, 2018-10-16 at 12:33 +, Christophe Leroy wrote:
> 
> 
> GCC 4.6 is the minimum supported now.

Ouch, from kernel 4.19 or earlier even ?

 Jocke

Re: [PATCH] powerpc/time: Calculate proper wday

2018-09-18 Thread Joakim Tjernlund

On Tue, 2018-09-18 at 10:08 +0200, Mathieu Malaterre wrote:
> 
> 
> On Wed, Aug 29, 2018 at 10:03 AM Joakim Tjernlund 
>  wrote:
> >
> > to_tm() hardcodes wday to -1 as "No-one uses the day of the week".
> > But recently rtc driver ds1307 does care and tries to correct wday.
> >
> > Add wday calculation(stolen from rtc_time64_to_tm) to to_tm() to please 
> > ds1307.
> 
> Is this still an issue after:
> 
> 34efabe41895 powerpc: remove unused to_tm() helper

No, it is not an issue anymore. You can drop this patch.

 Jocke

> 
> ?
> 
> > Signed-off-by: Joakim Tjernlund 
> > ---
> >  arch/powerpc/kernel/time.c | 8 +++-
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
> > index fe6f3a285455..f4a09ee01944 100644
> > --- a/arch/powerpc/kernel/time.c
> > +++ b/arch/powerpc/kernel/time.c
> > @@ -1160,6 +1160,9 @@ void to_tm(int tim, struct rtc_time * tm)
> > day = tim / SECDAY;
> > hms = tim % SECDAY;
> >
> > +   /* day of the week, 1970-01-01 was a Thursday */
> > +   tm->tm_wday = (day + 4) % 7;
> > +
> > /* Hours, minutes, seconds are easy */
> > tm->tm_hour = hms / 3600;
> > tm->tm_min = (hms % 3600) / 60;
> > @@ -1180,11 +1183,6 @@ void to_tm(int tim, struct rtc_time * tm)
> >
> > /* Days are what is left over (+1) from all that. */
> > tm->tm_mday = day + 1;
> > -
> > -   /*
> > -* No-one uses the day of the week.
> > -*/
> > -   tm->tm_wday = -1;
> >  }
> >  EXPORT_SYMBOL(to_tm);
> >
> > --
> > 2.16.4

[PATCH] powerpc/time: Calculate proper wday

2018-08-29 Thread Joakim Tjernlund

to_tm() hardcodes wday to -1 as "No-one uses the day of the week".
But recently rtc driver ds1307 does care and tries to correct wday.

Add wday calculation(stolen from rtc_time64_to_tm) to to_tm() to please ds1307.

Signed-off-by: Joakim Tjernlund 
---
 arch/powerpc/kernel/time.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index fe6f3a285455..f4a09ee01944 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -1160,6 +1160,9 @@ void to_tm(int tim, struct rtc_time * tm)
day = tim / SECDAY;
hms = tim % SECDAY;
 
+   /* day of the week, 1970-01-01 was a Thursday */
+   tm->tm_wday = (day + 4) % 7;
+
/* Hours, minutes, seconds are easy */
tm->tm_hour = hms / 3600;
tm->tm_min = (hms % 3600) / 60;
@@ -1180,11 +1183,6 @@ void to_tm(int tim, struct rtc_time * tm)
 
/* Days are what is left over (+1) from all that. */
tm->tm_mday = day + 1;
-
-   /*
-* No-one uses the day of the week.
-*/
-   tm->tm_wday = -1;
 }
 EXPORT_SYMBOL(to_tm);
 
-- 
2.16.4

Re: [PATCH 4/6] net/wan/fsl_ucc_hdlc: default hmask value

2018-08-29 Thread Joakim Tjernlund

On Wed, 2018-08-29 at 09:18 +0200, Christophe LEROY wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Le 29/08/2018 à 04:54, Qiang Zhao a écrit :
> > From: David Gounaris 
> > Date: 2018/8/28 19:09
> > > Subject: [PATCH 4/6] net/wan/fsl_ucc_hdlc: default hmask value
> > > 
> > > Set default HMASK to 0x to use
> > > promiscuous mode in the hdlc controller.
> 
> Why do you need to do that ?
> 
> An HDLC frame encapsulating Ethernet should look like:
> 
> HDLC Frame includes:
> – Opening Flag  7E hex
> – Address field FF hex
> – Control field 03 hex
> – Information field Original Ethernet Packet, 1522 octets max.
> – FCS CRC16 (2 bytes)
> – Closing Flag  7E hex

In our case the Eth frame starts directly after the Opening Flag so
there is no FF address field.

> 
> What do you mean by 'promiscuous mode' ?

Just that we don't want any address filtering of packets. I don't think this 
would
be a problem for pure HDLC frames either, do you?

> 
> In any case, should a filter mask be changed for promiscuous mode, I
> believe the change should be done at the time you enter promiscuous
> mode, not at all time.
> 
> Christophe
> 
> > > 
> > > Signed-off-by: David Gounaris 
> > > ---
> > >   drivers/net/wan/fsl_ucc_hdlc.h | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/wan/fsl_ucc_hdlc.h 
> > > b/drivers/net/wan/fsl_ucc_hdlc.h
> > > index c21134c1f180..5bc3d1a6ca6e 100644
> > > --- a/drivers/net/wan/fsl_ucc_hdlc.h
> > > +++ b/drivers/net/wan/fsl_ucc_hdlc.h
> > > @@ -134,7 +134,7 @@ struct ucc_hdlc_private {
> > > 
> > >   #define HDLC_HEAD_MASK 0x
> > >   #define DEFAULT_HDLC_HEAD  0xff44
> > > -#define DEFAULT_ADDR_MASK   0x00ff
> > > +#define DEFAULT_ADDR_MASK   0x
> > >   #define DEFAULT_HDLC_ADDR  0x00ff
> > > 
> > >   #define BMR_GBL0x2000
> > > --
> > 
> > It is not proper to set default HMASK to 0x, how about to add a new 
> > property standing for hmask into device tree,
> > If get this property from dtb, then set it with the value from dtb, 
> > otherwise, set it with default HMASK ox00ff?
> > 
> > Best Regards
> > Qiang Zhao
> >

11 minute NTP hw clock update racy?

2018-08-27 Thread Joakim Tjernlund

We see corrupt HW clock time every now and then(really hard to reproduce)
Our RTC is a DS1388 on an I2C bus.

Looking at ntp_notify_cmos_timer() and it's delayed work queue impl. I wonder
if there could be a race here w.r.t reboot ?

Could the 11 minute update kick in just as the system is about to reset
the CPU?

I am on 4.14.51, ppc32 and using the ppc_md.restart() hook which will
reset the CPU immediately.

Question, is safe to call ntp_notify_cmos_timer() when the work queue is already
armed(like do_adjtimex() does) ? 

 Jocke

Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple

2018-08-02 Thread Joakim Tjernlund

Leo, did this go anywhere ?

Jocke

On Tue, 2018-07-03 at 16:35 +, Leo Li wrote:
> 
> > -Original Message-
> > From: York Sun
> > Sent: Tuesday, July 3, 2018 10:27 AM
> > To: jo...@infinera.com ; Leo Li
> > 
> > Cc: linuxppc-dev@lists.ozlabs.org; m...@ellerman.id.au; Qiang Zhao
> > 
> > Subject: Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple
> > 
> > +Leo
> > 
> > On 07/03/2018 03:30 AM, Joakim Tjernlund wrote:
> > > On Tue, 2018-06-26 at 23:41 +1000, Michael Ellerman wrote:
> > > > 
> > > > Joakim Tjernlund  writes:
> > > > > On Thu, 2018-06-21 at 02:38 +, Qiang Zhao wrote:
> > > > > > On 06/19/2018 09:22 AM, Joakim Tjernlund wrote:
> > > > > > -Original Message-
> > > > > > From: Linuxppc-dev [mailto:linuxppc-dev-
> > 
> > bounces+qiang.zhao=nxp@lists.ozlabs.org] On Behalf Of Joakim
> > Tjernlund
> > > > > > Sent: 2018年6月20日 0:22
> > > > > > To: York Sun ; linuxppc-dev  > 
> > d...@lists.ozlabs.org>
> > > > > > Subject: [PATCH] QE GPIO: Add qe_gpio_set_multiple
> > > > > > 
> > > > > > This cousin to gpio-mpc8xxx was lacking a multiple pins method, add
> > 
> > one.
> > > > > > 
> > > > > > Signed-off-by: Joakim Tjernlund 
> > > > > > ---
> > > > > >  drivers/soc/fsl/qe/gpio.c | 28 
> > > > > >  1 file changed, 28 insertions(+)
> > > > 
> > > > ...
> > > > > >  static int qe_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio) 
> > > > > >  {
> > > > > > struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc); @@ -
> > 
> > 298,6 +325,7 @@ static int __init qe_add_gpiochips(void)
> > > > > > gc->direction_output = qe_gpio_dir_out;
> > > > > > gc->get = qe_gpio_get;
> > > > > > gc->set = qe_gpio_set;
> > > > > > +   gc->set_multiple = qe_gpio_set_multiple;
> > > > > > 
> > > > > > ret = of_mm_gpiochip_add_data(np, mm_gc, qe_gc);
> > > > > > if (ret)
> > > > > > 
> > > > > > Reviewed-by: Qiang Zhao 
> > > > > > 
> > > > > 
> > > > > Who picks up this patch ? Is it queued somewhere already?
> > > > 
> > > > Not me.
> > > 
> > > York? You seem to be the only one left.
> > > 
> > 
> > I am not a Linux maintainer. Even I want to, I can't merge this patch.
> > 
> > Leo, who can merge this patch and request a pull?
> 
> Since it falls under the driver/soc/fsl/ folder.  I can take it.
> 
> Regards,
> Leo

Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple

2018-07-03 Thread Joakim Tjernlund

On Tue, 2018-06-26 at 23:41 +1000, Michael Ellerman wrote:
> 
> Joakim Tjernlund  writes:
> > On Thu, 2018-06-21 at 02:38 +, Qiang Zhao wrote:
> > > On 06/19/2018 09:22 AM, Joakim Tjernlund wrote:
> > > -Original Message-
> > > From: Linuxppc-dev 
> > > [mailto:linuxppc-dev-bounces+qiang.zhao=nxp....@lists.ozlabs.org] On 
> > > Behalf Of Joakim Tjernlund
> > > Sent: 2018年6月20日 0:22
> > > To: York Sun ; linuxppc-dev 
> > > 
> > > Subject: [PATCH] QE GPIO: Add qe_gpio_set_multiple
> > > 
> > > This cousin to gpio-mpc8xxx was lacking a multiple pins method, add one.
> > > 
> > > Signed-off-by: Joakim Tjernlund 
> > > ---
> > >  drivers/soc/fsl/qe/gpio.c | 28 
> > >  1 file changed, 28 insertions(+)
> 
> ...
> > >  static int qe_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)  {
> > > struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc); @@ -298,6 
> > > +325,7 @@ static int __init qe_add_gpiochips(void)
> > > gc->direction_output = qe_gpio_dir_out;
> > > gc->get = qe_gpio_get;
> > > gc->set = qe_gpio_set;
> > > +   gc->set_multiple = qe_gpio_set_multiple;
> > > 
> > > ret = of_mm_gpiochip_add_data(np, mm_gc, qe_gc);
> > > if (ret)
> > > 
> > > Reviewed-by: Qiang Zhao 
> > > 
> > 
> > Who picks up this patch ? Is it queued somewhere already?
> 
> Not me.

York? You seem to be the only one left.

 Jocke

Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple

2018-06-25 Thread Joakim Tjernlund

On Thu, 2018-06-21 at 02:38 +, Qiang Zhao wrote:
> 
> On 06/19/2018 09:22 AM, Joakim Tjernlund wrote:
> -Original Message-
> From: Linuxppc-dev 
> [mailto:linuxppc-dev-bounces+qiang.zhao=nxp@lists.ozlabs.org] On Behalf 
> Of Joakim Tjernlund
> Sent: 2018年6月20日 0:22
> To: York Sun ; linuxppc-dev 
> Subject: [PATCH] QE GPIO: Add qe_gpio_set_multiple
> 
> This cousin to gpio-mpc8xxx was lacking a multiple pins method, add one.
> 
> Signed-off-by: Joakim Tjernlund 
> ---
>  drivers/soc/fsl/qe/gpio.c | 28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/soc/fsl/qe/gpio.c b/drivers/soc/fsl/qe/gpio.c index 
> 3b27075c21a7..819bed0f5667 100644
> --- a/drivers/soc/fsl/qe/gpio.c
> +++ b/drivers/soc/fsl/qe/gpio.c
> @@ -83,6 +83,33 @@ static void qe_gpio_set(struct gpio_chip *gc, unsigned int 
> gpio, int val)
> spin_unlock_irqrestore(_gc->lock, flags);  }
> 
> +static void qe_gpio_set_multiple(struct gpio_chip *gc,
> +unsigned long *mask, unsigned long *bits) {
> +   struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
> +   struct qe_gpio_chip *qe_gc = gpiochip_get_data(gc);
> +   struct qe_pio_regs __iomem *regs = mm_gc->regs;
> +   unsigned long flags;
> +   int i;
> +
> +   spin_lock_irqsave(_gc->lock, flags);
> +
> +   for (i = 0; i < gc->ngpio; i++) {
> +   if (*mask == 0)
> +   break;
> +   if (__test_and_clear_bit(i, mask)) {
> +   if (test_bit(i, bits))
> +   qe_gc->cpdata |= (1U << (QE_PIO_PINS - 1 - 
> i));
> +   else
> +   qe_gc->cpdata &= ~(1U << (QE_PIO_PINS - 1 - 
> i));
> +   }
> +   }
> +
> +   out_be32(>cpdata, qe_gc->cpdata);
> +
> +   spin_unlock_irqrestore(_gc->lock, flags); }
> +
>  static int qe_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)  {
> struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc); @@ -298,6 
> +325,7 @@ static int __init qe_add_gpiochips(void)
> gc->direction_output = qe_gpio_dir_out;
> gc->get = qe_gpio_get;
> gc->set = qe_gpio_set;
> +   gc->set_multiple = qe_gpio_set_multiple;
> 
> ret = of_mm_gpiochip_add_data(np, mm_gc, qe_gc);
> if (ret)
> 
> Reviewed-by: Qiang Zhao 
> 

Who picks up this patch ? Is it queued somewhere already?

  Jocke

[PATCH] spi/fsl-espi: Add missing cell-index OF property

2018-06-19 Thread Joakim Tjernlund

espi does not look for a OF cell-index property which
makes the bus numbering dynamic only. This add an
optional cell-index.

Signed-off-by: Joakim Tjernlund 
---
 drivers/spi/spi-fsl-espi.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
index 1d332e23f6ed..56b71c5e2f10 100644
--- a/drivers/spi/spi-fsl-espi.c
+++ b/drivers/spi/spi-fsl-espi.c
@@ -672,6 +672,14 @@ static int fsl_espi_probe(struct device *dev, struct 
resource *mem,
 
dev_set_drvdata(dev, master);
 
+   if (dev->of_node) {
+   u32 cell_index;
+
+   if (!of_property_read_u32(dev->of_node, "cell-index",
+ _index))
+   master->bus_num = cell_index;
+   }
+
master->mode_bits = SPI_RX_DUAL | SPI_CPOL | SPI_CPHA | SPI_CS_HIGH |
SPI_LSB_FIRST | SPI_LOOP;
master->dev.of_node = dev->of_node;
-- 
2.13.6

[PATCH] QE GPIO: Add qe_gpio_set_multiple

2018-06-19 Thread Joakim Tjernlund

This cousin to gpio-mpc8xxx was lacking a multiple pins method,
add one.

Signed-off-by: Joakim Tjernlund 
---
 drivers/soc/fsl/qe/gpio.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/soc/fsl/qe/gpio.c b/drivers/soc/fsl/qe/gpio.c
index 3b27075c21a7..819bed0f5667 100644
--- a/drivers/soc/fsl/qe/gpio.c
+++ b/drivers/soc/fsl/qe/gpio.c
@@ -83,6 +83,33 @@ static void qe_gpio_set(struct gpio_chip *gc, unsigned int 
gpio, int val)
spin_unlock_irqrestore(_gc->lock, flags);
 }
 
+static void qe_gpio_set_multiple(struct gpio_chip *gc,
+unsigned long *mask, unsigned long *bits)
+{
+   struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+   struct qe_gpio_chip *qe_gc = gpiochip_get_data(gc);
+   struct qe_pio_regs __iomem *regs = mm_gc->regs;
+   unsigned long flags;
+   int i;
+
+   spin_lock_irqsave(_gc->lock, flags);
+
+   for (i = 0; i < gc->ngpio; i++) {
+   if (*mask == 0)
+   break;
+   if (__test_and_clear_bit(i, mask)) {
+   if (test_bit(i, bits))
+   qe_gc->cpdata |= (1U << (QE_PIO_PINS - 1 - i));
+   else
+   qe_gc->cpdata &= ~(1U << (QE_PIO_PINS - 1 - i));
+   }
+   }
+
+   out_be32(>cpdata, qe_gc->cpdata);
+
+   spin_unlock_irqrestore(_gc->lock, flags);
+}
+
 static int qe_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
 {
struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
@@ -298,6 +325,7 @@ static int __init qe_add_gpiochips(void)
gc->direction_output = qe_gpio_dir_out;
gc->get = qe_gpio_get;
gc->set = qe_gpio_set;
+   gc->set_multiple = qe_gpio_set_multiple;
 
ret = of_mm_gpiochip_add_data(np, mm_gc, qe_gc);
if (ret)
-- 
2.13.6

Re: [PATCH 12/17] powerpc/8xx: Remove PTE_ATOMIC_UPDATES

2018-05-04 Thread Joakim Tjernlund

On Fri, 2018-05-04 at 14:34 +0200, Christophe Leroy wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss")
> introduced non atomic PTE updates and started the work of removing
> PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the
> 8xx with the following comment:
> /* Until my rework is finished, 8xx still needs atomic PTE updates */
> 
> commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux
> mm expects") removed all PTE updates done in TLB miss handlers

Is that my 7 year old commit ?

> 
> Therefore, atomic PTE updates are not needed anymore for the 8xx

About time removing atomic updates then :)

> 
> Signed-off-by: Christophe Leroy 
>

Re: [PATCH 0/5] DPAA Ethernet fixes

2018-03-14 Thread Joakim Tjernlund

On Wed, 2018-03-14 at 08:37 -0500, Madalin Bucur wrote:
> Hi,
> 
> This patch set is addressing several issues in the DPAA Ethernet
> driver suite:
> 
>  - module unload crash caused by wrong reference to device being left
>in the cleanup code after the DSA related changes
>  - scheduling wile atomic bug in QMan code revealed during dpaa_eth
>module unload
>  - a couple of error counter fixes, a duplicated init in dpaa_eth.

hmm, some of these(all?) bugs are in stable too, CC: stable perhaps? 

> 
> Madalin
> 
> Camelia Groza (3):
>   dpaa_eth: remove duplicate initialization
>   dpaa_eth: increment the RX dropped counter when needed
>   dpaa_eth: remove duplicate increment of the tx_errors counter
> 
> Madalin Bucur (2):
>   soc/fsl/qbman: fix issue in qman_delete_cgr_safe()
>   dpaa_eth: fix error in dpaa_remove()
> 
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c |  8 
>  drivers/soc/fsl/qbman/qman.c   | 28 
> +-
>  2 files changed, 9 insertions(+), 27 deletions(-)
> 
> --
> 2.1.0
>

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-19 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> > > commit 4d8ee1935bcd666360311dfdadeee235d682d69a
> > > Author: Florian Fainelli 
> > > Date: Tue Aug 22 15:24:47 2017 -0700
> > > fsl/man: Inherit parent device and of_node
> > > 
> > > and was later addressed by this patch set:
> > > 
> > > http://patchwork.ozlabs.org/project/netdev/list/?series=8462=*
> > > 
> > > Even with these errors printed, all is working fine, it's just the
> > > second probing that fails. Adding the latter patches or reverting
> > > the one above makes the errors prints dissapear.
> > 
> > Looking at the above patch seriers I see it is in state Accepted and has 
> > been there
> > since 2017-10-16
> > That seems like a awful long to wait in before getting into Linux, is there 
> > something
> > holding these patches back ?
> 
> They are in Linux, have been since October 16th. But at the moment,
> they are only in v4.15, not v4.14.

Now I see them in 4.15, must have looked in the wrong branch.

> 
> These patches probably don't fit the stable rules, for getting them
> added to v4.14.
> 
> https://github.com/torvalds/linux/blob/master/Documentation/process/stable-kernel-rules.rst

Stuff needs to work, whatever needed to make that happen is allowed. Even 
backporting
some new infra structure if need be to simplify fixing bugs.

> 
> What is needed is a minimal fix. Or just wait until Sunday, when there
> is a good chance v4.15 will be released.

You seem to think everyone always upgrade to linux latest but this is not so.
We do product development here and appreciate the stable kernels so we can work 
in
peace and not chasing the latest kernel.

 Jocke

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-19 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote:
> 
> > -Original Message-
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Tuesday, January 16, 2018 7:58 PM
> > To: and...@lunn.ch
> > Subject: Re: DPAA Ethernet traffice troubles with Linux kernel
> > 
> > On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> > > 
> > > Hi Joakim
> > > 
> > > You appear to be using an old kernel. Take a look at:
> > 
> > Not really, I am using 4.14.x and I don't think that is old. Seems like
> > this
> > patch hasn't been sent to 4.14.x.
> > 
> > I wonder if I might be missing something else, we just moved to 4.14 and
> > notic that all
> > our fixed PHYs are non functioning:
> > fsl_mac ffe4e2000.ethernet: FMan MEMAC
> > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> > fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.0 failed with error -16
> > fsl_mac ffe4e4000.ethernet: FMan MEMAC
> > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> > fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.1 failed with error -16
> > fsl_mac ffe4e6000.ethernet: FMan MEMAC
> > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> > fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.2 failed with error -16
> > fsl_mac ffe4e8000.ethernet: FMan MEMAC
> > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> > fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.3 failed with error -16
> > 
> > Feels like FMAN still think there are real PHYs there ?
> 
> Hi Joakim,
> 
> These errors are issued when trying to probe the second time the same
> MAC node. The issue was introduced by this commit:
> 
> commit 4d8ee1935bcd666360311dfdadeee235d682d69a
> Author: Florian Fainelli <f.faine...@gmail.com>
> Date: Tue Aug 22 15:24:47 2017 -0700
> fsl/man: Inherit parent device and of_node
> 
> and was later addressed by this patch set:
> 
> http://patchwork.ozlabs.org/project/netdev/list/?series=8462=*
> 
> Even with these errors printed, all is working fine, it's just the
> second probing that fails. Adding the latter patches or reverting
> the one above makes the errors prints dissapear.

Looking at the above patch seriers I see it is in state Accepted and has been 
there
since 2017-10-16
That seems like a awful long to wait in before getting into Linux, is there 
something
holding these patches back ?

 Jocke

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-18 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Joakim Tjernlund wrote:
> On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote:
> > CAUTION: This email originated from outside of the organization. Do not 
> > click links or open attachments unless you recognize the sender and know 
> > the content is safe.
> > 
> > 
> > > -----Original Message-
> > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > Sent: Tuesday, January 16, 2018 7:58 PM
> > > To: and...@lunn.ch
> > > Subject: Re: DPAA Ethernet traffice troubles with Linux kernel
> > > 
> > > On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> > > > 
> > > > Hi Joakim
> > > > 
> > > > You appear to be using an old kernel. Take a look at:
> > > 
> > > Not really, I am using 4.14.x and I don't think that is old. Seems like
> > > this
> > > patch hasn't been sent to 4.14.x.
> > > 
> > > I wonder if I might be missing something else, we just moved to 4.14 and
> > > notic that all
> > > our fixed PHYs are non functioning:
> > > fsl_mac ffe4e2000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> > > fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.0 failed with error -16
> > > fsl_mac ffe4e4000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> > > fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.1 failed with error -16
> > > fsl_mac ffe4e6000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> > > fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.2 failed with error -16
> > > fsl_mac ffe4e8000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> > > fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.3 failed with error -16
> > > 
> > > Feels like FMAN still think there are real PHYs there ?
> > 
> > Hi Joakim,
> > 
> > These errors are issued when trying to probe the second time the same
> > MAC node. The issue was introduced by this commit:
> > 
> > commit 4d8ee1935bcd666360311dfdadeee235d682d69a
> > Author: Florian Fainelli <f.faine...@gmail.com>
> > Date: Tue Aug 22 15:24:47 2017 -0700
> > fsl/man: Inherit parent device and of_node
> > 
> > and was later addressed by this patch set:
> > 
> > http://patchwork.ozlabs.org/project/netdev/list/?series=8462=*
> > 
> > Even with these errors printed, all is working fine, it's just the
> > second probing that fails. Adding the latter patches or reverting
> > the one above makes the errors prints dissapear.
> > 
> > Madalin
> 
> Ahh, now it starts to look better, reverting "fsl/man: Inherit parent device 
> and of_node" on 4.14 gives:
> libphy: Fixed MDIO Bus: probed
> tun: Universal TUN/TAP device driver, 1.6
> libphy: Freescale XGMAC MDIO Bus: probed
> iommu: Adding device ffe488000.port to group 10
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e1000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe489000.port to group 22
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e3000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48a000.port to group 23
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48b000.port to group 24
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48c000.port to group 25
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e9000: Error while reading PHY0 reg at 3.3
> fsl_mac ffe4e2000.ethernet: FMan MEMAC
> fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> fsl_mac ffe4e4000.ethernet: FMan MEMAC
> fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> fsl_mac ffe4e6000.ethernet: FMan MEMAC
> fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> fsl_mac ffe4e8000.ethernet: FMan MEMAC
> fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> fsl_mac ffe4e.ethernet: FMan MEMAC
> fsl_mac ffe4e.ethernet: FMan MAC address: 00:06:9c:0b:06:1f
> fsl_dpa dpaa-ethernet.0 eth0: Probed interface eth0
> fsl_dpa dpaa-ethernet.1 eth1: Probed i

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-17 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> > -Original Message-----
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Tuesday, January 16, 2018 7:58 PM
> > To: and...@lunn.ch
> > Subject: Re: DPAA Ethernet traffice troubles with Linux kernel
> > 
> > On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> > > 
> > > Hi Joakim
> > > 
> > > You appear to be using an old kernel. Take a look at:
> > 
> > Not really, I am using 4.14.x and I don't think that is old. Seems like
> > this
> > patch hasn't been sent to 4.14.x.
> > 
> > I wonder if I might be missing something else, we just moved to 4.14 and
> > notic that all
> > our fixed PHYs are non functioning:
> > fsl_mac ffe4e2000.ethernet: FMan MEMAC
> > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> > fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.0 failed with error -16
> > fsl_mac ffe4e4000.ethernet: FMan MEMAC
> > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> > fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.1 failed with error -16
> > fsl_mac ffe4e6000.ethernet: FMan MEMAC
> > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> > fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.2 failed with error -16
> > fsl_mac ffe4e8000.ethernet: FMan MEMAC
> > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> > fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed
> > fsl_mac: probe of dpaa-ethernet.3 failed with error -16
> > 
> > Feels like FMAN still think there are real PHYs there ?
> 
> Hi Joakim,
> 
> These errors are issued when trying to probe the second time the same
> MAC node. The issue was introduced by this commit:
> 
> commit 4d8ee1935bcd666360311dfdadeee235d682d69a
> Author: Florian Fainelli <f.faine...@gmail.com>
> Date: Tue Aug 22 15:24:47 2017 -0700
> fsl/man: Inherit parent device and of_node
> 
> and was later addressed by this patch set:
> 
> http://patchwork.ozlabs.org/project/netdev/list/?series=8462=*
> 
> Even with these errors printed, all is working fine, it's just the
> second probing that fails. Adding the latter patches or reverting
> the one above makes the errors prints dissapear.
> 
> Madalin

Ahh, now it starts to look better, reverting "fsl/man: Inherit parent device 
and of_node" on 4.14 gives:
libphy: Fixed MDIO Bus: probed
tun: Universal TUN/TAP device driver, 1.6
libphy: Freescale XGMAC MDIO Bus: probed
iommu: Adding device ffe488000.port to group 10
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e1000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe489000.port to group 22
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e3000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48a000.port to group 23
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48b000.port to group 24
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48c000.port to group 25
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e9000: Error while reading PHY0 reg at 3.3
fsl_mac ffe4e2000.ethernet: FMan MEMAC
fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
fsl_mac ffe4e4000.ethernet: FMan MEMAC
fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
fsl_mac ffe4e6000.ethernet: FMan MEMAC
fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
fsl_mac ffe4e8000.ethernet: FMan MEMAC
fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
fsl_mac ffe4e.ethernet: FMan MEMAC
fsl_mac ffe4e.ethernet: FMan MAC address: 00:06:9c:0b:06:1f
fsl_dpa dpaa-ethernet.0 eth0: Probed interface eth0
fsl_dpa dpaa-ethernet.1 eth1: Probed interface eth1
fsl_dpa dpaa-ethernet.2 eth2: Probed interface eth2
fsl_dpa dpaa-ethernet.3 eth3: Probed interface eth3
fsl_dpa dpaa-ethernet.4 eth4: Probed interface eth4

Still some minor errors: mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3
but this is going the right way(I have not had a chance to try if they work due
to external modules not ported/ready yet)

The other patch series is still to be tested though but I already now wanted 
to stress the importance of getting all upstream fixes into stable, ASAP.
You now what they are, I have no idea.

Thanks 
   Jocke

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-17 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> > > You appear to be using an old kernel. Take a look at:
> > 
> > Not really, I am using 4.14.x and I don't think that is old.
> 
> Development for 4.14 stopped somewhere around the beginning of
> September. So there has been over 4 months of work since then.  We are
> clearly interested in fixing bugs in that kernel, since it is the
> current stable version. But when reporting bugs, please let is know if
> the latest version of the network kernel,
> it://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git has
> the issue.
> 
> > Seems like this patch hasn't been sent to 4.14.x.
> 
> If it fixes a bug for you, please request the fix is added to stable.

That doesn't work really, having users to hit the bug, debug it, fix it and then
find it fixed already in upstream, then specifically request it to be 
backported to stable. 
I don't need this fix to be backported, already got it. Someone else might 
though.

I would be interested in bug fixes upstream which fixes:

libphy: Freescale XGMAC MDIO Bus: probed
iommu: Adding device ffe488000.port to group 10
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e1000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe489000.port to group 22
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e3000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48a000.port to group 23
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48b000.port to group 24
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3
iommu: Adding device ffe48c000.port to group 25
libphy: Freescale XGMAC MDIO Bus: probed
mdio_bus ffe4e9000: Error while reading PHY0 reg at 3.3
fsl_mac ffe4e2000.ethernet: FMan MEMAC
fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.0 failed with error -16
fsl_mac ffe4e4000.ethernet: FMan MEMAC
fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.1 failed with error -16
fsl_mac ffe4e6000.ethernet: FMan MEMAC
fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed

Every FMAN eth I/F with a fixed link fails mysteriously.
Custom board based on t1040rdb, been over the device tree and I cannot find any 
significant
changes.

 Jocke

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-16 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> > Hi, just saw this and thought of a small patch I just wrote for mdio bus, o 
> > idea
> > if it is relevant but here goes:
> > 
> > From fe0b98d54a79779482700676331b4d10a0f3cada Mon Sep 17 00:00:00 2001
> > From: Joakim Tjernlund <joakim.tjernl...@infinera.com>
> > Date: Sun, 14 Jan 2018 21:27:20 +0100
> > Subject: [PATCH] of_mdiobus_register: Continue after error
> > 
> > of_mdiobus_register unregister itself if one phy fails to register
> > which is bad for system having all its PHYs on the same MDIO bus.
> > Just log the error and continue with the remaining PHYs instead.
> > 
> > Signed-off-by: Joakim Tjernlund <joakim.tjernl...@infinera.com>
> 
> Hi Joakim
> 
> You appear to be using an old kernel. Take a look at:

Not really, I am using 4.14.x and I don't think that is old. Seems like this
patch hasn't been sent to 4.14.x.

I wonder if I might be missing something else, we just moved to 4.14 and notic 
that all
our fixed PHYs are non functioning:
fsl_mac ffe4e2000.ethernet: FMan MEMAC
fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.0 failed with error -16
fsl_mac ffe4e4000.ethernet: FMan MEMAC
fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.1 failed with error -16
fsl_mac ffe4e6000.ethernet: FMan MEMAC
fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.2 failed with error -16
fsl_mac ffe4e8000.ethernet: FMan MEMAC
fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed
fsl_mac: probe of dpaa-ethernet.3 failed with error -16

Feels like FMAN still think there are real PHYs there ?
> 
> commit 95f566de0269a0c59fd6a737a147731302136429
> Author: Madalin Bucur <madalin.bu...@nxp.com>
> Date:   Tue Jan 9 14:43:34 2018 +0200
> 
> of_mdio: avoid MDIO bus removal when a PHY is missing
> 
> If one of the child devices is missing the of_mdiobus_register_phy()
> call will return -ENODEV. When a missing device is encountered the
> registration of the remaining PHYs is stopped and the MDIO bus will
> fail to register. Propagate all errors except ENODEV to avoid it.
> 
> Signed-off-by: Madalin Bucur <madalin.bu...@nxp.com>
> Reviewed-by: Andrew Lunn <and...@lunn.ch>
> Signed-off-by: David S. Miller <da...@davemloft.net>
> 
> 
> Andrew

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-01-15 Thread Joakim Tjernlund

On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> > -Original Message-
> > From: Linuxppc-dev [mailto:linuxppc-dev-
> > bounces+madalin.bucur=nxp@lists.ozlabs.org] On Behalf Of mad skateman
> > Sent: Wednesday, January 10, 2018 10:39 PM
> > To: linuxppc-dev@lists.ozlabs.org
> > Subject: DPAA Ethernet traffice troubles with Linux kernel
> > 
> > Hi linux devs,
> > 
> > Like mentioned in this thread
> > https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-January/167630.html
> > i also experience the exact same issues.
> > I am also trying to find out why the network traffic is not flowing
> > the way it should (out for example ).
> > 
> > My linux knowledge is very basic but i hope i can contribute anyway.
> > 
> > I am using the AmigaOne X5000 with a P5020
> > Most detailed technical information regarding this issue can be found
> > in the Thread by Jamie Krueger mentioned above.
> > 
> > In this screenshot, the ETH0 and ETH1 seem up and running (probed) ..
> > even due to the FSL_DPAA_MAC error messages that DMESG shows.
> > http://www.skateman.nl/wp-content/uploads/2018/01/Screenshot-at-2018-01-08-21_22_06_ETH_NIC_ERROR.png
> > 
> > http://www.skateman.nl/wp-content/uploads/2018/01/Screenshot-at-2018-01-08-22_16_28.png
> > 
> > I was able to use some tooling like ETHTOOL to adjust some settings
> > and check if the interface responded. This all seems fine.
> > 
> > Hope that someone can find a fix, so the Ethernet adapter can be used.
> > 
> > Thanks!!
> 
> Hi,
> 
> Please use text logs instead of pictures next time, it's easier to read.
> The errors you see are related to missing MAC addresses for the unused
> interfaces, you can ignore these are they are not relevant for the issue
> you encounter. Normally the unused interfaces should have status disabled
> in the device tree but there is not a big deal if they fail like that.
> As I've advised Jamie on the other thread, please try to connect the device
> back 2 back to a known good machine and determine what is broken - Rx/Tx?
> Is there another software version that does work on these machines?

Hi, just saw this and thought of a small patch I just wrote for mdio bus, o idea
if it is relevant but here goes:

From fe0b98d54a79779482700676331b4d10a0f3cada Mon Sep 17 00:00:00 2001
From: Joakim Tjernlund <joakim.tjernl...@infinera.com>
Date: Sun, 14 Jan 2018 21:27:20 +0100
Subject: [PATCH] of_mdiobus_register: Continue after error

of_mdiobus_register unregister itself if one phy fails to register
which is bad for system having all its PHYs on the same MDIO bus.
Just log the error and continue with the remaining PHYs instead.

Signed-off-by: Joakim Tjernlund <joakim.tjernl...@infinera.com>
---
 drivers/of/of_mdio.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index 98258583abb0..76ff28a41dad 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -229,7 +229,8 @@ int of_mdiobus_register(struct mii_bus *mdio, struct 
device_node *np)
else
rc = of_mdiobus_register_device(mdio, child, addr);
if (rc)
-   goto unregister;
+   pr_warn(FW_WARN
+   "%pOF: Failed to register MDIO device.\n", 
child);
}
 
if (!scanphys)
@@ -253,7 +254,8 @@ int of_mdiobus_register(struct mii_bus *mdio, struct 
device_node *np)
if (of_mdiobus_child_is_phy(child)) {
rc = of_mdiobus_register_phy(mdio, child, addr);
if (rc)
-   goto unregister;
+   pr_warn(FW_WARN
+   "%pOF: Failed to register MDIO 
PHY.\n", child);
}
}
}
-- 
2.13.6

Re: Time to get rid of CPU6 ERRATA on powerpc/8xx ?

2018-01-07 Thread Joakim Tjernlund

On Sun, 2018-01-07 at 17:23 +0100, christophe leroy wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Today, Linux kernel includes a workaround for CPU6 ERRATA on the 8xx
> powerpc.
> 
> This ERRATA exists on the 801, the 823, the 855/860 before revision C.0
> It doesn't concern any modern versions of the 8xx, neither the 860 past
> and including rev C.0, nor the 866 nor the 885
> 
> This workaround complicates the TLBmiss and TLBerror handlers and make
> the code more and more unreadable.
> 
> Since this workaround addresses very old versions of the 8xx, I'd like
> to get rid of it. Do you see any good reason to keep it today ? If not I
> will come with a cleanup patch in the coming weeks.
> 
> Another alternative would be to keep that workaround separated from the
> rest of the code (ie using different registers to avoid/limit code
> nesting), it would add a few cycles but would increase readability while
> keeping the ERRATA in, allthought my preference goes to a complete
> removal of the workaround.
> 
> So, thanks to let me know your opinion on that.

Having done work on TLB Miss/Error, I agree the CPU6 errata is a pain.
On all our 8xx (86x/850) we never had to use this errata so I am in favour for
removing CPU6 Errata from recent kernels.

  Jocke

Re: ppc32: semctl fails

2017-12-06 Thread Joakim Tjernlund

On Wed, 2017-12-06 at 23:56 +0100, Andreas Schwab wrote:
> 
> On Dez 06 2017, Joakim Tjernlund <joakim.tjernl...@infinera.com> wrote:
> 
> > st = semctl(sem, 0, IPC_STAT, );
> 
> This is not a valid use of IPC_STAT.  The fourth argument must be a
> object of type union semun (not a pointer), with the .buf member
> pointing to the struct semid_ds object.

Ahh, so perl had it wrong (Any perl guys here?). This is what I came up with 
and it works:

struct semid_ds arg;
  union semun {
int  val;/* Value for SETVAL */
struct semid_ds *buf;/* Buffer for IPC_STAT, IPC_SET */
unsigned short  *array;  /* Array for GETALL, SETALL */
struct seminfo  *__buf;  /* Buffer for IPC_INFO 
   
(Linux-specific) */
  } semopts = {0};
  semopts.buf = 

#if defined(IPC_PRIVATE) && defined(S_IRWXU) && defined(S_IRWXG) &&  
defined(S_IRWXO) && defined(IPC_CREAT)
  sem = semget(IPC_PRIVATE, 1, S_IRWXU|S_IRWXG|S_IRWXO|IPC_CREAT);
  if (sem > -1) {
#ifdef IPC_STAT
st = semctl(sem, 0, IPC_STAT, semopts);
if (st == 0)
  printf("semid_ds\n");
else
#endif /* IPC_STAT */
  printf("semctl IPC_STAT failed: errno = %s\n", strerror(errno));
#ifdef IPC_RMID
if (semctl(sem, 0, IPC_RMID, semopts) != 0)
#endif /* IPC_RMID */
  printf("semctl IPC_RMID failed: errno = %d\n", errno);
  } else
#endif /* IPC_PRIVATE && ... */
printf("semget failed: errno = %d\n", errno);
  return 0;

 Jocke

ppc32: semctl fails

2017-12-06 Thread Joakim Tjernlund

This test, taken from perl Configure, fails on my ppc32, should it?
  semctl IPC_STAT failed: errno = Bad Address 
is what I get, kernel is 4.1.43

-

#include 
#include 
#include 
#include 
#ifndef S_IRUSR
#   ifdef S_IREAD
#define S_IRUSR S_IREAD
#define S_IWUSR S_IWRITE
#define S_IXUSR S_IEXEC
#   else
#define S_IRUSR 0400
#define S_IWUSR 0200
#define S_IXUSR 0100
#   endif
#   define S_IRGRP (S_IRUSR>>3)
#   define S_IWGRP (S_IWUSR>>3)
#   define S_IXGRP (S_IXUSR>>3)
#   define S_IROTH (S_IRUSR>>6)
#   define S_IWOTH (S_IWUSR>>6)
#   define S_IXOTH (S_IXUSR>>6)
#endif
#ifndef S_IRWXU
#   define S_IRWXU (S_IRUSR|S_IWUSR|S_IXUSR)
#   define S_IRWXG (S_IRGRP|S_IWGRP|S_IXGRP)
#   define S_IRWXO (S_IROTH|S_IWOTH|S_IXOTH)
#endif

#include 
#include 
#include 
#ifndef errno
extern int errno;
#endif
int main() {
  int sem, st;
  struct semid_ds arg;

#if defined(IPC_PRIVATE) && defined(S_IRWXU) && defined(S_IRWXG) &&  
defined(S_IRWXO) && defined(IPC_CREAT)
  sem = semget(IPC_PRIVATE, 1, S_IRWXU|S_IRWXG|S_IRWXO|IPC_CREAT);
  if (sem > -1) {
#ifdef IPC_STAT
st = semctl(sem, 0, IPC_STAT, );
if (st == 0)
  printf("semid_ds\n");
else
#endif /* IPC_STAT */
  printf("semctl IPC_STAT failed: errno = %s\n", strerror(errno));
#ifdef IPC_RMID
if (semctl(sem, 0, IPC_RMID, ) != 0)
#endif /* IPC_RMID */
  printf("semctl IPC_RMID failed: errno = %d\n", errno);
  } else
#endif /* IPC_PRIVATE && ... */
printf("semget failed: errno = %d\n", errno);

  return 0;
}

Re: [PATCH] fsl_pci: Correct fsl_pci_mcheck_exception

2017-11-21 Thread Joakim Tjernlund

On Wed, 2017-09-06 at 19:19 +, Leo Li wrote:
> > -Original Message-
> > From: York Sun
> > Sent: Wednesday, September 06, 2017 10:34 AM
> > To: Leo Li <leoyang...@nxp.com>
> > Cc: Joakim Tjernlund <joakim.tjernl...@infinera.com>; linuxppc-dev linuxppc-
> > dev <linuxppc-dev@lists.ozlabs.org>
> > Subject: Re: [PATCH] fsl_pci: Correct fsl_pci_mcheck_exception
> > 
> > On 09/05/2017 04:59 AM, Joakim Tjernlund wrote:
> > > get_user() had it args reversed causing NIP to be NULL:ed instead of
> > > fixing up the PCI access.
> > > 
> > > Note: This still hangs my P1020 Freescale CPU hard, but at least I get
> > > a NIP now.
> > > 
> > > Signed-off-by: Joakim Tjernlund <joakim.tjernl...@infinera.com>
> > > ---
> > >   arch/powerpc/sysdev/fsl_pci.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/powerpc/sysdev/fsl_pci.c
> > > b/arch/powerpc/sysdev/fsl_pci.c index 7c8b779c329a..9e64c12dff6a
> > > 100644
> > > --- a/arch/powerpc/sysdev/fsl_pci.c
> > > +++ b/arch/powerpc/sysdev/fsl_pci.c
> > > @@ -996,7 +996,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
> > >   if (is_in_pci_mem_space(addr)) {
> > >   if (user_mode(regs)) {
> > >   pagefault_disable();
> > > - ret = get_user(regs->nip, );
> > > + ret = get_user(inst, (__u32 __user *)regs->nip);
> > >   pagefault_enable();
> > >   } else {
> > >   ret = probe_kernel_address(regs->nip, inst);
> > > 
> > 
> > Leo,
> > 
> > Can you take a look, or assign it to someone who is familiar with this code?
> 
> Acked-by: Li Yang <leoyang...@nxp.com>
> 
> Regards,
> Leo

I think this is forgotten, cannot se it in Linus tree.

Re: [EXTERNAL]Re: FSL serial 166550 errata broken?

2017-09-28 Thread Joakim Tjernlund

On Thu, 2017-09-28 at 17:54 +0200, Joakim Tjernlund wrote:
> On Wed, 2017-09-27 at 15:32 +, York Sun wrote:
> > On 09/27/2017 04:03 AM, Joakim Tjernlund wrote:
> > > On Mon, 2017-09-25 at 17:26 +, York Sun wrote:
> > > > On 09/25/2017 09:55 AM, Joakim Tjernlund wrote:
> > > > > We got some "broken" boards(mpx8321) where UART RX is held low(BREAK)
> > > > > There we get a few:
> > > > > serial8250: too much work for irq18
> > > > > and the board freezes.
> > > > > 
> > > > > Looking inte to driver/CPU there is an errtum(A-004737) w.r.t BREAK 
> > > > > handling
> > > > > and I can see we are hitting the irq function fsl8250_handle_irq() 
> > > > > added
> > > > > in commit: 9deaa53ac7fa373623123aa4f18828dd62292b1a
> > > > >"serial: add irq handler for Freescale 16550 errata."
> > > > > For all I can tell this workaround is broken and I cannot figure out 
> > > > > why.
> > > > > Any pointers?
> > > > > 
> > > > 
> > > > Jocke,
> > > > 
> > > > Did you mean MPC8321?
> > > > 
> > > > I personally don't have experience debugging this erratum. Have you
> > > > tried to contact the patch author Paul Gortmaker to see how he managed
> > > > to get it work?
> > > 
> > > No, but I just found out it is u-boot related. If I use an very old 
> > > u-boot it works.
> > > Between then and now we did a massive upgrade of u-boot and now if 
> > > breaks. And no,
> > > bisect not possible due to local u-boot mods :)
> > 
> > How old? It is a good sign that an old U-Boot works. Git bisect would be 
> > a good tool if practical. Sometimes I have to reapply some local changes 
> > for every step of bisect to make it useful. You mileage may vary though.
> > 
> > > 
> > > Any idea what could be different? I cannot see and I have tested
> > > what I could see/think of but now I am out of ideas.
> > 
> > I don't believe we have much change for MPC8321 for a long time. If any 
> > change has impact on kernel but not U-Boot itself, it may be other 
> > errata workarounds.
> > 
> > I presume this happens on your own board. If I am in your shoes, I would 
> > try to reduce the number of local commits and reapply them to various 
> > U-Boot tags to further narrow down the search scope. In the mean time, 
> > getting register dump to compare the good and bad may be another way to go.
> 
> God, this was no fun exercise but I think I found the offending commit: 
> 82dda962f0a6449672df3378bb6b5fe54372a927
> serial: Unconditionally enable CONFIG_SERIAL_MULTI
> 
> Enable CONFIG_SERIAL_MULTI for all builds of U-Boot. That includes
> both SPL builds and non-SPL builds, everything. To avoid poluting
> this patch with removal of ifdef-endif constructions containing
> CONFIG_SERIAL_MULTI, the CONFIG_SERIAL_MULTI is temporarily added
> into CPPFLAGS in config.mk . This will be again removed in following
> patch.
> 

Ok, unless I init ttyS1 in u-boot too, IRQ storm will ensue if BREAK is present
when opening ttyS1:
+   /* Must init the second RS232 or IRQ storm happens
+* when BREAK is constant on while opening ttyS1 */
+   NS16550_init((NS16550_t)CONFIG_SYS_NS16550_COM2,
+ns16550_calc_divisor((NS16550_t)CONFIG_SYS_NS16550_COM2,
+ CONFIG_SYS_NS16550_CLK,
+ CONFIG_BAUDRATE));

I guess this is a kernel bug too, the driver should clear/init needed state 
before
starting the device. Fixing that is not on my menu though :)

I also noted that FSL custom irq handler, fsl8250_handle_irq, does not handle
dma like the standard one does. I guess it needs some love too.

 Jocke

Re: [EXTERNAL]Re: FSL serial 166550 errata broken?

2017-09-28 Thread Joakim Tjernlund

On Wed, 2017-09-27 at 15:32 +, York Sun wrote:
> On 09/27/2017 04:03 AM, Joakim Tjernlund wrote:
> > On Mon, 2017-09-25 at 17:26 +, York Sun wrote:
> > > On 09/25/2017 09:55 AM, Joakim Tjernlund wrote:
> > > > We got some "broken" boards(mpx8321) where UART RX is held low(BREAK)
> > > > There we get a few:
> > > > serial8250: too much work for irq18
> > > > and the board freezes.
> > > > 
> > > > Looking inte to driver/CPU there is an errtum(A-004737) w.r.t BREAK 
> > > > handling
> > > > and I can see we are hitting the irq function fsl8250_handle_irq() added
> > > > in commit: 9deaa53ac7fa373623123aa4f18828dd62292b1a
> > > >"serial: add irq handler for Freescale 16550 errata."
> > > > For all I can tell this workaround is broken and I cannot figure out 
> > > > why.
> > > > Any pointers?
> > > > 
> > > 
> > > Jocke,
> > > 
> > > Did you mean MPC8321?
> > > 
> > > I personally don't have experience debugging this erratum. Have you
> > > tried to contact the patch author Paul Gortmaker to see how he managed
> > > to get it work?
> > 
> > No, but I just found out it is u-boot related. If I use an very old u-boot 
> > it works.
> > Between then and now we did a massive upgrade of u-boot and now if breaks. 
> > And no,
> > bisect not possible due to local u-boot mods :)
> 
> How old? It is a good sign that an old U-Boot works. Git bisect would be 
> a good tool if practical. Sometimes I have to reapply some local changes 
> for every step of bisect to make it useful. You mileage may vary though.
> 
> > 
> > Any idea what could be different? I cannot see and I have tested
> > what I could see/think of but now I am out of ideas.
> 
> I don't believe we have much change for MPC8321 for a long time. If any 
> change has impact on kernel but not U-Boot itself, it may be other 
> errata workarounds.
> 
> I presume this happens on your own board. If I am in your shoes, I would 
> try to reduce the number of local commits and reapply them to various 
> U-Boot tags to further narrow down the search scope. In the mean time, 
> getting register dump to compare the good and bad may be another way to go.

God, this was no fun exercise but I think I found the offending commit: 
82dda962f0a6449672df3378bb6b5fe54372a927
serial: Unconditionally enable CONFIG_SERIAL_MULTI

Enable CONFIG_SERIAL_MULTI for all builds of U-Boot. That includes
both SPL builds and non-SPL builds, everything. To avoid poluting
this patch with removal of ifdef-endif constructions containing
CONFIG_SERIAL_MULTI, the CONFIG_SERIAL_MULTI is temporarily added
into CPPFLAGS in config.mk . This will be again removed in following
patch.

Re: FSL serial 166550 errata broken?

2017-09-27 Thread Joakim Tjernlund

On Mon, 2017-09-25 at 17:26 +, York Sun wrote:
> On 09/25/2017 09:55 AM, Joakim Tjernlund wrote:
> > We got some "broken" boards(mpx8321) where UART RX is held low(BREAK)
> > There we get a few:
> >serial8250: too much work for irq18
> > and the board freezes.
> > 
> > Looking inte to driver/CPU there is an errtum(A-004737) w.r.t BREAK handling
> > and I can see we are hitting the irq function fsl8250_handle_irq() added
> > in commit: 9deaa53ac7fa373623123aa4f18828dd62292b1a
> >   "serial: add irq handler for Freescale 16550 errata."
> > For all I can tell this workaround is broken and I cannot figure out why.
> > Any pointers?
> > 
> 
> Jocke,
> 
> Did you mean MPC8321?
> 
> I personally don't have experience debugging this erratum. Have you 
> tried to contact the patch author Paul Gortmaker to see how he managed 
> to get it work?

No, but I just found out it is u-boot related. If I use an very old u-boot it 
works.
Between then and now we did a massive upgrade of u-boot and now if breaks. And 
no,
bisect not possible due to local u-boot mods :)

Any idea what could be different? I cannot see and I have tested
what I could see/think of but now I am out of ideas.

 Jocke

FSL serial 166550 errata broken?

2017-09-25 Thread Joakim Tjernlund

We got some "broken" boards(mpx8321) where UART RX is held low(BREAK)
There we get a few:
  serial8250: too much work for irq18
and the board freezes.

Looking inte to driver/CPU there is an errtum(A-004737) w.r.t BREAK handling
and I can see we are hitting the irq function fsl8250_handle_irq() added
in commit: 9deaa53ac7fa373623123aa4f18828dd62292b1a
 "serial: add irq handler for Freescale 16550 errata."
For all I can tell this workaround is broken and I cannot figure out why.
Any pointers?

 Jocke

irq 26: nobody cared, caused by mpc85xx_pci_isr on P2010 and T1042

2017-09-20 Thread Joakim Tjernlund

Some PCIe errors, don't know which(possibly by PCIe 4 in
   http://pdf1.solecsy.com/61/5af9fd2d-652c-4331-b49c-807c7c47f4f7.pdf)
causes endless IRQ for EDAC's PCIe routine:

[   17.690716] irq 26: nobody cared (try booting with the "irqpoll" option)
[   17.697417] CPU: 0 PID: 0 Comm: swapper Not tainted 4.1.43+ #24
[   17.703334] Call Trace:
[   17.705780] [df9edf10] [c0056990] __report_bad_irq.isra.7+0x34/0xdc 
(unreliable)
[   17.713181] [df9edf30] [c0056d20] note_interrupt+0x274/0x2c0
[   17.718840] [df9edf60] [c00549e0] handle_irq_event_percpu+0x1a0/0x208
[   17.725281] [df9edfa0] [c0054a80] handle_irq_event+0x38/0x5c
[   17.730940] [df9edfb0] [c0057668] handle_fasteoi_irq+0xb0/0x1e0
[   17.736867] [df9edfc0] [c005404c] generic_handle_irq+0x3c/0x5c
[   17.742703] [df9edfd0] [c0004be0] __do_irq+0x48/0x10c
[   17.747752] [df9edff0] [c000c398] call_do_irq+0x24/0x3c
[   17.752977] [c0453e80] [c0004d08] do_IRQ+0x64/0xc4
[   17.757768] [c0453ea0] [c000da3c] ret_from_except+0x0/0x18
[   17.763257] --- interrupt: 501 at arch_cpu_idle+0x4c/0x5c
[   17.763257] LR = cpu_startup_entry+0x17c/0x20c
[   17.773437] [c0453f60] [c004de1c] cpu_startup_entry+0x74/0x20c (unreliable)
[   17.780405] [c0453fb0] [c03fc988] start_kernel+0x34c/0x360
[   17.785889] [c0453ff0] [c380] set_ivor+0x10c/0x148
[   17.791024] handlers:
[   17.793296] [] mpc85xx_pci_isr
[   17.797213] Disabling IRQ #26

Most annoying and makes EDAC useless. Can it be fixed?

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-20 Thread Joakim Tjernlund

On Sat, 2017-09-09 at 14:45 +0200, Joakim Tjernlund wrote:
> On Fri, 2017-09-08 at 22:27 +, Leo Li wrote:
> > > -Original Message-
> > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > Sent: Friday, September 08, 2017 7:51 AM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > > <york@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > > 
> > > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > > On Thu, 2017-09-07 at 18:54 +, Leo Li wrote:
> > > > > > -Original Message-
> > > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>;
> > > > > > York Sun <york@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > 
> > > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > > On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > > > > > > > -Original Message-
> > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > 
> > > > > > > > > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > > > > > > > > -Original Message-
> > > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > 
> > > > > > > > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > > > > > > > -Original Message-
> > > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > > <joakim.tjernl...@infinera.com>;
> > > > > > > > > > > > > linuxppc- d...@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > > <leoyang...@nxp.com>
> > > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > > > 
> > > > > > > > > *regs)
> > > > > > > > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > > >  if (user_mode(regs)) {
> > > > > > > > > > > > > >  pagefault_disable();
> > > > > > > > > > > > > > -   ret = get_user(regs->nip, 
> > > > > > > > > > > > > > );
> > > > > > > >

Re: Machine Check in P2010(e500v2)

2017-09-14 Thread Joakim Tjernlund

On Sat, 2017-09-09 at 14:59 +0200, Joakim Tjernlund wrote:
> On Sat, 2017-09-09 at 14:45 +0200, Joakim Tjernlund wrote:
> > On Fri, 2017-09-08 at 22:27 +, Leo Li wrote:
> > > > -Original Message-
> > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > Sent: Friday, September 08, 2017 7:51 AM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > > > <york@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > 
> > > > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > > > On Thu, 2017-09-07 at 18:54 +, Leo Li wrote:
> > > > > > > -Original Message-
> > > > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>;
> > > > > > > York Sun <york@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > 
> > > > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > > > On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > > > > > > > > -Original Message-
> > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > 
> > > > > > > > > > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > > > > > > > > > -Original Message-
> > > > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > <leoyang...@nxp.com>; York Sun <york....@nxp.com>
> > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > 
> > > > > > > > > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > > > > > > > > -Original Message-
> > > > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > > > <joakim.tjernl...@infinera.com>;
> > > > > > > > > > > > > > linuxppc- d...@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > > > <leoyang...@nxp.com>
> > > > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > > > > 
> > > > > > > > > > *regs)
> > > > > > > > > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > > > &

Re: Machine Check in P2010(e500v2)

2017-09-09 Thread Joakim Tjernlund

On Fri, 2017-09-08 at 22:27 +, Leo Li wrote:
> > -Original Message-
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Friday, September 08, 2017 7:51 AM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > <york@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> > 
> > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > On Thu, 2017-09-07 at 18:54 +, Leo Li wrote:
> > > > > -Original Message-
> > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>;
> > > > > York Sun <york@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > 
> > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > > > > > > -Original Message-
> > > > > > > > From: Joakim Tjernlund
> > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > 
> > > > > > > > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > > > > > > > -Original Message-
> > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > 
> > > > > > > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > > > > > > -Original Message-
> > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > <joakim.tjernl...@infinera.com>;
> > > > > > > > > > > > linuxppc- d...@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > <leoyang...@nxp.com>
> > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > 
> > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > > > 
> > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > > 
> > > > > > > > *regs)
> > > > > > > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > >  if (user_mode(regs)) {
> > > > > > > > > > > > >  pagefault_disable();
> > > > > > > > > > > > > -   ret = get_user(regs->nip, 
> > > > > > > > > > > > > );
> > > > > > > > > > > > > +   ret = get_user(inst,
> > > > > > > > > > > > > + (__u32 __user *)regs->nip);
> > > > > > > > > > > > >  pagefault_enable();
> > > > > > > > > > > > >

Re: Machine Check in P2010(e500v2)

2017-09-08 Thread Joakim Tjernlund

On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> On Thu, 2017-09-07 at 18:54 +, Leo Li wrote:
> > > -Original Message-
> > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > Sent: Thursday, September 07, 2017 3:41 AM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > > <york@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > > 
> > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > > > > -Original Message-
> > > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>;
> > > > > > York Sun <york@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > 
> > > > > > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > > > > > -Original Message-
> > > > > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > 
> > > > > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > > > > -Original Message-
> > > > > > > > > > From: York Sun
> > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>;
> > > > > > > > > > linuxppc- d...@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang...@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > 
> > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > 
> > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct
> > > > > > > > > > > pt_regs
> > > > > > 
> > > > > > *regs)
> > > > > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > >  if (user_mode(regs)) {
> > > > > > > > > > >  pagefault_disable();
> > > > > > > > > > > -   ret = get_user(regs->nip, );
> > > > > > > > > > > +   ret = get_user(inst, (__u32
> > > > > > > > > > > + __user *)regs->nip);
> > > > > > > > > > >  pagefault_enable();
> > > > > > > > > > >  } else {
> > > > > > > > > > >  ret =
> > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > > 
> > > > > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > > > > Now I wonder why this fixup is there in the first place?
> > > > > > > > > > > The routine will not really fixup the insn, just return
> > > > > > > > > > > 0x for the failing read and then advance the 
> > > > > > > > > > > process NIP.
> > > > > > > > > 
> > > > > > > > > You are right.  The code here only gives 0x to the
> > > > > > > > > load instructions and
> > > > > > > > 
> > > > > > > > continue with the next instruction when the load instruction
&g

Re: Machine Check in P2010(e500v2)

2017-09-08 Thread Joakim Tjernlund

On Thu, 2017-09-07 at 18:54 +, Leo Li wrote:
> > -Original Message-
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Thursday, September 07, 2017 3:41 AM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > <york@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> > 
> > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > > > -Original Message-
> > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>;
> > > > > York Sun <york@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > 
> > > > > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > > > > -Original Message-
> > > > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > <leoyang...@nxp.com>; York Sun <york@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > 
> > > > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > > > -Original Message-
> > > > > > > > > From: York Sun
> > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>;
> > > > > > > > > linuxppc- d...@lists.ozlabs.org; Leo Li
> > > > > > > > > <leoyang...@nxp.com>
> > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > 
> > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > 
> > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct
> > > > > > > > > > pt_regs
> > > > > 
> > > > > *regs)
> > > > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > > > >  if (user_mode(regs)) {
> > > > > > > > > >  pagefault_disable();
> > > > > > > > > > -   ret = get_user(regs->nip, );
> > > > > > > > > > +   ret = get_user(inst, (__u32
> > > > > > > > > > + __user *)regs->nip);
> > > > > > > > > >  pagefault_enable();
> > > > > > > > > >  } else {
> > > > > > > > > >  ret =
> > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > 
> > > > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > > > Now I wonder why this fixup is there in the first place?
> > > > > > > > > > The routine will not really fixup the insn, just return
> > > > > > > > > > 0x for the failing read and then advance the 
> > > > > > > > > > process NIP.
> > > > > > > > 
> > > > > > > > You are right.  The code here only gives 0x to the
> > > > > > > > load instructions and
> > > > > > > 
> > > > > > > continue with the next instruction when the load instruction
> > > > > > > is causing the machine check.  This will prevent a system
> > > > > > > lockup when reading from PCI/RapidIO device which is link down.
> > > > > > > > 
> > > > > > > > I don't know what is actual problem in your case.  Maybe it
> > > > > > > > is a

Re: UIO memmap of PCi devices not working?

2017-09-07 Thread Joakim Tjernlund

On Thu, 2017-09-07 at 10:59 +0200, Joakim Tjernlund wrote:
> On Thu, 2017-09-07 at 18:33 +1000, Benjamin Herrenschmidt wrote:
> > On Thu, 2017-09-07 at 07:22 +0000, Joakim Tjernlund wrote:
> > > On Thu, 2017-09-07 at 17:16 +1000, Benjamin Herrenschmidt wrote:
> > > > On Wed, 2017-09-06 at 15:20 +, Joakim Tjernlund wrote:
> > > > > Having problems to mmap PCI UIO devices and stumbeled over this page:
> > > > >  
> > > > > http://billfarrow.blogspot.se/2010/09/userspace-access-to-pci-memory.html
> > > > > it claims some adjustments are needed for UIO mmap over PCI to work.
> > > > > These are #if 0 ATM and trying to enable them fails build.
> > > > > 
> > > > > Can this be fixed to at least build again ?
> > > > > The reason for having #if 0 in the first place appears to be old X 
> > > > > servers,
> > > > > is that still true? Can the special casing be removed now?
> > > > 
> > > > This article seems out of date... I *think* things should work without
> > > > change by just mmap'ing the appropriate sysfs files. I'm not sure why
> > > > the author thought that had to be ifdef'ed out...
> > > 
> > > Isn't that what the article is doing(mmaping sysfs files)?
> > > And the article author is #ifdefing it back, not out.
> > 
> > Yes sorry that's what I meant. It should work as-is.
> > 
> > > > 
> > > > Let me know if you have problems.
> > > 
> > > Sure, we still are looking 
> > > 
> > > > 
> > > > As far as I know, the generic code will call pci_resource_to_user()
> > > > which on powerpc will return a physical address that already includes
> > > > the offset, which is why we don't later add it.
> > > > 
> > > > Now we could probably tear all that out and use the new generic code
> > > > instead as I *think* X has (very) long been fixed but I'd have to spend
> > > > some time triple checking and testing on old HW which I don't have the
> > > > bandwidth for right now. 
> > > 
> > > Could you fixup the code which is now #if 0 ? I wan't to test the
> > > difference and I not sure how to fix the build problem after changing
> > > those two #if 0 to #if 1
> > > Even better if they could be a CONFIG option instead.
> > 
> > Hrm it's tricky, you shouldn't just turn that ifdef back on without
> > also changing pci_resource_to_user().
> 
> There are two ifdef to change:
> __pci_mmap_make_offset():
> #if 0 /* See comment in pci_resource_to_user() for why this is disabled */
>   *offset += hose->pci_mem_offset;
> #endif
> 
> and
> 
> pci_resource_to_user()
>   /* We pass a fully fixed up address to userland for MMIO instead of
>* a BAR value because X is lame and expects to be able to use that
>* to pass to /dev/mem !
>*
>* That means that we'll have potentially 64 bits values where some
>* userland apps only expect 32 (like X itself since it thinks only
>* Sparc has 64 bits MMIO) but if we don't do that, we break it on
>* 32 bits CHRPs :-(
>*
>* Hopefully, the sysfs insterface is immune to that gunk. Once X
>* has been fixed (and the fix spread enough), we can re-enable the
>* 2 lines below and pass down a BAR value to userland. In that case
>* we'll also have to re-enable the matching code in
>* __pci_mmap_make_offset().
>*
>* BenH.
>*/
> #if 0
>   else if (rsrc->flags & IORESOURCE_MEM)
>   offset = hose->pci_mem_offset;
> #endif

This seems to work, just a hack though:
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -314,8 +314,8 @@ static struct resource *__pci_mmap_make_offset(struct 
pci_dev *dev,
 
/* If memory, add on the PCI bridge address offset */
if (mmap_state == pci_mmap_mem) {
-#if 0 /* See comment in pci_resource_to_user() for why this is disabled */
-   *offset += hose->pci_mem_offset;
+#if 1  /* See comment in pci_resource_to_user() for why this is disabled */
+   *offset += hose->mem_offset[0];
 #endif
res_bit = IORESOURCE_MEM;
} else {
@@ -634,9 +634,9 @@ void pci_resource_to_user(const struct pci_dev *dev, int 
bar,
 *
 * BenH.
 */
-#if 0
+#if 1
else if (rsrc->flags & IORESOURCE_MEM)
-   offset = hose->pci_mem_offset;
+   offset = hose->mem_offset[0];
 #endif
 
*start = rsrc->start - offset;

> 
> Problem is that pci_mem_offset is gone, the closed I can find is mem_offset
> but that is an array,maybe just mem_offset[0] ?
> 
> > I'm not sure exactly what's going
> > on in your case, if you have a problem can you add printk to instrument
> > ?
> 
> Seems to be something else going on in out board. Anyhow, the mem_offset 
> should
> be fixed to compile, nice to have it behind a CONFIG option. Then
> one can start the process to remove the special casing easier.

After sorting the bugs in our app, it works with and without above patch.

 Jocke

Re: UIO memmap of PCi devices not working?

2017-09-07 Thread Joakim Tjernlund

On Thu, 2017-09-07 at 18:33 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2017-09-07 at 07:22 +0000, Joakim Tjernlund wrote:
> > On Thu, 2017-09-07 at 17:16 +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2017-09-06 at 15:20 +, Joakim Tjernlund wrote:
> > > > Having problems to mmap PCI UIO devices and stumbeled over this page:
> > > >  
> > > > http://billfarrow.blogspot.se/2010/09/userspace-access-to-pci-memory.html
> > > > it claims some adjustments are needed for UIO mmap over PCI to work.
> > > > These are #if 0 ATM and trying to enable them fails build.
> > > > 
> > > > Can this be fixed to at least build again ?
> > > > The reason for having #if 0 in the first place appears to be old X 
> > > > servers,
> > > > is that still true? Can the special casing be removed now?
> > > 
> > > This article seems out of date... I *think* things should work without
> > > change by just mmap'ing the appropriate sysfs files. I'm not sure why
> > > the author thought that had to be ifdef'ed out...
> > 
> > Isn't that what the article is doing(mmaping sysfs files)?
> > And the article author is #ifdefing it back, not out.
> 
> Yes sorry that's what I meant. It should work as-is.
> 
> > > 
> > > Let me know if you have problems.
> > 
> > Sure, we still are looking 
> > 
> > > 
> > > As far as I know, the generic code will call pci_resource_to_user()
> > > which on powerpc will return a physical address that already includes
> > > the offset, which is why we don't later add it.
> > > 
> > > Now we could probably tear all that out and use the new generic code
> > > instead as I *think* X has (very) long been fixed but I'd have to spend
> > > some time triple checking and testing on old HW which I don't have the
> > > bandwidth for right now. 
> > 
> > Could you fixup the code which is now #if 0 ? I wan't to test the
> > difference and I not sure how to fix the build problem after changing
> > those two #if 0 to #if 1
> > Even better if they could be a CONFIG option instead.
> 
> Hrm it's tricky, you shouldn't just turn that ifdef back on without
> also changing pci_resource_to_user().

There are two ifdef to change:
__pci_mmap_make_offset():
#if 0 /* See comment in pci_resource_to_user() for why this is disabled */
*offset += hose->pci_mem_offset;
#endif

and

pci_resource_to_user()
/* We pass a fully fixed up address to userland for MMIO instead of
 * a BAR value because X is lame and expects to be able to use that
 * to pass to /dev/mem !
 *
 * That means that we'll have potentially 64 bits values where some
 * userland apps only expect 32 (like X itself since it thinks only
 * Sparc has 64 bits MMIO) but if we don't do that, we break it on
 * 32 bits CHRPs :-(
 *
 * Hopefully, the sysfs insterface is immune to that gunk. Once X
 * has been fixed (and the fix spread enough), we can re-enable the
 * 2 lines below and pass down a BAR value to userland. In that case
 * we'll also have to re-enable the matching code in
 * __pci_mmap_make_offset().
 *
 * BenH.
 */
#if 0
else if (rsrc->flags & IORESOURCE_MEM)
offset = hose->pci_mem_offset;
#endif

Problem is that pci_mem_offset is gone, the closed I can find is mem_offset
but that is an array,maybe just mem_offset[0] ?

> I'm not sure exactly what's going
> on in your case, if you have a problem can you add printk to instrument
> ?
Seems to be something else going on in out board. Anyhow, the mem_offset should
be fixed to compile, nice to have it behind a CONFIG option. Then
one can start the process to remove the special casing easier.

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-07 Thread Joakim Tjernlund

On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > > -Original Message-
> > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > > <york@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > > 
> > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > -Original Message-
> > > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York
> > > > > Sun <york@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > 
> > > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > > -Original Message-
> > > > > > > From: York Sun
> > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>; linuxppc-
> > > > > > > d...@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > 
> > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > 
> > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > So after some debugging I found this bug:
> > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs
> > > 
> > > *regs)
> > > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > > >  if (user_mode(regs)) {
> > > > > > > >  pagefault_disable();
> > > > > > > > -   ret = get_user(regs->nip, );
> > > > > > > > +   ret = get_user(inst, (__u32 __user
> > > > > > > > + *)regs->nip);
> > > > > > > >  pagefault_enable();
> > > > > > > >  } else {
> > > > > > > >  ret = probe_kernel_address(regs->nip,
> > > > > > > > inst);
> > > > > > > > 
> > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > Now I wonder why this fixup is there in the first place? The
> > > > > > > > routine will not really fixup the insn, just return 0x
> > > > > > > > for the failing read and then advance the process NIP.
> > > > > > 
> > > > > > You are right.  The code here only gives 0x to the load
> > > > > > instructions and
> > > > > 
> > > > > continue with the next instruction when the load instruction is
> > > > > causing the machine check.  This will prevent a system lockup when
> > > > > reading from PCI/RapidIO device which is link down.
> > > > > > 
> > > > > > I don't know what is actual problem in your case.  Maybe it is a
> > > > > > write
> > > > > 
> > > > > instruction instead of read?   Or the code is in a infinite loop 
> > > > > waiting for a
> > > 
> > > valid
> > > > > read result?  Are you able to do some further debugging with the NIP
> > > > > correctly printed?
> > > > > > 
> > > > > 
> > > > > According to the MC it is a Read and the NIP also leads to a read in 
> > > > > the
> > > 
> > > program.
> > > > > ATM, I have disabled the fixup but I will enable that again.
> > > > > Question, is it safe add a small printk when this MC happens(after
> > > > > fixing up)? I need to see that it has happened as the error is 
> > > > > somewhat
> > > 
> > > random.
> > > > 
> > > > I think it is safe to add printk as the current machine check handlers 
> > > > are also
> > > 
> > > using printk.
> > > 
> > > I hope so, but if the fixup fires there is no printk at all so I was a 
> > > bit unsure.
> > > Don't like this fixup though, is there not a better way than faking a 
> > > read to user
> > > space(or kernel for that matter) ?
> > 
> > I don't have a better idea.  Without the fixup, the offending load 
> > instruction will never finish if there is anything wrong with the backing 
> > device and freeze the whole system.  Do you have any suggestion in mind?
> > 
> 
> But it never finishes the load, it just fakes a load of 0xf, for user 
> space I rather have it signal
> a SIGBUS but that does not seem to work either, at least not for us but that 
> could be a bug in general MC code
>  maybe.
> This fixup might be valid for kernel only as it has never worked for user 
> space due to the bug I found.
> 
> Where can I read about this errata ?

I have look high and low an cannot find an errata which maps to this fixup.
The closest I get is A-005125 which seems to have another workaround, I cannot 
find
any evidence that this workaround has been applied in Linux, can you?

 Jocke

Re: UIO memmap of PCi devices not working?

2017-09-07 Thread Joakim Tjernlund

On Thu, 2017-09-07 at 17:16 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2017-09-06 at 15:20 +0000, Joakim Tjernlund wrote:
> > Having problems to mmap PCI UIO devices and stumbeled over this page:
> >  http://billfarrow.blogspot.se/2010/09/userspace-access-to-pci-memory.html
> > it claims some adjustments are needed for UIO mmap over PCI to work.
> > These are #if 0 ATM and trying to enable them fails build.
> > 
> > Can this be fixed to at least build again ?
> > The reason for having #if 0 in the first place appears to be old X servers,
> > is that still true? Can the special casing be removed now?
> 
> This article seems out of date... I *think* things should work without
> change by just mmap'ing the appropriate sysfs files. I'm not sure why
> the author thought that had to be ifdef'ed out...

Isn't that what the article is doing(mmaping sysfs files)?
And the article author is #ifdefing it back, not out.

> 
> Let me know if you have problems.

Sure, we still are looking 

> 
> As far as I know, the generic code will call pci_resource_to_user()
> which on powerpc will return a physical address that already includes
> the offset, which is why we don't later add it.
> 
> Now we could probably tear all that out and use the new generic code
> instead as I *think* X has (very) long been fixed but I'd have to spend
> some time triple checking and testing on old HW which I don't have the
> bandwidth for right now. 

Could you fixup the code which is now #if 0 ? I wan't to test the
difference and I not sure how to fix the build problem after changing
those two #if 0 to #if 1
Even better if they could be a CONFIG option instead.

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-06 Thread Joakim Tjernlund

On Wed, 2017-09-06 at 21:13 +, Leo Li wrote:
> > -Original Message-
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Wednesday, September 06, 2017 3:54 PM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > <york@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> > 
> > On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > > > -Original Message-
> > > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York
> > > > Sun <york@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > 
> > > > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > > > -Original Message-
> > > > > > From: York Sun
> > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>; linuxppc-
> > > > > > d...@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > 
> > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > 
> > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > So after some debugging I found this bug:
> > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs
> > 
> > *regs)
> > > > > > >  if (is_in_pci_mem_space(addr)) {
> > > > > > >  if (user_mode(regs)) {
> > > > > > >  pagefault_disable();
> > > > > > > -   ret = get_user(regs->nip, );
> > > > > > > +   ret = get_user(inst, (__u32 __user
> > > > > > > + *)regs->nip);
> > > > > > >  pagefault_enable();
> > > > > > >  } else {
> > > > > > >  ret = probe_kernel_address(regs->nip,
> > > > > > > inst);
> > > > > > > 
> > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > Now I wonder why this fixup is there in the first place? The
> > > > > > > routine will not really fixup the insn, just return 0x
> > > > > > > for the failing read and then advance the process NIP.
> > > > > 
> > > > > You are right.  The code here only gives 0x to the load
> > > > > instructions and
> > > > 
> > > > continue with the next instruction when the load instruction is
> > > > causing the machine check.  This will prevent a system lockup when
> > > > reading from PCI/RapidIO device which is link down.
> > > > > 
> > > > > I don't know what is actual problem in your case.  Maybe it is a
> > > > > write
> > > > 
> > > > instruction instead of read?   Or the code is in a infinite loop 
> > > > waiting for a
> > 
> > valid
> > > > read result?  Are you able to do some further debugging with the NIP
> > > > correctly printed?
> > > > > 
> > > > 
> > > > According to the MC it is a Read and the NIP also leads to a read in the
> > 
> > program.
> > > > ATM, I have disabled the fixup but I will enable that again.
> > > > Question, is it safe add a small printk when this MC happens(after
> > > > fixing up)? I need to see that it has happened as the error is somewhat
> > 
> > random.
> > > 
> > > I think it is safe to add printk as the current machine check handlers 
> > > are also
> > 
> > using printk.
> > 
> > I hope so, but if the fixup fires there is no printk at all so I was a bit 
> > unsure.
> > Don't like this fixup though, is there not a better way than faking a read 
> > to user
> > space(or kernel for that matter) ?
> 
> I don't have a better idea.  Without the fixup, the offending load 
> instruction will never finish if there is anything wrong with the backing 
> device and freeze the whole system.  Do you have any suggestion in mind?
> 

But it never finishes the load, it just fakes a load of 0xf, for user 
space I rather have it signal
a SIGBUS but that does not seem to work either, at least not for us but that 
could be a bug in general MC code
 maybe.
This fixup might be valid for kernel only as it has never worked for user space 
due to the bug I found.

Where can I read about this errata ?

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-06 Thread Joakim Tjernlund

On Wed, 2017-09-06 at 20:28 +, Leo Li wrote:
> > -Original Message-
> > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > Sent: Wednesday, September 06, 2017 3:17 PM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>; York Sun
> > <york@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> > 
> > On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > > > -Original Message-
> > > > From: York Sun
> > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>; linuxppc-
> > > > d...@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > 
> > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > 
> > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > So after some debugging I found this bug:
> > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
> > > > >  if (is_in_pci_mem_space(addr)) {
> > > > >  if (user_mode(regs)) {
> > > > >  pagefault_disable();
> > > > > -   ret = get_user(regs->nip, );
> > > > > +   ret = get_user(inst, (__u32 __user
> > > > > + *)regs->nip);
> > > > >  pagefault_enable();
> > > > >  } else {
> > > > >  ret = probe_kernel_address(regs->nip,
> > > > > inst);
> > > > > 
> > > > > However, the kernel still locked up after fixing that.
> > > > > Now I wonder why this fixup is there in the first place? The
> > > > > routine will not really fixup the insn, just return 0x for
> > > > > the failing read and then advance the process NIP.
> > > 
> > > You are right.  The code here only gives 0x to the load 
> > > instructions and
> > 
> > continue with the next instruction when the load instruction is causing the
> > machine check.  This will prevent a system lockup when reading from
> > PCI/RapidIO device which is link down.
> > > 
> > > I don't know what is actual problem in your case.  Maybe it is a write
> > 
> > instruction instead of read?   Or the code is in a infinite loop waiting 
> > for a valid
> > read result?  Are you able to do some further debugging with the NIP 
> > correctly
> > printed?
> > > 
> > 
> > According to the MC it is a Read and the NIP also leads to a read in the 
> > program.
> > ATM, I have disabled the fixup but I will enable that again.
> > Question, is it safe add a small printk when this MC happens(after fixing 
> > up)? I
> > need to see that it has happened as the error is somewhat random.
> 
> I think it is safe to add printk as the current machine check handlers are 
> also using printk.

I hope so, but if the fixup fires there is no printk at all so I was a bit 
unsure.
Don't like this fixup though, is there not a better way than faking a read
to user space(or kernel for that matter) ?

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-06 Thread Joakim Tjernlund

On Wed, 2017-09-06 at 19:31 +, Leo Li wrote:
> > -Original Message-
> > From: York Sun
> > Sent: Wednesday, September 06, 2017 10:38 AM
> > To: Joakim Tjernlund <joakim.tjernl...@infinera.com>; linuxppc-
> > d...@lists.ozlabs.org; Leo Li <leoyang...@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> > 
> > Scott is no longer with Freescale/NXP. Adding Leo.
> > 
> > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > So after some debugging I found this bug:
> > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
> > >  if (is_in_pci_mem_space(addr)) {
> > >  if (user_mode(regs)) {
> > >  pagefault_disable();
> > > -   ret = get_user(regs->nip, );
> > > +   ret = get_user(inst, (__u32 __user
> > > + *)regs->nip);
> > >  pagefault_enable();
> > >  } else {
> > >  ret = probe_kernel_address(regs->nip, inst);
> > > 
> > > However, the kernel still locked up after fixing that.
> > > Now I wonder why this fixup is there in the first place? The routine
> > > will not really fixup the insn, just return 0x for the failing
> > > read and then advance the process NIP.
> 
> You are right.  The code here only gives 0x to the load instructions 
> and continue with the next instruction when the load instruction is causing 
> the machine check.  This will prevent a system lockup when reading from 
> PCI/RapidIO device which is link down.
> 
> I don't know what is actual problem in your case.  Maybe it is a write 
> instruction instead of read?   Or the code is in a infinite loop waiting for 
> a valid read result?  Are you able to do some further debugging with the NIP 
> correctly printed?
> 

According to the MC it is a Read and the NIP also leads to a read in the 
program.
ATM, I have disabled the fixup but I will enable that again.
Question, is it safe add a small printk when this MC happens(after fixing up)? 
I need to see that
it has happened as the error is somewhat random.

 Jocke

> Regards,
> Leo
> 
> > > 
> > > Removing the fixup does not help either, kernel still locks up:
> > > [   28.170532] Machine check in kernel mode.
> > > [   28.174538] Caused by (from MCSR=10008):
> > > [   28.182804] Bus - Read Data Bus Error: DAR:b7013000
> > > [   28.197079] Oops: Machine check, sig: 7 [#1]
> > > [   28.201343] P1010 RDB
> > > [   28.203608] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO)
> > 
> > linux_kernel_bde(PO)
> > > [   28.211796] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P   O
> > 
> > 4.1.38+ #201
> > > [   28.219540] task: db16ed10 ti: df122000 task.ti: df122000
> > > [   28.224935] NIP: 10a4e2f4 LR: 10a4e404 CTR: 10046c38
> > > [   28.229896] REGS: df123f10 TRAP: 0204   Tainted: P   O 
> > > (4.1.38+)
> > > [   28.236942] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 
> > > [   28.243306] DEAR: b7013000 ESR: 
> > > GPR00: 10a4e404 bfab2730 b7b354a0 132f9fa8 07006000 0700
> > 
> > 
> > > 132f9fd8
> > > GPR08: b6fd5000 b6fe5000 0003e000 bfab2720 24004424 11d6cf7c 
> > > 0000
> > > GPR16: 10f6e29c 10f6c872 10f6db01 b541 b541 11d92fcc 0011
> > > 0001
> > > GPR24: 01a5bd3e 132ffbf0 11d6  07006000  132f9fa8
> > 
> > 
> > > [   28.275547] NIP [10a4e2f4] 0x10a4e2f4
> > > [   28.279204] LR [10a4e404] 0x10a4e404
> > > [   28.282772] Call Trace:
> > > [   28.285213] ---[ end trace 9f8b64ab1e83f449 ]---
> > > [   28.289825]
> > > 
> > > 
> > >   Jocke
> > > 
> > > On Fri, 2017-09-01 at 13:32 +0200, Joakim Tjernlund wrote:
> > > > I am trying to debug a Machine Check for a P2010 (e500v2) CPU:
> > > > 
> > > > [   28.111816] Caused by (from MCSR=10008): Bus - Read Data Bus Error
> > > > [   28.117998] Oops: Machine check, sig: 7 [#1]
> > > > [   28.122263] P1010 RDB
> > > > [   28.124529] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO)
> > 
> > linux_kernel_bde(PO)
> > > > [   28.132718] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P   O
> > 
> > 4.1.38+ #49
> > > > [   28.140376] task: db16cd10 ti: df128000 task.ti: df128000
> > > > [   28.145770] NIP:

UIO memmap of PCi devices not working?

2017-09-06 Thread Joakim Tjernlund

Having problems to mmap PCI UIO devices and stumbeled over this page:
 http://billfarrow.blogspot.se/2010/09/userspace-access-to-pci-memory.html
it claims some adjustments are needed for UIO mmap over PCI to work.
These are #if 0 ATM and trying to enable them fails build.

Can this be fixed to at least build again ?
The reason for having #if 0 in the first place appears to be old X servers,
is that still true? Can the special casing be removed now?

 Jocke

Re: Machine Check in P2010(e500v2)

2017-09-06 Thread Joakim Tjernlund

On Wed, 2017-09-06 at 10:05 +, Laurentiu Tudor wrote:
> Hi Jocke,
> 
> On 09/01/2017 02:32 PM, Joakim Tjernlund wrote:
> > I am trying to debug a Machine Check for a P2010 (e500v2) CPU:
> > 
> > [   28.111816] Caused by (from MCSR=10008): Bus - Read Data Bus Error
> > [   28.117998] Oops: Machine check, sig: 7 [#1]
> > [   28.122263] P1010 RDB
> > [   28.124529] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) 
> > linux_kernel_bde(PO)
> > [   28.132718] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P   O
> > 4.1.38+ #49
> > [   28.140376] task: db16cd10 ti: df128000 task.ti: df128000
> > [   28.145770] NIP:  LR: 10a4e404 CTR: 10046c38
> > [   28.150730] REGS: df129f10 TRAP: 0204   Tainted: P   O 
> > (4.1.38+)
> > [   28.157776] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 
> > [   28.164140] DEAR: b7187000 ESR: 
> > GPR00: 10a4e404 bf86ea30 b7ca94a0 132f9fa8 07006000 0700  
> > 132f9fd8
> > GPR08: b7149000 b7159000 0003e000 bf86ea20 24004424 11d6cf7c  
> > 
> > GPR16: 10f6e29c 10f6c872 10f6db01 b541 b541 11d92fcc 0011 
> > 0001
> > GPR24: 01a4d12d 132ffbf0 11d6  07006000  132f9fa8 
> > 
> > [   28.196375] NIP []   (null)
> > [   28.199859] LR [10a4e404] 0x10a4e404
> > [   28.203426] Call Trace:
> > [   28.205866] ---[ end trace f456255ddf9bee83 ]---
> > 
> > I cannot figure out why NIP is NULL ? It LOOKs like NIP is set to
> > MCSRR0 early on but maybe it is lost somehow?
> > 
> > Anyhow, looking at entry_32.S:
> > .globl  mcheck_transfer_to_handler
> > mcheck_transfer_to_handler:
> > mfspr   r0,SPRN_DSRR0
> > stw r0,_DSRR0(r11)
> > mfspr   r0,SPRN_DSRR1
> > stw r0,_DSRR1(r11)
> > /* fall through */
> > 
> > .globl  debug_transfer_to_handler
> > debug_transfer_to_handler:
> > mfspr   r0,SPRN_CSRR0
> > stw r0,_CSRR0(r11)
> > mfspr   r0,SPRN_CSRR1
> > stw r0,_CSRR1(r11)
> > /* fall through */
> > 
> > .globl  crit_transfer_to_handler
> > crit_transfer_to_handler:
> > 
> > It looks odd that DSRRx is assigned in mcheck and CSRRx in debug and
> > crit has none. Should not this assigment be shifted down one level?
> > 
> 
> This does indeed looks weird. Have you tried moving the SPRN_CSRR* 
> saving in the crit section? Any results?

After looking at this somwhat I think this is intentional and OK.
I sorted NIP == NULL too:
@@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
if (is_in_pci_mem_space(addr)) {
if (user_mode(regs)) {
pagefault_disable();
-   ret = get_user(regs->nip, );
+   ret = get_user(inst, (__u32 __user *)regs->nip);
pagefault_enable();
} else {
ret = probe_kernel_address(regs->nip, inst);

But after this, the CPU is still locked after an Machine Check. Is this
to be expected? I figured the user space process would get a SIGBUS and kernel
would resume normal operations.

Scott, maybe you have some idea?

 Jocke

[PATCH] fsl_pci: Correct fsl_pci_mcheck_exception

2017-09-05 Thread Joakim Tjernlund

get_user() had it args reversed causing NIP to be NULL:ed instead
of fixing up the PCI access.

Note: This still hangs my P1020 Freescale CPU hard, but at least
I get a NIP now.

Signed-off-by: Joakim Tjernlund <joakim.tjernl...@infinera.com>
---
 arch/powerpc/sysdev/fsl_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 7c8b779c329a..9e64c12dff6a 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -996,7 +996,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
if (is_in_pci_mem_space(addr)) {
if (user_mode(regs)) {
pagefault_disable();
-   ret = get_user(regs->nip, );
+   ret = get_user(inst, (__u32 __user *)regs->nip);
pagefault_enable();
} else {
ret = probe_kernel_address(regs->nip, inst);
-- 
2.13.5

1 2 3 4 5 6 7 >

1 - 100 of 689 matches

Mail list logo