Re: [PATCH 1/1] net: nps_enet: Disable interrupts before napi reschedule

2016-05-26 Thread Vineet Gupta
Hi Elad, Noam,

On Thursday 26 May 2016 11:23 PM, Alexey Brodkin wrote:

> 
> We just bumped into the same problem (data exchange hangs on the very first "ping")
> with released Linux v4.6 and linux-next on our nSIM OSCI virtual platform.
> 
> I believe it was commit 05c00d82f4d1 ("net: nps_enet: bug fix - handle lost
> tx interrupts") that introduced the problem. At least after reverting it,
> networking worked again.
> 
> And indeed this patch fixes the mentioned issue.
> In other words...
> 
> Tested-by: Alexey Brodkin 

FWIW, we now actively use the same driver (and the same SystemC model) in one of
our simulation platforms used for regression testing. So please try to keep the
ARC mailing list on CC for any nps_enet driver patches so that we are in the
loop and know what is going on!

Thx,
-Vineet


___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH 1/1] net: nps_enet: Disable interrupts before napi reschedule

2016-05-26 Thread Alexey Brodkin
Hi Elad,

On Thu, 2016-05-26 at 15:00 +0300, Elad Kanfi wrote:
> From: Elad Kanfi 
> 
> Since NAPI works by shutting down event interrupts when there's
> work and turning them on when there's none, the net driver must
> make sure that interrupts are disabled when it reschedules polling.
> By calling napi_reschedule, the driver switches to polling mode,
> therefore there should be no interrupt interference.
> Any received packets will be handled in nps_enet_poll by polling the HW
> indication of received packets until all packets are handled.
> 
> Signed-off-by: Elad Kanfi 
> Acked-by: Noam Camus 
> ---
>  drivers/net/ethernet/ezchip/nps_enet.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ezchip/nps_enet.c b/drivers/net/ethernet/ezchip/nps_enet.c
> index 085f912..06f0317 100644
> --- a/drivers/net/ethernet/ezchip/nps_enet.c
> +++ b/drivers/net/ethernet/ezchip/nps_enet.c
> @@ -205,8 +205,10 @@ static int nps_enet_poll(struct napi_struct *napi, int budget)
>    * re-adding ourselves to the poll list.
>    */
>  
> - if (priv->tx_skb && !tx_ctrl_ct)
> + if (priv->tx_skb && !tx_ctrl_ct) {
> + nps_enet_reg_set(priv, NPS_ENET_REG_BUF_INT_ENABLE, 0);
>   napi_reschedule(napi);
> + }
>   }
>  
>   return work_done;

We just bumped into the same problem (data exchange hangs on the very first "ping")
with released Linux v4.6 and linux-next on our nSIM OSCI virtual platform.

I believe it was commit 05c00d82f4d1 ("net: nps_enet: bug fix - handle lost
tx interrupts") that introduced the problem. At least after reverting it,
networking worked again.

And indeed this patch fixes the mentioned issue.
In other words...

Tested-by: Alexey Brodkin 

P.S. If my observation is correct, please add the following to your commit
message if you ever do a respin:
-->8---
Fixes: 05c00d82f4d1 ("net: nps_enet: bug fix - handle lost tx interrupts")

Cc:  # 4.6.x
-->8---

Re: [PATCH 0066/1529] Fix typo

2016-05-26 Thread Vineet Gupta
On Saturday 21 May 2016 05:15 PM, Andrea Gelmini wrote:
> Signed-off-by: Andrea Gelmini 

Thanks for these fixes, Andrea. Do you mind if I squash the pile for ARC
(66 through 80) into a single patch?

-Vineet



Re: Unaligned flush_dcache_range in axs101.c

2016-05-26 Thread Vineet Gupta
On Thursday 26 May 2016 05:09 PM, Alexey Brodkin wrote:

> In the code you were referring to, what I wanted was to modify the reset vector
> of the slave core. And while we were living without IOC it was all OK. My code
> wrote back (or, as we used to call it within ARC, "flushed") the L1 data cache
> with the modified reset vector contents to L2 (which is a combined data and
> instruction cache in the case of ARC), and once the slave core was unhalted it
> read the reset vector contents via the instruction fetch hardware and executed
> exactly what I wanted.

I don't have the full context here, but I think for this specific case (writing
the reset vector of another core) you don't need to invent a new interface. You
can just do an uncached store (ST.di), which will ensure that the data goes
straight to memory. You need to make sure that this line is not in any other
core's L1 and/or L2 cache - which can probably be assumed if it's just coming
out of reset.
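[Editor's note] Vineet's suggestion might look roughly like this in ARC U-Boot code. This is a sketch only: `write_uncached_32` is an invented helper name, and `st.di` is ARC's cache-bypassing store variant:

```c
/* Sketch, not actual U-Boot code. "u32" as in the kernel/U-Boot;
 * write_uncached_32() is an invented name for illustration. */
typedef unsigned int u32;

static inline void write_uncached_32(volatile u32 *addr, u32 val)
{
	/* ST.di bypasses the data cache, so the value lands directly in
	 * memory (L2/DDR) with no explicit flush needed. */
	__asm__ __volatile__("st.di %0, [%1]" : : "r"(val), "r"(addr) : "memory");
}

/* e.g.: write_uncached_32(slave_reset_vector, entry_point); */
```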

-Vineet



Re: Unaligned flush_dcache_range in axs101.c

2016-05-26 Thread Alexey Brodkin
Hi Marek,

On Fri, 2016-04-15 at 15:49 +0200, Marek Vasut wrote:
> On 04/15/2016 03:00 PM, Alexey Brodkin wrote:
> > Cache management functions should be implemented per arch or platform so
> > that they match the requirements of the underlying hardware. If the hardware
> > may only work on whole cache lines, then the cache routines must do exactly that.
> Is ARC designed this way ?
> 
> > 
> > As an extra option there could be a check of the input arguments in those
> > functions that warns a user if he or she passes unaligned start and/or end
> > addresses.
> Right, ARMv7 does it already, ARM926 too, others need fixing.
> 
> > 
> > Now speaking about data layout, your example is very correct and we saw a
> > lot of such real use-cases - driver developers should design the data layout
> > such that cache management doesn't lead to data corruption.
> Right.
> 
> > 
> > But again, the code in question has nothing to do with cache lines.
> > I only need to write back 1 word and don't care what really happens with
> > all the remaining words in the same cache line. And my implementation of
> > flush_dcache_range() will do everything perfectly correctly - it will
> > flush exactly 1 line, the one that contains the word I modified.
> And this will lead to silent corruption which is hard to track down and
> debug. I would rather have a check-and-refuse behavior than
> silently-fix-and-permit.

Even though that discussion is still not finished, a very related issue actually
hit me recently. So it looks like there's more to discuss here, probably with a
wider audience.

So what happens: our newer ARC HS38 cores may have a so-called IO Coherency (IOC)
block. This hardware block basically makes sure that data exchanged between CPU
cores (we may have a single-core or a multi-core CPU) and DMA masters is
coherent. That means if the IOC exists and is enabled in the core, we:
 1) Don't need to do manual cache management when sending data to peripherals
    and getting data back. The IOC does all the work for us.
 2) Must not do manual cache management when sending data to peripherals and
    getting data back. Simply because our manual actions will:
    a) Interfere with what the IOC already does
    b) Introduce extra performance overhead, which is exactly what we wanted
       to get rid of by introducing the IOC.

So with peripherals it is all cool now.
I was simply disabling all cache management routines if the IOC is detected on boot.

But returning back to your initial complaint.

In the code you were referring to, what I wanted was to modify the reset vector
of the slave core. And while we were living without IOC it was all OK. My code
wrote back (or, as we used to call it within ARC, "flushed") the L1 data cache
with the modified reset vector contents to L2 (which is a combined data and
instruction cache in the case of ARC), and once the slave core was unhalted it
read the reset vector contents via the instruction fetch hardware and executed
exactly what I wanted.

Now with the IOC enabled and, as I wrote above, with cache operations disabled,
the new reset vector value gets stuck in the L1 data cache of the master core.
And slave cores never get the proper value in the reset vector, effectively
going into the wild on unhalt.

The same actually applies to U-Boot relocation and happens even on single-core
setups. We copy U-Boot's image to a new location in memory, and until we
explicitly flush the L1 data cache to the underlying L2 or DDR (if L2 doesn't
exist), something may not make its way back to the CPU when it starts fetching
instructions from that new location.

So as a summary I would say that we may want to introduce different cache
handling routines:
 [1] DMA related
 [2] Non-DMA related

Any thoughts? I'm especially interested in whether something similar happens on
other arches and how people deal with it.

-Alexey