Re: [PATCH 2/2] uacce: add uacce module

2019-08-27 Thread Kenneth Lee
On Mon, Aug 26, 2019 at 06:29:10AM +0200, Greg Kroah-Hartman wrote:
> Date: Mon, 26 Aug 2019 06:29:10 +0200
> From: Greg Kroah-Hartman 
> To: Kenneth Lee 
> CC: zhangfei , Arnd Bergmann ,
>  linux-accelerat...@lists.ozlabs.org, linux-kernel@vger.kernel.org, Zaibo
>  Xu , Zhou Wang 
> Subject: Re: [PATCH 2/2] uacce: add uacce module
> User-Agent: Mutt/1.12.1 (2019-06-15)
> Message-ID: <20190826042910.ga26...@kroah.com>
> 
> On Mon, Aug 26, 2019 at 12:10:42PM +0800, Kenneth Lee wrote:
> > On Wed, Aug 21, 2019 at 09:05:42AM -0700, Greg Kroah-Hartman wrote:
> > > Date: Wed, 21 Aug 2019 09:05:42 -0700
> > > From: Greg Kroah-Hartman 
> > > To: zhangfei 
> > > CC: Arnd Bergmann , linux-accelerat...@lists.ozlabs.org,
> > >  linux-kernel@vger.kernel.org, Kenneth Lee , Zaibo
> > >  Xu , Zhou Wang 
> > > Subject: Re: [PATCH 2/2] uacce: add uacce module
> > > User-Agent: Mutt/1.12.1 (2019-06-15)
> > > Message-ID: <20190821160542.ga14...@kroah.com>
> > > 
> > > On Wed, Aug 21, 2019 at 10:30:22PM +0800, zhangfei wrote:
> > > > 
> > > > 
> > > > On 2019/8/21 下午5:17, Greg Kroah-Hartman wrote:
> > > > > On Wed, Aug 21, 2019 at 03:21:18PM +0800, zhangfei@foxmail.com 
> > > > > wrote:
> > > > > > Hi, Greg
> > > > > > 
> > > > > > On 2019/8/21 上午12:59, Greg Kroah-Hartman wrote:
> > > > > > > On Tue, Aug 20, 2019 at 09:08:55PM +0800, zhangfei wrote:
> > > > > > > > On 2019/8/15 下午10:13, Greg Kroah-Hartman wrote:
> > > > > > > > > On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote:
> > > > > > > > > > +int uacce_register(struct uacce *uacce)
> > > > > > > > > > +{
> > > > > > > > > > +   int ret;
> > > > > > > > > > +
> > > > > > > > > > +   if (!uacce->pdev) {
> > > > > > > > > > +   pr_debug("uacce parent device not set\n");
> > > > > > > > > > +   return -ENODEV;
> > > > > > > > > > +   }
> > > > > > > > > > +
> > > > > > > > > > +   if (uacce->flags & UACCE_DEV_NOIOMMU) {
> > > > > > > > > > +   add_taint(TAINT_CRAP, LOCKDEP_STILL_OK);
> > > > > > > > > > +   dev_warn(uacce->pdev,
> > > > > > > > > > +"Register to noiommu mode, which 
> > > > > > > > > > export kernel data to user space and may vulnerable to 
> > > > > > > > > > attack");
> > > > > > > > > > +   }
> > > > > > > > > THat is odd, why even offer this feature then if it is a 
> > > > > > > > > major issue?
> > > > > > > > UACCE_DEV_NOIOMMU maybe confusing here.
> > > > > > > > 
> > > > > > > > In this mode, app use ioctl to get dma_handle from 
> > > > > > > > dma_alloc_coherent.
> > > > > > > That's odd, why not use the other default apis to do that?
> > > > > > > 
> > > > > > > > It does not matter iommu is enabled or not.
> > > > > > > > In case iommu is disabled, it maybe dangerous to kernel, so we 
> > > > > > > > added warning here, is it required?
> > > > > > > You should use the other documentated apis for this, don't create 
> > > > > > > your
> > > > > > > own.
> > > > > > I am sorry, not understand here.
> > > > > > Do you mean there is a standard ioctl or standard api in user 
> > > > > > space, it can
> > > > > > get dma_handle from dma_alloc_coherent from kernel?
> > > > > There should be a standard way to get such a handle from userspace
> > > > > today.  Isn't that what the ion interface does?  DRM also does this, 
> > > > > as
> > > > > does UIO I think.
> > > > Thanks Greg,
> > > > Still not find it, will do more search.
> > > > But this may introduce dependency in our lib, like depend on ion?
> > > > > Do you have a spec somewhere that shows exactly what you are trying to
> > > > > do here, along with example userspace 

Re: [PATCH 0/2] A General Accelerator Framework, WarpDrive

2019-08-25 Thread Kenneth Lee
On Thu, Aug 15, 2019 at 01:04:24PM -0400, Jerome Glisse wrote:
> Date: Thu, 15 Aug 2019 13:04:24 -0400
> From: Jerome Glisse 
> To: Zhangfei Gao 
> CC: linux-accelerat...@lists.ozlabs.org, Greg Kroah-Hartman
>  , linux-kernel@vger.kernel.org, Arnd Bergmann
>  
> Subject: Re: [PATCH 0/2] A General Accelerator Framework, WarpDrive
> User-Agent: Mutt/1.11.3 (2019-02-01)
> Message-ID: <20190815170424.ga30...@redhat.com>
> 
> On Wed, Aug 14, 2019 at 05:34:23PM +0800, Zhangfei Gao wrote:
> > *WarpDrive* is a general accelerator framework for the user application to
> > access the hardware without going through the kernel in data path.
> > 
> > WarpDrive is the name for the whole framework. The component in kernel
> > is called uacce, meaning "Unified/User-space-access-intended Accelerator
> > Framework". It makes use of the capability of IOMMU to maintain a
> > unified virtual address space between the hardware and the process.
> > 
> > WarpDrive is intended to be used with Jean Philippe Brucker's SVA
> > patchset[1], which enables IO side page fault and PASID support. 
> > We have keep verifying with Jean's sva/current [2]
> > We also keep verifying with Eric's SMMUv3 Nested Stage patch [3]
> > 
> > This series and related zip & qm driver as well as dummy driver for qemu 
> > test:
> > https://github.com/Linaro/linux-kernel-warpdrive/tree/5.3-rc1-warpdrive-v1
> > zip driver already been upstreamed.
> > zip supporting uacce will be the next step.
> > 
> > The library and user application:
> > https://github.com/Linaro/warpdrive/tree/wdprd-v1-current
> 
> Do we want a new framework ? I think that is the first question that
> should be answer here. Accelerator are in many forms and so far they
> never have been enough commonality to create a framework, even GPUs
> with the drm is an example of that, drm only offer share framework
> for the modesetting part of the GPU (as thankfully monitor connector
> are not specific to GPU brands :))
> 
> FPGA is another example the only common code expose to userspace is
> about bitstream management AFAIK.
> 
> I would argue that a framework should only be created once there is
> enough devices with same userspace API. Meanwhile you can provide
> in kernel helper that allow driver to expose same API. If after a
> while we have enough device driver which all use that same in kernel
> helpers API then it will a good time to introduce a new framework.
> Meanwhile this will allow individual device driver to tinker with
> their API and maybe get to something useful to more devices in the
> end.
> 
> Note that what i propose also allow userspace code sharing for all
> driver that use the same in kernel helper.
> 
> Cheers,
> Jérôme

Hi, Jerome, I explain the idea here: https://zhuanlan.zhihu.com/p/79680889. We
think this is a comment requirement for eveybody. Hope this can help the
discussion. Thanks

-- 
-Kenneth(Hisilicon)



Re: [PATCH 2/2] uacce: add uacce module

2019-08-25 Thread Kenneth Lee
On Wed, Aug 21, 2019 at 09:05:42AM -0700, Greg Kroah-Hartman wrote:
> Date: Wed, 21 Aug 2019 09:05:42 -0700
> From: Greg Kroah-Hartman 
> To: zhangfei 
> CC: Arnd Bergmann , linux-accelerat...@lists.ozlabs.org,
>  linux-kernel@vger.kernel.org, Kenneth Lee , Zaibo
>  Xu , Zhou Wang 
> Subject: Re: [PATCH 2/2] uacce: add uacce module
> User-Agent: Mutt/1.12.1 (2019-06-15)
> Message-ID: <20190821160542.ga14...@kroah.com>
> 
> On Wed, Aug 21, 2019 at 10:30:22PM +0800, zhangfei wrote:
> > 
> > 
> > On 2019/8/21 下午5:17, Greg Kroah-Hartman wrote:
> > > On Wed, Aug 21, 2019 at 03:21:18PM +0800, zhangfei@foxmail.com wrote:
> > > > Hi, Greg
> > > > 
> > > > On 2019/8/21 上午12:59, Greg Kroah-Hartman wrote:
> > > > > On Tue, Aug 20, 2019 at 09:08:55PM +0800, zhangfei wrote:
> > > > > > On 2019/8/15 下午10:13, Greg Kroah-Hartman wrote:
> > > > > > > On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote:
> > > > > > > > +int uacce_register(struct uacce *uacce)
> > > > > > > > +{
> > > > > > > > +   int ret;
> > > > > > > > +
> > > > > > > > +   if (!uacce->pdev) {
> > > > > > > > +   pr_debug("uacce parent device not set\n");
> > > > > > > > +   return -ENODEV;
> > > > > > > > +   }
> > > > > > > > +
> > > > > > > > +   if (uacce->flags & UACCE_DEV_NOIOMMU) {
> > > > > > > > +   add_taint(TAINT_CRAP, LOCKDEP_STILL_OK);
> > > > > > > > +   dev_warn(uacce->pdev,
> > > > > > > > +"Register to noiommu mode, which 
> > > > > > > > export kernel data to user space and may vulnerable to attack");
> > > > > > > > +   }
> > > > > > > THat is odd, why even offer this feature then if it is a major 
> > > > > > > issue?
> > > > > > UACCE_DEV_NOIOMMU maybe confusing here.
> > > > > > 
> > > > > > In this mode, app use ioctl to get dma_handle from 
> > > > > > dma_alloc_coherent.
> > > > > That's odd, why not use the other default apis to do that?
> > > > > 
> > > > > > It does not matter iommu is enabled or not.
> > > > > > In case iommu is disabled, it maybe dangerous to kernel, so we 
> > > > > > added warning here, is it required?
> > > > > You should use the other documentated apis for this, don't create your
> > > > > own.
> > > > I am sorry, not understand here.
> > > > Do you mean there is a standard ioctl or standard api in user space, it 
> > > > can
> > > > get dma_handle from dma_alloc_coherent from kernel?
> > > There should be a standard way to get such a handle from userspace
> > > today.  Isn't that what the ion interface does?  DRM also does this, as
> > > does UIO I think.
> > Thanks Greg,
> > Still not find it, will do more search.
> > But this may introduce dependency in our lib, like depend on ion?
> > > Do you have a spec somewhere that shows exactly what you are trying to
> > > do here, along with example userspace code?  It's hard to determine it
> > > given you only have one "half" of the code here and no users of the apis
> > > you are creating.
> > > 
> > The purpose is doing dma in user space.
> 
> Oh no, please no.  Are you _SURE_ you want to do this?
> 
> Again, look at how ION does this and how the DMAbuff stuff is replacing
> it.  Use that api please instead, otherwise you will get it wrong and we
> don't want to duplicate efforts.
> 
> thanks,
> 
> greg k-h

Dear Greg. I wrote a blog to explain the intention of WarpDrive here:
https://zhuanlan.zhihu.com/p/79680889.

Sharing data is not our intention, Sharing address is. NOIOMMU mode is just a
temporary solution to let some hardware which does not care the security issue
to try WarpDrive for the first step. Some user do not care this much in embedded
scenario. We saw VFIO use the same model so we also want to make a try. If you
insist this is risky, we can remove it.

Thanks.

-- 
-Kenneth(Hisilicon)



Re: [PATCH/RFC 0/5] HW accel subsystem

2019-02-01 Thread Kenneth Lee



在 2019/2/1 下午6:07, Greg Kroah-Hartman 写道:

On Fri, Feb 01, 2019 at 05:10:40PM +0800, Kenneth Lee wrote:

After the RFCv2 was sent to the lkml, we do not get much feedback. But the
Infini-band guys said they did not like it. They think the solution is
re-invention of ib-verbs.

No one needs to re-invent a monstrosity that is ib-verbs.  If anything,
that is a model that should never be recreated again, showing that we
can learn from past mistakes :)


But we do not think so. ib-verbs maintains semantics of "REMOTE memory". But
UACCE maintains semantics of "LOCAL memory". We don't need to send, or sync
memory with other parties. We share those memory with all processes who share
the local bus.

I agree, don't try to duplicate the mess that people moved away from
(hint, everyone sane wraps ib-verbs in another model that can actually
be used and understood...)


But we know we need more "complete" solution to let people understand and accept
our idea. So now we are working on it with our Compression and RSA accelerator
on Hi1620 Server SoC. We are also planning to port our AI framework on it.

Do you think we can cooperate to create an framework in Linux together? Please
feel free to ask for more information. We are happy to answer it.

Sure, that sounds like a great goal!


Thank you very much for your encouragement:)

Kenneth Lee



thanks,

greg k-h



Re: [PATCH/RFC 0/5] HW accel subsystem

2019-02-01 Thread Kenneth Lee
On Fri, Jan 25, 2019 at 10:16:11AM -0800, Olof Johansson wrote:
> Date: Fri, 25 Jan 2019 10:16:11 -0800
> From: Olof Johansson 
> To: linux-kernel@vger.kernel.org
> CC: ogab...@habana.ai, Greg Kroah-Hartman ,
>  jgli...@redhat.com, Andrew Donnellan ,
>  Frederic Barrat , airl...@redhat.com,
>  linux-accelerat...@lists.ozlabs.org
> Subject: [PATCH/RFC 0/5] HW accel subsystem
> X-Mailer: git-send-email 2.11.0
> Message-ID: <20190125181616.62609-1-o...@lixom.net>
> 
> Per discussion in on the Habana Labs driver submission
> (https://lore.kernel.org/lkml/2019012357.31477-1-oded.gab...@gmail.com/),
> there seems to be time to create a separate subsystem for hw accellerators
> instead of letting them proliferate around the tree (and/or in misc).
> 
> There's difference in opinion on how stringent the requirements are for
> a fully open stack for these kind of drivers. I've documented the middle
> road approach in the first patch (requiring some sort of open low-level
> userspace for the kernel interaction, and a way to use/test it).
> 
> Comments and suggestions for better approaches are definitely welcome.

Dear Olof,

How are you? Let me introduce myself. My name is Kenenth Lee, working for
Hisilicon. Our company provide server, AI, networking and terminal SoCs to the
market. We tried to create an accelerator framework a year back and now we are
working on the branch here (There is document in Documentation/warpdrive
directory):

https://github.com/Kenneth-Lee/linux-kernel-warpdrive/tree/wdprd-v1

The user space framework is here:

https://github.com/Kenneth-Lee/warpdrive/tree/wdprd-v1

We have tried to create it on VFIO at the very beginning. The RFCv1 is here:

https://lwn.net/Articles/763990/

But it seems it is not fit. There are two major issues:

1. The VFIO framework enforces the concept of separating the resource into
   devices before using it. This is not an accelerator style. Accelerator is
   another CPU to let the others to share it.
2. The way VFIO used to pin memory in place, has some flaw. In the current
   kernel, if you fork a sub-rpcess after pin the dma memory, you may lost the
   physical pages. (You can get more detail in the threads)

So we tried RFCv2 and build the solution directly on IOMMU. We call our solution
as WarpDrive and the kernel module is called uacce. Our assumption is that:

1. Most of users of the accelerator are in user space.
2. An accelerator is always another heterogeneous processor. It is waiting and
   processing work load sent from CPU.
3. The data structure in the CPU may be complex. It is no good to wrap the data
   and send it to hardware again and again. The better way is to keep the data
   in place and give a pointer to the accelerator, leaving it to finish the job.

So we create a pipe (we called it queue) between the user process and the
hardware directly. It is presented as a file to the user space. The user process
mmap the queue file to address the mmio space of the hardware, share memory and 
so on. With the capability of IOMMU, we can share the whole or part of process
space with the hardware. This can make the software solution easier.

After the RFCv2 was sent to the lkml, we do not get much feedback. But the
Infini-band guys said they did not like it. They think the solution is
re-invention of ib-verbs.

But we do not think so. ib-verbs maintains semantics of "REMOTE memory". But
UACCE maintains semantics of "LOCAL memory". We don't need to send, or sync
memory with other parties. We share those memory with all processes who share
the local bus.

But we know we need more "complete" solution to let people understand and accept
our idea. So now we are working on it with our Compression and RSA accelerator
on Hi1620 Server SoC. We are also planning to port our AI framework on it.

Do you think we can cooperate to create an framework in Linux together? Please
feel free to ask for more information. We are happy to answer it.


Cheers
-- 
-Kenneth(Hisilicon)



Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-20 Thread Kenneth Lee
On Mon, Nov 19, 2018 at 08:29:39PM -0700, Jason Gunthorpe wrote:
> Date: Mon, 19 Nov 2018 20:29:39 -0700
> From: Jason Gunthorpe 
> To: Kenneth Lee 
> CC: Leon Romanovsky , Kenneth Lee ,
>  Tim Sell , linux-...@vger.kernel.org, Alexander
>  Shishkin , Zaibo Xu
>  , zhangfei@foxmail.com, linux...@huawei.com,
>  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
>  , Gavin Schenk , RDMA mailing
>  list , Zhou Wang ,
>  Doug Ledford , Uwe Kleine-König
>  , David Kershner
>  , Johan Hovold , Cyrille
>  Pitchen , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Vinod Koul
>  , linux-cry...@vger.kernel.org, Philippe Ombredanne
>  , Sanyog Kale , "David S.
>  Miller" , linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.9.4 (2018-02-28)
> Message-ID: <20181120032939.gr4...@ziepe.ca>
> 
> On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > > Date: Mon, 19 Nov 2018 11:49:54 -0700
> > > From: Jason Gunthorpe 
> > > To: Kenneth Lee 
> > > CC: Leon Romanovsky , Kenneth Lee ,
> > >  Tim Sell , linux-...@vger.kernel.org, Alexander
> > >  Shishkin , Zaibo Xu
> > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > >  , Gavin Schenk , RDMA 
> > > mailing
> > >  list , Zhou Wang ,
> > >  Doug Ledford , Uwe Kleine-König
> > >  , David Kershner
> > >  , Johan Hovold , Cyrille
> > >  Pitchen , Sagar Dharia
> > >  , Jens Axboe ,
> > >  guodong...@linaro.org, linux-netdev , Randy 
> > > Dunlap
> > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > >  , Sanyog Kale , "David S.
> > >  Miller" , linux-accelerat...@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.9.4 (2018-02-28)
> > > Message-ID: <20181119184954.gb4...@ziepe.ca>
> > > 
> > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > >  
> > > > If the hardware cannot share page table with the CPU, we then need to 
> > > > have
> > > > some way to change the device page table. This is what happen in ODP. It
> > > > invalidates the page table in device upon mmu_notifier call back. But 
> > > > this cannot
> > > > solve the COW problem: if the user process A share a page P with 
> > > > device, and A 
> > > > forks a new process B, and it continue to write to the page. By COW, the
> > > > process B will keep the page P, while A will get a new page P'. But you 
> > > > have
> > > > no way to let the device know it should use P' rather than P.
> > > 
> > > Is this true? I thought mmu_notifiers covered all these cases.
> > > 
> > > The mm_notifier for A should fire if B causes the physical address of
> > > A's pages to change via COW. 
> > > 
> > > And this causes the device page tables to re-synchronize.
> > 
> > I don't see such code. The current do_cow_fault() implemenation has nothing 
> > to
> > do with mm_notifer.
> 
> Well, that sure sounds like it would be a bug in mmu_notifiers..

Yes, it can be taken that way:) But it is going to be a tough bug.

> 
> But considering Jean's SVA stuff seems based on mmu notifiers, I have
> a hard time believing that it has any different behavior from RDMA's
> ODP, and if it does have different behavior, then it is probably just
> a bug in the ODP implementation.

As Jean has explained, his solution is based on page table sharing. I think ODP
should also consider this new feature.

> 
> > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it 
> > > > support
> > > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you 
> > > > don't need
> > > > to write any code for that. Because it has been done by IOMMU 
> > > > framework. If it
> > > 
> > > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > > table in the device, not in the CPU.
> > > 
> > > The only case I know if that is different is the new-fangled CAPI
> > > stuff 

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-20 Thread Kenneth Lee
On Mon, Nov 19, 2018 at 08:29:39PM -0700, Jason Gunthorpe wrote:
> Date: Mon, 19 Nov 2018 20:29:39 -0700
> From: Jason Gunthorpe 
> To: Kenneth Lee 
> CC: Leon Romanovsky , Kenneth Lee ,
>  Tim Sell , linux-...@vger.kernel.org, Alexander
>  Shishkin , Zaibo Xu
>  , zhangfei@foxmail.com, linux...@huawei.com,
>  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
>  , Gavin Schenk , RDMA mailing
>  list , Zhou Wang ,
>  Doug Ledford , Uwe Kleine-König
>  , David Kershner
>  , Johan Hovold , Cyrille
>  Pitchen , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Vinod Koul
>  , linux-cry...@vger.kernel.org, Philippe Ombredanne
>  , Sanyog Kale , "David S.
>  Miller" , linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.9.4 (2018-02-28)
> Message-ID: <20181120032939.gr4...@ziepe.ca>
> 
> On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > > Date: Mon, 19 Nov 2018 11:49:54 -0700
> > > From: Jason Gunthorpe 
> > > To: Kenneth Lee 
> > > CC: Leon Romanovsky , Kenneth Lee ,
> > >  Tim Sell , linux-...@vger.kernel.org, Alexander
> > >  Shishkin , Zaibo Xu
> > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > >  , Gavin Schenk , RDMA 
> > > mailing
> > >  list , Zhou Wang ,
> > >  Doug Ledford , Uwe Kleine-König
> > >  , David Kershner
> > >  , Johan Hovold , Cyrille
> > >  Pitchen , Sagar Dharia
> > >  , Jens Axboe ,
> > >  guodong...@linaro.org, linux-netdev , Randy 
> > > Dunlap
> > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > >  , Sanyog Kale , "David S.
> > >  Miller" , linux-accelerat...@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.9.4 (2018-02-28)
> > > Message-ID: <20181119184954.gb4...@ziepe.ca>
> > > 
> > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > >  
> > > > If the hardware cannot share page table with the CPU, we then need to 
> > > > have
> > > > some way to change the device page table. This is what happen in ODP. It
> > > > invalidates the page table in device upon mmu_notifier call back. But 
> > > > this cannot
> > > > solve the COW problem: if the user process A share a page P with 
> > > > device, and A 
> > > > forks a new process B, and it continue to write to the page. By COW, the
> > > > process B will keep the page P, while A will get a new page P'. But you 
> > > > have
> > > > no way to let the device know it should use P' rather than P.
> > > 
> > > Is this true? I thought mmu_notifiers covered all these cases.
> > > 
> > > The mm_notifier for A should fire if B causes the physical address of
> > > A's pages to change via COW. 
> > > 
> > > And this causes the device page tables to re-synchronize.
> > 
> > I don't see such code. The current do_cow_fault() implemenation has nothing 
> > to
> > do with mm_notifer.
> 
> Well, that sure sounds like it would be a bug in mmu_notifiers..

Yes, it can be taken that way:) But it is going to be a tough bug.

> 
> But considering Jean's SVA stuff seems based on mmu notifiers, I have
> a hard time believing that it has any different behavior from RDMA's
> ODP, and if it does have different behavior, then it is probably just
> a bug in the ODP implementation.

As Jean has explained, his solution is based on page table sharing. I think ODP
should also consider this new feature.

> 
> > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it 
> > > > support
> > > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you 
> > > > don't need
> > > > to write any code for that. Because it has been done by IOMMU 
> > > > framework. If it
> > > 
> > > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > > table in the device, not in the CPU.
> > > 
> > > The only case I know if that is different is the new-fangled CAPI
> > > stuff 

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Kenneth Lee
On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> Date: Mon, 19 Nov 2018 17:14:05 +0800
> From: Kenneth Lee 
> To: Leon Romanovsky 
> CC: Tim Sell , linux-...@vger.kernel.org,
>  Alexander Shishkin , Zaibo Xu
>  , zhangfei@foxmail.com, linux...@huawei.com,
>  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
>  , Gavin Schenk , RDMA mailing
>  list , Vinod Koul , Jason
>  Gunthorpe , Doug Ledford , Uwe
>  Kleine-König , David Kershner
>  , Kenneth Lee , Johan
>  Hovold , Cyrille Pitchen
>  , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Zhou Wang
>  , linux-cry...@vger.kernel.org, Philippe
>  Ombredanne , Sanyog Kale ,
>  "David S. Miller" ,
>  linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Message-ID: <20181119091405.GE157308@Turing-Arch-b>
> 
> On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > Date: Thu, 15 Nov 2018 16:54:55 +0200
> > From: Leon Romanovsky 
> > To: Kenneth Lee 
> > CC: Kenneth Lee , Tim Sell ,
> >  linux-...@vger.kernel.org, Alexander Shishkin
> >  , Zaibo Xu ,
> >  zhangfei@foxmail.com, linux...@huawei.com, haojian.zhu...@linaro.org,
> >  Christoph Lameter , Hao Fang , Gavin
> >  Schenk , RDMA mailing list
> >  , Zhou Wang , Jason
> >  Gunthorpe , Doug Ledford , Uwe
> >  Kleine-König , David Kershner
> >  , Johan Hovold , Cyrille
> >  Pitchen , Sagar Dharia
> >  , Jens Axboe ,
> >  guodong...@linaro.org, linux-netdev , Randy Dunlap
> >  , linux-kernel@vger.kernel.org, Vinod Koul
> >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> >  , Sanyog Kale , "David S.
> >  Miller" , linux-accelerat...@lists.ozlabs.org
> > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > User-Agent: Mutt/1.10.1 (2018-07-13)
> > Message-ID: <20181115145455.gn3...@mtr-leonro.mtl.com>
> > 
> > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > > From: Leon Romanovsky 
> > > > To: Kenneth Lee 
> > > > CC: Tim Sell , linux-...@vger.kernel.org,
> > > >  Alexander Shishkin , Zaibo Xu
> > > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > > >  , Gavin Schenk , RDMA 
> > > > mailing
> > > >  list , Zhou Wang ,
> > > >  Jason Gunthorpe , Doug Ledford , 
> > > > Uwe
> > > >  Kleine-König , David Kershner
> > > >  , Johan Hovold , Cyrille
> > > >  Pitchen , Sagar Dharia
> > > >  , Jens Axboe ,
> > > >  guodong...@linaro.org, linux-netdev , Randy 
> > > > Dunlap
> > > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > > >  , Sanyog Kale , Kenneth 
> > > > Lee
> > > >  , "David S. Miller" ,
> > > >  linux-accelerat...@lists.ozlabs.org
> > > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> > > >
> > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > >
> > > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > From: Kenneth Lee 
> > > > > > >
> > > > > > > WarpDrive is a general accelerator framework for the user 
> > > > > > > application to
> > > > > > > access the hardware without going through the kernel in data path.
> > > > > > >
> > > > > > > The kernel component to provide kernel facility to driver for 
> > > > > > > expose the
> > > > > > > user interface is called uacce. It a short name for
> > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > >
> > > > > > > This patch add document to explain how it works.
> > > > > > + RDMA and netdev folks
> > > > > >
> > > > > 

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Kenneth Lee
On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> Date: Mon, 19 Nov 2018 17:14:05 +0800
> From: Kenneth Lee 
> To: Leon Romanovsky 
> CC: Tim Sell , linux-...@vger.kernel.org,
>  Alexander Shishkin , Zaibo Xu
>  , zhangfei@foxmail.com, linux...@huawei.com,
>  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
>  , Gavin Schenk , RDMA mailing
>  list , Vinod Koul , Jason
>  Gunthorpe , Doug Ledford , Uwe
>  Kleine-König , David Kershner
>  , Kenneth Lee , Johan
>  Hovold , Cyrille Pitchen
>  , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Zhou Wang
>  , linux-cry...@vger.kernel.org, Philippe
>  Ombredanne , Sanyog Kale ,
>  "David S. Miller" ,
>  linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Message-ID: <20181119091405.GE157308@Turing-Arch-b>
> 
> On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > Date: Thu, 15 Nov 2018 16:54:55 +0200
> > From: Leon Romanovsky 
> > To: Kenneth Lee 
> > CC: Kenneth Lee , Tim Sell ,
> >  linux-...@vger.kernel.org, Alexander Shishkin
> >  , Zaibo Xu ,
> >  zhangfei@foxmail.com, linux...@huawei.com, haojian.zhu...@linaro.org,
> >  Christoph Lameter , Hao Fang , Gavin
> >  Schenk , RDMA mailing list
> >  , Zhou Wang , Jason
> >  Gunthorpe , Doug Ledford , Uwe
> >  Kleine-König , David Kershner
> >  , Johan Hovold , Cyrille
> >  Pitchen , Sagar Dharia
> >  , Jens Axboe ,
> >  guodong...@linaro.org, linux-netdev , Randy Dunlap
> >  , linux-kernel@vger.kernel.org, Vinod Koul
> >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> >  , Sanyog Kale , "David S.
> >  Miller" , linux-accelerat...@lists.ozlabs.org
> > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > User-Agent: Mutt/1.10.1 (2018-07-13)
> > Message-ID: <20181115145455.gn3...@mtr-leonro.mtl.com>
> > 
> > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > > From: Leon Romanovsky 
> > > > To: Kenneth Lee 
> > > > CC: Tim Sell , linux-...@vger.kernel.org,
> > > >  Alexander Shishkin , Zaibo Xu
> > > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > > >  , Gavin Schenk , RDMA 
> > > > mailing
> > > >  list , Zhou Wang ,
> > > >  Jason Gunthorpe , Doug Ledford , 
> > > > Uwe
> > > >  Kleine-König , David Kershner
> > > >  , Johan Hovold , Cyrille
> > > >  Pitchen , Sagar Dharia
> > > >  , Jens Axboe ,
> > > >  guodong...@linaro.org, linux-netdev , Randy 
> > > > Dunlap
> > > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > > >  , Sanyog Kale , Kenneth 
> > > > Lee
> > > >  , "David S. Miller" ,
> > > >  linux-accelerat...@lists.ozlabs.org
> > > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> > > >
> > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > >
> > > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > From: Kenneth Lee 
> > > > > > >
> > > > > > > WarpDrive is a general accelerator framework for the user 
> > > > > > > application to
> > > > > > > access the hardware without going through the kernel in data path.
> > > > > > >
> > > > > > > The kernel component to provide kernel facility to driver for 
> > > > > > > expose the
> > > > > > > user interface is called uacce. It a short name for
> > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > >
> > > > > > > This patch add document to explain how it works.
> > > > > > + RDMA and netdev folks
> > > > > >
> > > > > 

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Kenneth Lee
On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> Date: Thu, 15 Nov 2018 16:54:55 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: Kenneth Lee , Tim Sell ,
>  linux-...@vger.kernel.org, Alexander Shishkin
>  , Zaibo Xu ,
>  zhangfei@foxmail.com, linux...@huawei.com, haojian.zhu...@linaro.org,
>  Christoph Lameter , Hao Fang , Gavin
>  Schenk , RDMA mailing list
>  , Zhou Wang , Jason
>  Gunthorpe , Doug Ledford , Uwe
>  Kleine-König , David Kershner
>  , Johan Hovold , Cyrille
>  Pitchen , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Vinod Koul
>  , linux-cry...@vger.kernel.org, Philippe Ombredanne
>  , Sanyog Kale , "David S.
>  Miller" , linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.10.1 (2018-07-13)
> Message-ID: <20181115145455.gn3...@mtr-leonro.mtl.com>
> 
> On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > From: Leon Romanovsky 
> > > To: Kenneth Lee 
> > > CC: Tim Sell , linux-...@vger.kernel.org,
> > >  Alexander Shishkin , Zaibo Xu
> > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > >  , Gavin Schenk , RDMA 
> > > mailing
> > >  list , Zhou Wang ,
> > >  Jason Gunthorpe , Doug Ledford , Uwe
> > >  Kleine-König , David Kershner
> > >  , Johan Hovold , Cyrille
> > >  Pitchen , Sagar Dharia
> > >  , Jens Axboe ,
> > >  guodong...@linaro.org, linux-netdev , Randy 
> > > Dunlap
> > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > >  , Sanyog Kale , Kenneth 
> > > Lee
> > >  , "David S. Miller" ,
> > >  linux-accelerat...@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> > >
> > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > >
> > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > From: Kenneth Lee 
> > > > > >
> > > > > > WarpDrive is a general accelerator framework for the user 
> > > > > > application to
> > > > > > access the hardware without going through the kernel in data path.
> > > > > >
> > > > > > The kernel component to provide kernel facility to driver for 
> > > > > > expose the
> > > > > > user interface is called uacce. It a short name for
> > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > >
> > > > > > This patch add document to explain how it works.
> > > > > + RDMA and netdev folks
> > > > >
> > > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > > the description below it seems like you are reinventing RDMA verbs
> > > > > model. I have hard time to see the differences in the proposed
> > > > > framework to already implemented in drivers/infiniband/* for the 
> > > > > kernel
> > > > > space and for the https://github.com/linux-rdma/rdma-core/ for the 
> > > > > user
> > > > > space parts.
> > > >
> > > > Thanks Leon,
> > > >
> > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot 
> > > > from
> > > > the exist code of RDMA. But we we have to make a new one because we 
> > > > cannot
> > > > register accelerators such as AI operation, encryption or compression 
> > > > to the
> > > > RDMA framework:)
> > >
> > > Assuming that you did everything right and still failed to use RDMA
> > > framework, you was supposed to fix it and not to reinvent new exactly
> > > same one. It is how we develop kernel, by reusing existing code.
> >
> > Yes, but we don't force other system such as NIC or GPU into RDMA, do we?
> 
> You don't introduce new NIC or GPU, but proposing another interfac

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Kenneth Lee
On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> Date: Thu, 15 Nov 2018 16:54:55 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: Kenneth Lee , Tim Sell ,
>  linux-...@vger.kernel.org, Alexander Shishkin
>  , Zaibo Xu ,
>  zhangfei@foxmail.com, linux...@huawei.com, haojian.zhu...@linaro.org,
>  Christoph Lameter , Hao Fang , Gavin
>  Schenk , RDMA mailing list
>  , Zhou Wang , Jason
>  Gunthorpe , Doug Ledford , Uwe
>  Kleine-König , David Kershner
>  , Johan Hovold , Cyrille
>  Pitchen , Sagar Dharia
>  , Jens Axboe ,
>  guodong...@linaro.org, linux-netdev , Randy Dunlap
>  , linux-kernel@vger.kernel.org, Vinod Koul
>  , linux-cry...@vger.kernel.org, Philippe Ombredanne
>  , Sanyog Kale , "David S.
>  Miller" , linux-accelerat...@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.10.1 (2018-07-13)
> Message-ID: <20181115145455.gn3...@mtr-leonro.mtl.com>
> 
> On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > From: Leon Romanovsky 
> > > To: Kenneth Lee 
> > > CC: Tim Sell , linux-...@vger.kernel.org,
> > >  Alexander Shishkin , Zaibo Xu
> > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> > >  , Gavin Schenk , RDMA 
> > > mailing
> > >  list , Zhou Wang ,
> > >  Jason Gunthorpe , Doug Ledford , Uwe
> > >  Kleine-König , David Kershner
> > >  , Johan Hovold , Cyrille
> > >  Pitchen , Sagar Dharia
> > >  , Jens Axboe ,
> > >  guodong...@linaro.org, linux-netdev , Randy 
> > > Dunlap
> > >  , linux-kernel@vger.kernel.org, Vinod Koul
> > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > >  , Sanyog Kale , Kenneth 
> > > Lee
> > >  , "David S. Miller" ,
> > >  linux-accelerat...@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> > >
> > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > >
> > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > From: Kenneth Lee 
> > > > > >
> > > > > > WarpDrive is a general accelerator framework for the user 
> > > > > > application to
> > > > > > access the hardware without going through the kernel in data path.
> > > > > >
> > > > > > The kernel component to provide kernel facility to driver for 
> > > > > > expose the
> > > > > > user interface is called uacce. It a short name for
> > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > >
> > > > > > This patch add document to explain how it works.
> > > > > + RDMA and netdev folks
> > > > >
> > > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > > the description below it seems like you are reinventing RDMA verbs
> > > > > model. I have hard time to see the differences in the proposed
> > > > > framework to already implemented in drivers/infiniband/* for the 
> > > > > kernel
> > > > > space and for the https://github.com/linux-rdma/rdma-core/ for the 
> > > > > user
> > > > > space parts.
> > > >
> > > > Thanks Leon,
> > > >
> > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot 
> > > > from
> > > > the exist code of RDMA. But we we have to make a new one because we 
> > > > cannot
> > > > register accelerators such as AI operation, encryption or compression 
> > > > to the
> > > > RDMA framework:)
> > >
> > > Assuming that you did everything right and still failed to use RDMA
> > > framework, you was supposed to fix it and not to reinvent new exactly
> > > same one. It is how we develop kernel, by reusing existing code.
> >
> > Yes, but we don't force other system such as NIC or GPU into RDMA, do we?
> 
> You don't introduce new NIC or GPU, but proposing another interfac

[RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-12 Thread Kenneth Lee
From: Kenneth Lee 

WarpDrive is a general accelerator framework for the user application to
access the hardware without going through the kernel in data path.

The kernel component to provide kernel facility to driver for expose the
user interface is called uacce. It a short name for
"Unified/User-space-access-intended Accelerator Framework".

This patch add document to explain how it works.

Signed-off-by: Kenneth Lee 
---
 Documentation/warpdrive/warpdrive.rst   | 260 +++
 Documentation/warpdrive/wd-arch.svg | 764 
 Documentation/warpdrive/wd.svg  | 526 ++
 Documentation/warpdrive/wd_q_addr_space.svg | 359 +
 4 files changed, 1909 insertions(+)
 create mode 100644 Documentation/warpdrive/warpdrive.rst
 create mode 100644 Documentation/warpdrive/wd-arch.svg
 create mode 100644 Documentation/warpdrive/wd.svg
 create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg

diff --git a/Documentation/warpdrive/warpdrive.rst 
b/Documentation/warpdrive/warpdrive.rst
new file mode 100644
index ..ef84d3a2d462
--- /dev/null
+++ b/Documentation/warpdrive/warpdrive.rst
@@ -0,0 +1,260 @@
+Introduction of WarpDrive
+=
+
+*WarpDrive* is a general accelerator framework for the user application to
+access the hardware without going through the kernel in data path.
+
+It can be used as the quick channel for accelerators, network adaptors or
+other hardware for application in user space.
+
+This may make some implementation simpler.  E.g.  you can reuse most of the
+*netdev* driver in kernel and just share some ring buffer to the user space
+driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with
+the *netdev* in the user space as a https reversed proxy, etc.
+
+*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
+can share particular load from the CPU:
+
+.. image:: wd.svg
+:alt: WarpDrive Concept
+
+The virtual concept, queue, is used to manage the requests sent to the
+accelerator. The application send requests to the queue by writing to some
+particular address, while the hardware takes the requests directly from the
+address and send feedback accordingly.
+
+The format of the queue may differ from hardware to hardware. But the
+application need not to make any system call for the communication.
+
+*WarpDrive* tries to create a shared virtual address space for all involved
+accelerators. Within this space, the requests sent to queue can refer to any
+virtual address, which will be valid to the application and all involved
+accelerators.
+
+The name *WarpDrive* is simply a cool and general name meaning the framework
+makes the application faster. It includes general user library, kernel
+management module and drivers for the hardware. In kernel, the management
+module is called *uacce*, meaning "Unified/User-space-access-intended
+Accelerator Framework".
+
+
+How does it work
+
+
+*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
+
+*Uacce* creates a chrdev for the device registered to it. A "queue" will be
+created when the chrdev is opened. The application access the queue by mmap
+different address region of the queue file.
+
+The following figure demonstrated the queue file address space:
+
+.. image:: wd_q_addr_space.svg
+:alt: WarpDrive Queue Address Space
+
+The first region of the space, device region, is used for the application to
+write request or read answer to or from the hardware.
+
+Normally, there can be three types of device regions mmio and memory regions.
+It is recommended to use common memory for request/answer descriptors and use
+the mmio space for device notification, such as doorbell. But of course, this
+is all up to the interface designer.
+
+There can be two types of device memory regions, kernel-only and user-shared.
+This will be explained in the "kernel APIs" section.
+
+The Static Share Virtual Memory region is necessary only when the device IOMMU
+does not support "Share Virtual Memory". This will be explained after the
+*IOMMU* idea.
+
+
+Architecture
+
+
+The full *WarpDrive* architecture is represented in the following class
+diagram:
+
+.. image:: wd-arch.svg
+:alt: WarpDrive Architecture
+
+
+The user API
+
+
+We adopt a polling style interface in the user space: ::
+
+int wd_request_queue(struct wd_queue *q);
+void wd_release_queue(struct wd_queue *q);
+
+int wd_send(struct wd_queue *q, void *req);
+int wd_recv(struct wd_queue *q, void **req);
+int wd_recv_sync(struct wd_queue *q, void **req);
+void wd_flush(struct wd_queue *q);
+
+wd_recv_sync() is a wrapper to its non-sync version. It will trapped into
+kernel and waits until the queue become available.
+
+If the queue do not support SVA/SVM. The following helper function
+can be used to crea

[RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-12 Thread Kenneth Lee
From: Kenneth Lee 

WarpDrive is a general accelerator framework for the user application to
access the hardware without going through the kernel in data path.

The kernel component to provide kernel facility to driver for expose the
user interface is called uacce. It a short name for
"Unified/User-space-access-intended Accelerator Framework".

This patch add document to explain how it works.

Signed-off-by: Kenneth Lee 
---
 Documentation/warpdrive/warpdrive.rst   | 260 +++
 Documentation/warpdrive/wd-arch.svg | 764 
 Documentation/warpdrive/wd.svg  | 526 ++
 Documentation/warpdrive/wd_q_addr_space.svg | 359 +
 4 files changed, 1909 insertions(+)
 create mode 100644 Documentation/warpdrive/warpdrive.rst
 create mode 100644 Documentation/warpdrive/wd-arch.svg
 create mode 100644 Documentation/warpdrive/wd.svg
 create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg

diff --git a/Documentation/warpdrive/warpdrive.rst 
b/Documentation/warpdrive/warpdrive.rst
new file mode 100644
index ..ef84d3a2d462
--- /dev/null
+++ b/Documentation/warpdrive/warpdrive.rst
@@ -0,0 +1,260 @@
+Introduction of WarpDrive
+=
+
+*WarpDrive* is a general accelerator framework for the user application to
+access the hardware without going through the kernel in data path.
+
+It can be used as the quick channel for accelerators, network adaptors or
+other hardware for application in user space.
+
+This may make some implementation simpler.  E.g.  you can reuse most of the
+*netdev* driver in kernel and just share some ring buffer to the user space
+driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with
+the *netdev* in the user space as a https reversed proxy, etc.
+
+*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
+can share particular load from the CPU:
+
+.. image:: wd.svg
+:alt: WarpDrive Concept
+
+The virtual concept, queue, is used to manage the requests sent to the
+accelerator. The application send requests to the queue by writing to some
+particular address, while the hardware takes the requests directly from the
+address and send feedback accordingly.
+
+The format of the queue may differ from hardware to hardware. But the
+application need not to make any system call for the communication.
+
+*WarpDrive* tries to create a shared virtual address space for all involved
+accelerators. Within this space, the requests sent to queue can refer to any
+virtual address, which will be valid to the application and all involved
+accelerators.
+
+The name *WarpDrive* is simply a cool and general name meaning the framework
+makes the application faster. It includes general user library, kernel
+management module and drivers for the hardware. In kernel, the management
+module is called *uacce*, meaning "Unified/User-space-access-intended
+Accelerator Framework".
+
+
+How does it work
+
+
+*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
+
+*Uacce* creates a chrdev for the device registered to it. A "queue" will be
+created when the chrdev is opened. The application access the queue by mmap
+different address region of the queue file.
+
+The following figure demonstrated the queue file address space:
+
+.. image:: wd_q_addr_space.svg
+:alt: WarpDrive Queue Address Space
+
+The first region of the space, device region, is used for the application to
+write request or read answer to or from the hardware.
+
+Normally, there can be three types of device regions mmio and memory regions.
+It is recommended to use common memory for request/answer descriptors and use
+the mmio space for device notification, such as doorbell. But of course, this
+is all up to the interface designer.
+
+There can be two types of device memory regions, kernel-only and user-shared.
+This will be explained in the "kernel APIs" section.
+
+The Static Share Virtual Memory region is necessary only when the device IOMMU
+does not support "Share Virtual Memory". This will be explained after the
+*IOMMU* idea.
+
+
+Architecture
+
+
+The full *WarpDrive* architecture is represented in the following class
+diagram:
+
+.. image:: wd-arch.svg
+:alt: WarpDrive Architecture
+
+
+The user API
+
+
+We adopt a polling style interface in the user space: ::
+
+int wd_request_queue(struct wd_queue *q);
+void wd_release_queue(struct wd_queue *q);
+
+int wd_send(struct wd_queue *q, void *req);
+int wd_recv(struct wd_queue *q, void **req);
+int wd_recv_sync(struct wd_queue *q, void **req);
+void wd_flush(struct wd_queue *q);
+
+wd_recv_sync() is a wrapper to its non-sync version. It will trapped into
+kernel and waits until the queue become available.
+
+If the queue do not support SVA/SVM. The following helper function
+can be used to crea

Re: [PATCH 4/7] crypto: add hisilicon Queue Manager driver

2018-09-06 Thread Kenneth Lee
On Sun, Sep 02, 2018 at 07:15:07PM -0700, Randy Dunlap wrote:
> Date: Sun, 2 Sep 2018 19:15:07 -0700
> From: Randy Dunlap 
> To: Kenneth Lee , Jonathan Corbet ,
>  Herbert Xu , "David S . Miller"
>  , Joerg Roedel , Alex Williamson
>  , Kenneth Lee , Hao
>  Fang , Zhou Wang , Zaibo Xu
>  , Philippe Ombredanne , Greg
>  Kroah-Hartman , Thomas Gleixner
>  , linux-...@vger.kernel.org,
>  linux-kernel@vger.kernel.org, linux-cry...@vger.kernel.org,
>  io...@lists.linux-foundation.org, k...@vger.kernel.org,
>  linux-accelerat...@lists.ozlabs.org, Lu Baolu ,
>  Sanjay Kumar 
> CC: linux...@huawei.com
> Subject: Re: [PATCH 4/7] crypto: add hisilicon Queue Manager driver
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>  Thunderbird/52.9.1
> Message-ID: <4e46a451-d1cd-ac68-84b4-20792fdbc...@infradead.org>
> 
> On 09/02/2018 05:52 PM, Kenneth Lee wrote:
> > diff --git a/drivers/crypto/hisilicon/Kconfig 
> > b/drivers/crypto/hisilicon/Kconfig
> > index 8ca9c503bcb0..02a6eef84101 100644
> > --- a/drivers/crypto/hisilicon/Kconfig
> > +++ b/drivers/crypto/hisilicon/Kconfig
> > @@ -1,4 +1,8 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > +config CRYPTO_DEV_HISILICON
> > +   tristate "Support for HISILICON CRYPTO ACCELERATOR"
> > +   help
> > + Enable this to use Hisilicon Hardware Accelerators
> 
>   Accelerators.

Thanks, will change it in next version.

> 
> 
> -- 
> ~Randy

-- 
-Kenneth(Hisilicon)


本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁
止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中
的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件!
This e-mail and its attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed above.
Any use of the 
information contained herein in any way (including, but not limited to, total or
partial disclosure, reproduction, or dissemination) by persons other than the
intended 
recipient(s) is prohibited. If you receive this e-mail in error, please notify
the sender by phone or email immediately and delete it!



Re: [PATCH 4/7] crypto: add hisilicon Queue Manager driver

2018-09-06 Thread Kenneth Lee
On Sun, Sep 02, 2018 at 07:15:07PM -0700, Randy Dunlap wrote:
> Date: Sun, 2 Sep 2018 19:15:07 -0700
> From: Randy Dunlap 
> To: Kenneth Lee , Jonathan Corbet ,
>  Herbert Xu , "David S . Miller"
>  , Joerg Roedel , Alex Williamson
>  , Kenneth Lee , Hao
>  Fang , Zhou Wang , Zaibo Xu
>  , Philippe Ombredanne , Greg
>  Kroah-Hartman , Thomas Gleixner
>  , linux-...@vger.kernel.org,
>  linux-kernel@vger.kernel.org, linux-cry...@vger.kernel.org,
>  io...@lists.linux-foundation.org, k...@vger.kernel.org,
>  linux-accelerat...@lists.ozlabs.org, Lu Baolu ,
>  Sanjay Kumar 
> CC: linux...@huawei.com
> Subject: Re: [PATCH 4/7] crypto: add hisilicon Queue Manager driver
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>  Thunderbird/52.9.1
> Message-ID: <4e46a451-d1cd-ac68-84b4-20792fdbc...@infradead.org>
> 
> On 09/02/2018 05:52 PM, Kenneth Lee wrote:
> > diff --git a/drivers/crypto/hisilicon/Kconfig 
> > b/drivers/crypto/hisilicon/Kconfig
> > index 8ca9c503bcb0..02a6eef84101 100644
> > --- a/drivers/crypto/hisilicon/Kconfig
> > +++ b/drivers/crypto/hisilicon/Kconfig
> > @@ -1,4 +1,8 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > +config CRYPTO_DEV_HISILICON
> > +   tristate "Support for HISILICON CRYPTO ACCELERATOR"
> > +   help
> > + Enable this to use Hisilicon Hardware Accelerators
> 
>   Accelerators.

Thanks, will change it in next version.

> 
> 
> -- 
> ~Randy

-- 
-Kenneth(Hisilicon)


本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁
止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中
的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件!
This e-mail and its attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed above.
Any use of the 
information contained herein in any way (including, but not limited to, total or
partial disclosure, reproduction, or dissemination) by persons other than the
intended 
recipient(s) is prohibited. If you receive this e-mail in error, please notify
the sender by phone or email immediately and delete it!



Re: [RFC PATCH 3/7] vfio: add spimdev support

2018-08-05 Thread Kenneth Lee
On Thu, Aug 02, 2018 at 12:43:27PM -0600, Alex Williamson wrote:
> Date: Thu, 2 Aug 2018 12:43:27 -0600
> From: Alex Williamson 
> To: Cornelia Huck 
> CC: Kenneth Lee , "Tian, Kevin"
>  , Kenneth Lee , Jonathan Corbet
>  , Herbert Xu , "David S .
>  Miller" , Joerg Roedel , Hao Fang
>  , Zhou Wang , Zaibo Xu
>  , Philippe Ombredanne , "Greg
>  Kroah-Hartman" , Thomas Gleixner
>  , "linux-...@vger.kernel.org"
>  , "linux-kernel@vger.kernel.org"
>  , "linux-cry...@vger.kernel.org"
>  , "io...@lists.linux-foundation.org"
>  , "k...@vger.kernel.org"
>  , "linux-accelerat...@lists.ozlabs.org\"
>  , Lu Baolu
>  ,  Kumar",   , " linux...@huawei.com "
>  ">
> Subject: Re: [RFC PATCH 3/7] vfio: add spimdev support
> Message-ID: <20180802124327.403b1...@t450s.home>
> 
> On Thu, 2 Aug 2018 10:35:28 +0200
> Cornelia Huck  wrote:
> 
> > On Thu, 2 Aug 2018 15:34:40 +0800
> > Kenneth Lee  wrote:
> > 
> > > On Thu, Aug 02, 2018 at 04:24:22AM +, Tian, Kevin wrote:  
> > 
> > > > > From: Kenneth Lee [mailto:liguo...@hisilicon.com]
> > > > > Sent: Thursday, August 2, 2018 11:47 AM
> > > > > 
> > > > > >
> > > > > > > From: Kenneth Lee
> > > > > > > Sent: Wednesday, August 1, 2018 6:22 PM
> > > > > > >
> > > > > > > From: Kenneth Lee 
> > > > > > >
> > > > > > > SPIMDEV is "Share Parent IOMMU Mdev". It is a vfio-mdev. But 
> > > > > > > differ
> > > > > from
> > > > > > > the general vfio-mdev:
> > > > > > >
> > > > > > > 1. It shares its parent's IOMMU.
> > > > > > > 2. There is no hardware resource attached to the mdev is created. 
> > > > > > > The
> > > > > > > hardware resource (A `queue') is allocated only when the mdev is
> > > > > > > opened.
> > > > > >
> > > > > > Alex has concern on doing so, as pointed out in:
> > > > > >
> > > > > > https://www.spinics.net/lists/kvm/msg172652.html
> > > > > >
> > > > > > resource allocation should be reserved at creation time.
> > > > > 
> > > > > Yes. That is why I keep telling that SPIMDEV is not for "VM", it is 
> > > > > for "many
> > > > > processes", it is just an access point to the process. Not a device 
> > > > > to VM. I
> > > > > hope
> > > > > Alex can accept it:)
> > > > > 
> > > > 
> > > > VFIO is just about assigning device resource to user space. It doesn't 
> > > > care
> > > > whether it's native processes or VM using the device so far. Along the 
> > > > direction
> > > > which you described, looks VFIO needs to support the configuration that
> > > > some mdevs are used for native process only, while others can be used
> > > > for both native and VM. I'm not sure whether there is a clean way to
> > > > enforce it...
> > > 
> > > I had the same idea at the beginning. But finally I found that the life 
> > > cycle
> > > of the virtual device for VM and process were different. Consider you 
> > > create
> > > some mdevs for VM use, you will give all those mdevs to lib-virt, which
> > > distribute those mdev to VMs or containers. If the VM or container exits, 
> > > the
> > > mdev is returned to the lib-virt and used for next allocation. It is the
> > > administrator who controlled every mdev's allocation.
> 
> Libvirt currently does no management of mdev devices, so I believe
> this example is fictitious.  The extent of libvirt's interaction with
> mdev is that XML may specify an mdev UUID as the source for a hostdev
> and set the permissions on the device files appropriately.  Whether
> mdevs are created in advance and re-used or created and destroyed
> around a VM instance (for example via qemu hooks scripts) is not a
> policy that libvirt imposes.
>  
> > > But for process, it is different. There is no lib-virt in control. The
> > > administrator's intension is to grant some type of application to access 
> > > the
> > > hardware. The application can get a handle of the hardware, send request 
> > > and ge

Re: [RFC PATCH 3/7] vfio: add spimdev support

2018-08-05 Thread Kenneth Lee
On Thu, Aug 02, 2018 at 12:43:27PM -0600, Alex Williamson wrote:
> Date: Thu, 2 Aug 2018 12:43:27 -0600
> From: Alex Williamson 
> To: Cornelia Huck 
> CC: Kenneth Lee , "Tian, Kevin"
>  , Kenneth Lee , Jonathan Corbet
>  , Herbert Xu , "David S .
>  Miller" , Joerg Roedel , Hao Fang
>  , Zhou Wang , Zaibo Xu
>  , Philippe Ombredanne , "Greg
>  Kroah-Hartman" , Thomas Gleixner
>  , "linux-...@vger.kernel.org"
>  , "linux-kernel@vger.kernel.org"
>  , "linux-cry...@vger.kernel.org"
>  , "io...@lists.linux-foundation.org"
>  , "k...@vger.kernel.org"
>  , "linux-accelerat...@lists.ozlabs.org\"
>  , Lu Baolu
>  ,  Kumar",   , " linux...@huawei.com "
>  ">
> Subject: Re: [RFC PATCH 3/7] vfio: add spimdev support
> Message-ID: <20180802124327.403b1...@t450s.home>
> 
> On Thu, 2 Aug 2018 10:35:28 +0200
> Cornelia Huck  wrote:
> 
> > On Thu, 2 Aug 2018 15:34:40 +0800
> > Kenneth Lee  wrote:
> > 
> > > On Thu, Aug 02, 2018 at 04:24:22AM +, Tian, Kevin wrote:  
> > 
> > > > > From: Kenneth Lee [mailto:liguo...@hisilicon.com]
> > > > > Sent: Thursday, August 2, 2018 11:47 AM
> > > > > 
> > > > > >
> > > > > > > From: Kenneth Lee
> > > > > > > Sent: Wednesday, August 1, 2018 6:22 PM
> > > > > > >
> > > > > > > From: Kenneth Lee 
> > > > > > >
> > > > > > > SPIMDEV is "Share Parent IOMMU Mdev". It is a vfio-mdev. But 
> > > > > > > differ
> > > > > from
> > > > > > > the general vfio-mdev:
> > > > > > >
> > > > > > > 1. It shares its parent's IOMMU.
> > > > > > > 2. There is no hardware resource attached to the mdev is created. 
> > > > > > > The
> > > > > > > hardware resource (A `queue') is allocated only when the mdev is
> > > > > > > opened.
> > > > > >
> > > > > > Alex has concern on doing so, as pointed out in:
> > > > > >
> > > > > > https://www.spinics.net/lists/kvm/msg172652.html
> > > > > >
> > > > > > resource allocation should be reserved at creation time.
> > > > > 
> > > > > Yes. That is why I keep telling that SPIMDEV is not for "VM", it is 
> > > > > for "many
> > > > > processes", it is just an access point to the process. Not a device 
> > > > > to VM. I
> > > > > hope
> > > > > Alex can accept it:)
> > > > > 
> > > > 
> > > > VFIO is just about assigning device resource to user space. It doesn't 
> > > > care
> > > > whether it's native processes or VM using the device so far. Along the 
> > > > direction
> > > > which you described, looks VFIO needs to support the configuration that
> > > > some mdevs are used for native process only, while others can be used
> > > > for both native and VM. I'm not sure whether there is a clean way to
> > > > enforce it...
> > > 
> > > I had the same idea at the beginning. But finally I found that the life 
> > > cycle
> > > of the virtual device for VM and process were different. Consider you 
> > > create
> > > some mdevs for VM use, you will give all those mdevs to lib-virt, which
> > > distribute those mdev to VMs or containers. If the VM or container exits, 
> > > the
> > > mdev is returned to the lib-virt and used for next allocation. It is the
> > > administrator who controlled every mdev's allocation.
> 
> Libvirt currently does no management of mdev devices, so I believe
> this example is fictitious.  The extent of libvirt's interaction with
> mdev is that XML may specify an mdev UUID as the source for a hostdev
> and set the permissions on the device files appropriately.  Whether
> mdevs are created in advance and re-used or created and destroyed
> around a VM instance (for example via qemu hooks scripts) is not a
> policy that libvirt imposes.
>  
> > > But for process, it is different. There is no lib-virt in control. The
> > > administrator's intension is to grant some type of application to access 
> > > the
> > > hardware. The application can get a handle of the hardware, send request 
> > > and ge

Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive

2018-08-05 Thread Kenneth Lee
On Fri, Aug 03, 2018 at 03:20:43PM +0100, Alan Cox wrote:
> Date: Fri, 3 Aug 2018 15:20:43 +0100
> From: Alan Cox 
> To: Jerome Glisse 
> CC: "Tian, Kevin" , Kenneth Lee
>  , Hao Fang , Herbert Xu
>  , "k...@vger.kernel.org"
>  , Jonathan Corbet , Greg
>  Kroah-Hartman , "linux-...@vger.kernel.org"
>  , "Kumar, Sanjay K" ,
>  "io...@lists.linux-foundation.org" ,
>  "linux-kernel@vger.kernel.org" ,
>  "linux...@huawei.com" , Alex Williamson
>  , Thomas Gleixner ,
>  "linux-cry...@vger.kernel.org" , Philippe
>  Ombredanne , Zaibo Xu , Kenneth
>  Lee , "David S . Miller" ,
>  Ross Zwisler 
> Subject: Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive
> Organization: Intel Corporation
> X-Mailer: Claws Mail 3.16.0 (GTK+ 2.24.32; x86_64-redhat-linux-gnu)
> Message-ID: <20180803152043.40f88947@alans-desktop>
> 
> > If we are going to have any kind of general purpose accelerator API then
> > > it has to be able to implement things like  
> > 
> > Why is the existing driver model not good enough ? So you want
> > a device with function X you look into /dev/X (for instance
> > for GPU you look in /dev/dri)
> 
> Except when my GPU is in an FPGA in which case it might be somewhere else
> or it's a general purpose accelerator that happens to be usable as a GPU.
> Unusual today in big computer space but you'll find it in
> microcontrollers.
> 
> > Each of those device need a userspace driver and thus this
> > user space driver can easily knows where to look. I do not
> > expect that every application will reimplement those drivers
> > but instead use some kind of library that provide a high
> > level API for each of those devices.
> 
> Think about it from the user level. You have a pipeline of things you
> wish to execute, you need to get the right accelerator combinations and
> they need to fit together to meet system constraints like number of
> IOMMU ids the accelerator supports, where they are connected.
> 
> > Now you have a hierarchy of memory for the CPU (HBM, local
> > node main memory aka you DDR dimm, persistent memory) each
> 
> It's not a heirarchy, it's a graph. There's no fundamental reason two
> accelerators can't be close to two different CPU cores but have shared
> HBM that is far from each processor. There are physical reasons it tends
> to look more like a heirarchy today.
> 
> > Anyway i think finding devices and finding relation between
> > devices and memory is 2 separate problems and as such should
> > be handled separatly.
> 
> At a certain level they are deeply intertwined because you need a common
> API. It's not good if I want a particular accelerator and need to then
> see which API its under on this machine and which interface I have to
> use, and maybe have a mix of FPGA, WarpDrive and Google ASIC interfaces
> all different.
> 
> The job of the kernel is to impose some kind of sanity and unity on this
> lot.
> 
> All of it in the end comes down to
> 
> 'Somehow glue some chunk of memory into my address space and find any
> supporting driver I need'
> 

Agree. This is also our intension on WarpDrive. And it looks VFIO is the best
place to fulfill this requirement.

> plus virtualization of the above.
> 
> That bit's easy - but making it usable is a different story.
> 
> Alan

-- 
-Kenneth(Hisilicon)


本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁
止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中
的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件!
This e-mail and its attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed above.
Any use of the 
information contained herein in any way (including, but not limited to, total or
partial disclosure, reproduction, or dissemination) by persons other than the
intended 
recipient(s) is prohibited. If you receive this e-mail in error, please notify
the sender by phone or email immediately and delete it!



Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive

2018-08-05 Thread Kenneth Lee
On Fri, Aug 03, 2018 at 03:20:43PM +0100, Alan Cox wrote:
> Date: Fri, 3 Aug 2018 15:20:43 +0100
> From: Alan Cox 
> To: Jerome Glisse 
> CC: "Tian, Kevin" , Kenneth Lee
>  , Hao Fang , Herbert Xu
>  , "k...@vger.kernel.org"
>  , Jonathan Corbet , Greg
>  Kroah-Hartman , "linux-...@vger.kernel.org"
>  , "Kumar, Sanjay K" ,
>  "io...@lists.linux-foundation.org" ,
>  "linux-kernel@vger.kernel.org" ,
>  "linux...@huawei.com" , Alex Williamson
>  , Thomas Gleixner ,
>  "linux-cry...@vger.kernel.org" , Philippe
>  Ombredanne , Zaibo Xu , Kenneth
>  Lee , "David S . Miller" ,
>  Ross Zwisler 
> Subject: Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive
> Organization: Intel Corporation
> X-Mailer: Claws Mail 3.16.0 (GTK+ 2.24.32; x86_64-redhat-linux-gnu)
> Message-ID: <20180803152043.40f88947@alans-desktop>
> 
> > If we are going to have any kind of general purpose accelerator API then
> > > it has to be able to implement things like  
> > 
> > Why is the existing driver model not good enough ? So you want
> > a device with function X you look into /dev/X (for instance
> > for GPU you look in /dev/dri)
> 
> Except when my GPU is in an FPGA in which case it might be somewhere else
> or it's a general purpose accelerator that happens to be usable as a GPU.
> Unusual today in big computer space but you'll find it in
> microcontrollers.
> 
> > Each of those device need a userspace driver and thus this
> > user space driver can easily knows where to look. I do not
> > expect that every application will reimplement those drivers
> > but instead use some kind of library that provide a high
> > level API for each of those devices.
> 
> Think about it from the user level. You have a pipeline of things you
> wish to execute, you need to get the right accelerator combinations and
> they need to fit together to meet system constraints like number of
> IOMMU ids the accelerator supports, where they are connected.
> 
> > Now you have a hierarchy of memory for the CPU (HBM, local
> > node main memory aka you DDR dimm, persistent memory) each
> 
> It's not a heirarchy, it's a graph. There's no fundamental reason two
> accelerators can't be close to two different CPU cores but have shared
> HBM that is far from each processor. There are physical reasons it tends
> to look more like a heirarchy today.
> 
> > Anyway i think finding devices and finding relation between
> > devices and memory is 2 separate problems and as such should
> > be handled separatly.
> 
> At a certain level they are deeply intertwined because you need a common
> API. It's not good if I want a particular accelerator and need to then
> see which API its under on this machine and which interface I have to
> use, and maybe have a mix of FPGA, WarpDrive and Google ASIC interfaces
> all different.
> 
> The job of the kernel is to impose some kind of sanity and unity on this
> lot.
> 
> All of it in the end comes down to
> 
> 'Somehow glue some chunk of memory into my address space and find any
> supporting driver I need'
> 

Agree. This is also our intension on WarpDrive. And it looks VFIO is the best
place to fulfill this requirement.

> plus virtualization of the above.
> 
> That bit's easy - but making it usable is a different story.
> 
> Alan

-- 
-Kenneth(Hisilicon)


本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁
止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中
的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件!
This e-mail and its attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed above.
Any use of the 
information contained herein in any way (including, but not limited to, total or
partial disclosure, reproduction, or dissemination) by persons other than the
intended 
recipient(s) is prohibited. If you receive this e-mail in error, please notify
the sender by phone or email immediately and delete it!



[PATCH v4] IB/umem: Release pid in error and ODP flow

2017-01-04 Thread Kenneth Lee
1. Release pid before enter odp flow
2. Release pid when fail to allocate memory

Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from 
ib_umem_get")
Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
Reviewed-by: Haggai Eran <hagg...@mellanox.com>
---
Change from v1 to v2:
  Correcting the patch title and description  
Change from v2 to v3:
  Update the title and add "Fixes" fields in the description  
Change from v3 to v4:
  Keep the Fixes tag at the end of the description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f0cb28..4609b921f899 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH v4] IB/umem: Release pid in error and ODP flow

2017-01-04 Thread Kenneth Lee
1. Release pid before enter odp flow
2. Release pid when fail to allocate memory

Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from 
ib_umem_get")
Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
Signed-off-by: Kenneth Lee 
Reviewed-by: Haggai Eran 
---
Change from v1 to v2:
  Correcting the patch title and description  
Change from v2 to v3:
  Update the title and add "Fixes" fields in the description  
Change from v3 to v4:
  Keep the Fixes tag at the end of the description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f0cb28..4609b921f899 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-03 Thread Kenneth Lee
On Tue, Jan 03, 2017 at 12:12:24PM +0200, Leon Romanovsky wrote:
> Date: Tue, 3 Jan 2017 12:12:24 +0200
> From: Leon Romanovsky <l...@kernel.org>
> To: Kenneth Lee <liguo...@hisilicon.com>
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170103101224.GH12077@mtr-leonro.local>
> 
> On Tue, Jan 03, 2017 at 10:32:50AM +0800, Kenneth Lee wrote:
> > On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> > > Date: Sun, 1 Jan 2017 08:47:12 +0200
> > > From: Leon Romanovsky <l...@kernel.org>
> > > To: Kenneth Lee <liguo...@hisilicon.com>
> > > CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
> > >  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
> > >  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
> > >  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
> > >  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> > > User-Agent: Mutt/1.7.2 (2016-11-26)
> > > Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> > >
> > > On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > > > There are two bugfixes in this patch:
> > > >
> > > > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement 
> > > > mm->pinned_vm from ib_umem_get")
> > > > This patch introduce the get_task_pid but not put it back on 
> > > > all error
> > > > path
> > > >
> > > > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging 
> > > > regions")
> > > > This patch introduce a ODP flow without release pid before 
> > > > enter it
> > > >
> > > >
> > > > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > > > Reviewed-by: Haggai Eran <hagg...@mellanox.com>
> > > > ---
> > > > Change from v1 to v2:
> > > >   Correcting the patch title and description
> > > > Change from v2 to v3:
> > > >   Update the title and add "Fixes" fields in the description
> > >
> > > OK,
> > >
> > > I see that you still didn't read Documentation/SubmittingPatches. You
> > > must read that document before you are sending patches.
> > >
> > > But I'll stop here, the code is correct (it fixes bugs) and commit message
> > > more usefull than before.
> > >
> > >
> > > >
> > > >  drivers/infiniband/core/umem.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/infiniband/core/umem.c 
> > > > b/drivers/infiniband/core/umem.c
> > > > index 1e62a5f..4609b92 100644
> > > > --- a/drivers/infiniband/core/umem.c
> > > > +++ b/drivers/infiniband/core/umem.c
> > > > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > > > *context, unsigned long addr,
> > > >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> > > >
> > > > if (access & IB_ACCESS_ON_DEMAND) {
> > > > +   put_pid(umem->pid);
> > > > ret = ib_umem_odp_get(context, umem);
> > > > if (ret) {
> > > > kfree(umem);
> > > > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > > > *context, unsigned long addr,
> > > >
> > > > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > > > if (!page_list) {
> > > > +   put_pid(umem->pid);
> > > > kfree(umem);
> > > > return ERR_PTR(-ENOMEM);
> > > > }
> > > > --
> > > > 1.9.1
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > > the body of a message to majord...@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> > Thanks,
> >
> > I did read the doc, but maybe I mis-understant some points. Could you please
> > point it out?
> 
> Fixes line should be placed above bottom signatures.
> 
> As an example of properly written patch, you can take a look on the
> following patch [1] from Steve.
> 
> [1] http://marc.info/?l=linux-rdma=148244272205411=2

Thank you. A sample help a lot.

But please allow me to argue a little:
Documentation/process/submitting-patches.rst does really not mention where Fixes
tags should be put:)

> 
> >
> > And sorry. please ignore the last message. I forget to use a bottom-post 
> > style.
> >
> >
> >
> > --
> > -Kenneth(Hisilicon)



-- 
-Kenneth(Hisilicon)


Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-03 Thread Kenneth Lee
On Tue, Jan 03, 2017 at 12:12:24PM +0200, Leon Romanovsky wrote:
> Date: Tue, 3 Jan 2017 12:12:24 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170103101224.GH12077@mtr-leonro.local>
> 
> On Tue, Jan 03, 2017 at 10:32:50AM +0800, Kenneth Lee wrote:
> > On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> > > Date: Sun, 1 Jan 2017 08:47:12 +0200
> > > From: Leon Romanovsky 
> > > To: Kenneth Lee 
> > > CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
> > >  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
> > >  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
> > >  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
> > >  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> > > User-Agent: Mutt/1.7.2 (2016-11-26)
> > > Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> > >
> > > On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > > > There are two bugfixes in this patch:
> > > >
> > > > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement 
> > > > mm->pinned_vm from ib_umem_get")
> > > > This patch introduce the get_task_pid but not put it back on 
> > > > all error
> > > > path
> > > >
> > > > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging 
> > > > regions")
> > > > This patch introduce a ODP flow without release pid before 
> > > > enter it
> > > >
> > > >
> > > > Signed-off-by: Kenneth Lee 
> > > > Reviewed-by: Haggai Eran 
> > > > ---
> > > > Change from v1 to v2:
> > > >   Correcting the patch title and description
> > > > Change from v2 to v3:
> > > >   Update the title and add "Fixes" fields in the description
> > >
> > > OK,
> > >
> > > I see that you still didn't read Documentation/SubmittingPatches. You
> > > must read that document before you are sending patches.
> > >
> > > But I'll stop here, the code is correct (it fixes bugs) and commit message
> > > more usefull than before.
> > >
> > >
> > > >
> > > >  drivers/infiniband/core/umem.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/infiniband/core/umem.c 
> > > > b/drivers/infiniband/core/umem.c
> > > > index 1e62a5f..4609b92 100644
> > > > --- a/drivers/infiniband/core/umem.c
> > > > +++ b/drivers/infiniband/core/umem.c
> > > > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > > > *context, unsigned long addr,
> > > >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> > > >
> > > > if (access & IB_ACCESS_ON_DEMAND) {
> > > > +   put_pid(umem->pid);
> > > > ret = ib_umem_odp_get(context, umem);
> > > > if (ret) {
> > > > kfree(umem);
> > > > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > > > *context, unsigned long addr,
> > > >
> > > > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > > > if (!page_list) {
> > > > +   put_pid(umem->pid);
> > > > kfree(umem);
> > > > return ERR_PTR(-ENOMEM);
> > > > }
> > > > --
> > > > 1.9.1
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > > the body of a message to majord...@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> > Thanks,
> >
> > I did read the doc, but maybe I mis-understant some points. Could you please
> > point it out?
> 
> Fixes line should be placed above bottom signatures.
> 
> As an example of properly written patch, you can take a look on the
> following patch [1] from Steve.
> 
> [1] http://marc.info/?l=linux-rdma=148244272205411=2

Thank you. A sample help a lot.

But please allow me to argue a little:
Documentation/process/submitting-patches.rst does really not mention where Fixes
tags should be put:)

> 
> >
> > And sorry. please ignore the last message. I forget to use a bottom-post 
> > style.
> >
> >
> >
> > --
> > -Kenneth(Hisilicon)



-- 
-Kenneth(Hisilicon)


Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-02 Thread Kenneth Lee
On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> Date: Sun, 1 Jan 2017 08:47:12 +0200
> From: Leon Romanovsky <l...@kernel.org>
> To: Kenneth Lee <liguo...@hisilicon.com>
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm 
> > from ib_umem_get")
> > This patch introduce the get_task_pid but not put it back on all error
> > path
> >
> > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> > This patch introduce a ODP flow without release pid before enter it
> >
> >
> > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > Reviewed-by: Haggai Eran <hagg...@mellanox.com>
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> > Change from v2 to v3:
> >   Update the title and add "Fixes" fields in the description
> 
> OK,
> 
> I see that you still didn't read Documentation/SubmittingPatches. You
> must read that document before you are sending patches.
> 
> But I'll stop here, the code is correct (it fixes bugs) and commit message
> more usefull than before.
> 
> 
> >
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > --
> > 1.9.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks,

I did read the doc, but maybe I mis-understant some points. Could you please
point it out?

And sorry. please ignore the last message. I forget to use a bottom-post style.



-- 
-Kenneth(Hisilicon)


Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-02 Thread Kenneth Lee
On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> Date: Sun, 1 Jan 2017 08:47:12 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm 
> > from ib_umem_get")
> > This patch introduce the get_task_pid but not put it back on all error
> > path
> >
> > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> > This patch introduce a ODP flow without release pid before enter it
> >
> >
> > Signed-off-by: Kenneth Lee 
> > Reviewed-by: Haggai Eran 
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> > Change from v2 to v3:
> >   Update the title and add "Fixes" fields in the description
> 
> OK,
> 
> I see that you still didn't read Documentation/SubmittingPatches. You
> must read that document before you are sending patches.
> 
> But I'll stop here, the code is correct (it fixes bugs) and commit message
> more usefull than before.
> 
> 
> >
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > --
> > 1.9.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks,

I did read the doc, but maybe I mis-understant some points. Could you please
point it out?

And sorry. please ignore the last message. I forget to use a bottom-post style.



-- 
-Kenneth(Hisilicon)


Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-02 Thread Kenneth Lee
Thanks,

I did read the doc, but maybe I mis-understant some points. Could you please
point it out?

On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> Date: Sun, 1 Jan 2017 08:47:12 +0200
> From: Leon Romanovsky <l...@kernel.org>
> To: Kenneth Lee <liguo...@hisilicon.com>
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm 
> > from ib_umem_get")
> > This patch introduce the get_task_pid but not put it back on all error
> > path
> >
> > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> > This patch introduce a ODP flow without release pid before enter it
> >
> >
> > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > Reviewed-by: Haggai Eran <hagg...@mellanox.com>
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> > Change from v2 to v3:
> >   Update the title and add "Fixes" fields in the description
> 
> OK,
> 
> I see that you still didn't read Documentation/SubmittingPatches. You
> must read that document before you are sending patches.
> 
> But I'll stop here, the code is correct (it fixes bugs) and commit message
> more usefull than before.
> 
> 
> >
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > --
> > 1.9.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
-Kenneth(Hisilicon)


Re: [PATCH v3] IB/umem: Release pid in error and ODP flow

2017-01-02 Thread Kenneth Lee
Thanks,

I did read the doc, but maybe I mis-understant some points. Could you please
point it out?

On Sun, Jan 01, 2017 at 08:47:12AM +0200, Leon Romanovsky wrote:
> Date: Sun, 1 Jan 2017 08:47:12 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] IB/umem: Release pid in error and ODP flow
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20170101064712.GQ26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 06:18:29PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm 
> > from ib_umem_get")
> > This patch introduce the get_task_pid but not put it back on all error
> > path
> >
> > Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> > This patch introduce a ODP flow without release pid before enter it
> >
> >
> > Signed-off-by: Kenneth Lee 
> > Reviewed-by: Haggai Eran 
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> > Change from v2 to v3:
> >   Update the title and add "Fixes" fields in the description
> 
> OK,
> 
> I see that you still didn't read Documentation/SubmittingPatches. You
> must read that document before you are sending patches.
> 
> But I'll stop here, the code is correct (it fixes bugs) and commit message
> more usefull than before.
> 
> 
> >
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > --
> > 1.9.1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
-Kenneth(Hisilicon)


[PATCH v3] IB/umem: Release pid in error and ODP flow

2016-12-30 Thread Kenneth Lee
There are two bugfixes in this patch:

Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from 
ib_umem_get")
This patch introduce the get_task_pid but not put it back on all error
path

Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
This patch introduce a ODP flow without release pid before enter it


Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
Reviewed-by: Haggai Eran <hagg...@mellanox.com>
---
Change from v1 to v2:
  Correcting the patch title and description
Change from v2 to v3:
  Update the title and add "Fixes" fields in the description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH v3] IB/umem: Release pid in error and ODP flow

2016-12-30 Thread Kenneth Lee
There are two bugfixes in this patch:

Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from 
ib_umem_get")
This patch introduce the get_task_pid but not put it back on all error
path

Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
This patch introduce a ODP flow without release pid before enter it


Signed-off-by: Kenneth Lee 
Reviewed-by: Haggai Eran 
---
Change from v1 to v2:
  Correcting the patch title and description
Change from v2 to v3:
  Update the title and add "Fixes" fields in the description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-30 Thread Kenneth Lee
On Fri, Dec 30, 2016 at 08:55:10AM +0200, Leon Romanovsky wrote:
> Date: Fri, 30 Dec 2016 08:55:10 +0200
> From: Leon Romanovsky <l...@kernel.org>
> To: Kenneth Lee <liguo...@hisilicon.com>
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20161230065510.GL26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 12:50:11PM +0800, Kenneth Lee wrote:
> > Hi, Leon,
> >
> > 1. I do change the title except for the version number itself:) But my 
> > English
> > is quite bad, maybe the title is still quite stupid. I can update it 
> > according
> > to your advice.
> 
> Yes, please
> The main points are:
> 1. Remove "bugifix", it is not needed.
> 2. Use description in the title and not function names.
> 
> >
> > 2. I catched the bug by reading the final code, not by bisect-ing the old
> > commit. Do you means I should find out which commit introducing the bug? It 
> > will
> > not be easily to say which it is because it is a "missing bug", rather than 
> > a
> > "introduced bug". Indicate the commit may not help to remove a patch/commit 
> > from
> > the stable tree.
> 
> The fixes line won't cause for removal of commit, but to addition of
> yours on top of their code base.
> 
> git blame is your friend.
> 
> one fixes line is:
> Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> 
> and the second line is ! NOT !, you need to go deeper in the logs 
> !!
> Fixes: f7c6a7b5d599 ("IB/uverbs: Export ib_umem_get()/ib_umem_release() to 
> modules")
> 
> >
> > Could you please give more suggestion? Thanks.
> 
> Please, don't use top-posting for this mailing list.
> It is really-really annoying.
> 
> >
> > On Thu, Dec 29, 2016 at 10:17:56AM +0200, Leon Romanovsky wrote:
> > > Date: Thu, 29 Dec 2016 10:17:56 +0200
> > > From: Leon Romanovsky <l...@kernel.org>
> > > To: Kenneth Lee <liguo...@hisilicon.com>
> > > CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
> > >  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
> > >  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
> > >  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
> > >  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> > > User-Agent: Mutt/1.7.2 (2016-11-26)
> > > Message-ID: <20161229081756.GI26885@mtr-leonro.local>
> > >
> > > On Thu, Dec 29, 2016 at 04:27:28PM +0800, Kenneth Lee wrote:
> > > > There are two bugfixes in this patch:
> > > >
> > > > 1. When the execution go to the ib_umem_odp_get() path, pid should be 
> > > > put
> > > >back.
> > > > 2. When the memory allocation fail, the pid also should be put back 
> > > > before
> > > >exit.
> > > >
> > > > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > > > Reviewed-by: Haggai Eran <hagg...@mellanox.com>
> > > > ---
> > > > Change from v1 to v2:
> > > >   Correcting the patch title and description
> > >
> > > I don't see any changes except version in the title.
> > > What about anything like this?
> > > [PATCH v3] IB/umem: Release pid in error and ODP flows
> > >
> > > And Fixes line please, it will help to forward it to stable trees.
> > >
> > > Thanks
> >
> >
> >
> > --
> > -Kenneth(Hisilicon)

Very helpful. Thank you. I will send the Patch v3 soon.

-- 
-Kenneth(Hisilicon)


Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-30 Thread Kenneth Lee
On Fri, Dec 30, 2016 at 08:55:10AM +0200, Leon Romanovsky wrote:
> Date: Fri, 30 Dec 2016 08:55:10 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20161230065510.GL26885@mtr-leonro.local>
> 
> On Fri, Dec 30, 2016 at 12:50:11PM +0800, Kenneth Lee wrote:
> > Hi, Leon,
> >
> > 1. I do change the title except for the version number itself:) But my 
> > English
> > is quite bad, maybe the title is still quite stupid. I can update it 
> > according
> > to your advice.
> 
> Yes, please
> The main points are:
> 1. Remove "bugifix", it is not needed.
> 2. Use description in the title and not function names.
> 
> >
> > 2. I catched the bug by reading the final code, not by bisect-ing the old
> > commit. Do you means I should find out which commit introducing the bug? It 
> > will
> > not be easily to say which it is because it is a "missing bug", rather than 
> > a
> > "introduced bug". Indicate the commit may not help to remove a patch/commit 
> > from
> > the stable tree.
> 
> The fixes line won't cause for removal of commit, but to addition of
> yours on top of their code base.
> 
> git blame is your friend.
> 
> one fixes line is:
> Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
> 
> and the second line is ! NOT !, you need to go deeper in the logs 
> !!
> Fixes: f7c6a7b5d599 ("IB/uverbs: Export ib_umem_get()/ib_umem_release() to 
> modules")
> 
> >
> > Could you please give more suggestion? Thanks.
> 
> Please, don't use top-posting for this mailing list.
> It is really-really annoying.
> 
> >
> > On Thu, Dec 29, 2016 at 10:17:56AM +0200, Leon Romanovsky wrote:
> > > Date: Thu, 29 Dec 2016 10:17:56 +0200
> > > From: Leon Romanovsky 
> > > To: Kenneth Lee 
> > > CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
> > >  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
> > >  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
> > >  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
> > >  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> > > User-Agent: Mutt/1.7.2 (2016-11-26)
> > > Message-ID: <20161229081756.GI26885@mtr-leonro.local>
> > >
> > > On Thu, Dec 29, 2016 at 04:27:28PM +0800, Kenneth Lee wrote:
> > > > There are two bugfixes in this patch:
> > > >
> > > > 1. When the execution go to the ib_umem_odp_get() path, pid should be 
> > > > put
> > > >back.
> > > > 2. When the memory allocation fail, the pid also should be put back 
> > > > before
> > > >exit.
> > > >
> > > > Signed-off-by: Kenneth Lee 
> > > > Reviewed-by: Haggai Eran 
> > > > ---
> > > > Change from v1 to v2:
> > > >   Correcting the patch title and description
> > >
> > > I don't see any changes except version in the title.
> > > What about anything like this?
> > > [PATCH v3] IB/umem: Release pid in error and ODP flows
> > >
> > > And Fixes line please, it will help to forward it to stable trees.
> > >
> > > Thanks
> >
> >
> >
> > --
> > -Kenneth(Hisilicon)

Very helpful. Thank you. I will send the Patch v3 soon.

-- 
-Kenneth(Hisilicon)


Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-29 Thread Kenneth Lee
Hi, Leon,

1. I do change the title except for the version number itself:) But my English
is quite bad, maybe the title is still quite stupid. I can update it according
to your advice.

2. I catched the bug by reading the final code, not by bisect-ing the old
commit. Do you means I should find out which commit introducing the bug? It will
not be easily to say which it is because it is a "missing bug", rather than a
"introduced bug". Indicate the commit may not help to remove a patch/commit from
the stable tree.

Could you please give more suggestion? Thanks.

On Thu, Dec 29, 2016 at 10:17:56AM +0200, Leon Romanovsky wrote:
> Date: Thu, 29 Dec 2016 10:17:56 +0200
> From: Leon Romanovsky <l...@kernel.org>
> To: Kenneth Lee <liguo...@hisilicon.com>
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20161229081756.GI26885@mtr-leonro.local>
> 
> On Thu, Dec 29, 2016 at 04:27:28PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > 1. When the execution go to the ib_umem_odp_get() path, pid should be put
> >back.
> > 2. When the memory allocation fail, the pid also should be put back before
> >exit.
> >
> > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > Reviewed-by: Haggai Eran <hagg...@mellanox.com>
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> 
> I don't see any changes except version in the title.
> What about anything like this?
> [PATCH v3] IB/umem: Release pid in error and ODP flows
> 
> And Fixes line please, it will help to forward it to stable trees.
> 
> Thanks



-- 
-Kenneth(Hisilicon)


Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-29 Thread Kenneth Lee
Hi, Leon,

1. I do change the title except for the version number itself:) But my English
is quite bad, maybe the title is still quite stupid. I can update it according
to your advice.

2. I catched the bug by reading the final code, not by bisect-ing the old
commit. Do you means I should find out which commit introducing the bug? It will
not be easily to say which it is because it is a "missing bug", rather than a
"introduced bug". Indicate the commit may not help to remove a patch/commit from
the stable tree.

Could you please give more suggestion? Thanks.

On Thu, Dec 29, 2016 at 10:17:56AM +0200, Leon Romanovsky wrote:
> Date: Thu, 29 Dec 2016 10:17:56 +0200
> From: Leon Romanovsky 
> To: Kenneth Lee 
> CC: dledf...@redhat.com, sean.he...@intel.com, hal.rosenst...@gmail.com,
>  robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, ma...@mellanox.com,
>  linux-r...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()
> User-Agent: Mutt/1.7.2 (2016-11-26)
> Message-ID: <20161229081756.GI26885@mtr-leonro.local>
> 
> On Thu, Dec 29, 2016 at 04:27:28PM +0800, Kenneth Lee wrote:
> > There are two bugfixes in this patch:
> >
> > 1. When the execution go to the ib_umem_odp_get() path, pid should be put
> >back.
> > 2. When the memory allocation fail, the pid also should be put back before
> >exit.
> >
> > Signed-off-by: Kenneth Lee 
> > Reviewed-by: Haggai Eran 
> > ---
> > Change from v1 to v2:
> >   Correcting the patch title and description
> 
> I don't see any changes except version in the title.
> What about anything like this?
> [PATCH v3] IB/umem: Release pid in error and ODP flows
> 
> And Fixes line please, it will help to forward it to stable trees.
> 
> Thanks



-- 
-Kenneth(Hisilicon)


[PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-29 Thread Kenneth Lee
There are two bugfixes in this patch:

1. When the execution go to the ib_umem_odp_get() path, pid should be put
   back.
2. When the memory allocation fail, the pid also should be put back before
   exit.

Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
Reviewed-by: Haggai Eran <hagg...@mellanox.com>
---
Change from v1 to v2:
  Correcting the patch title and description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH v2] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-29 Thread Kenneth Lee
There are two bugfixes in this patch:

1. When the execution go to the ib_umem_odp_get() path, pid should be put
   back.
2. When the memory allocation fail, the pid also should be put back before
   exit.

Signed-off-by: Kenneth Lee 
Reviewed-by: Haggai Eran 
---
Change from v1 to v2:
  Correcting the patch title and description

 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



Re: [PATCH] ib umem: bug: put pid back before return from error path

2016-12-28 Thread Kenneth Lee
Hi, 

Sorry for the delay (I'd got some problem in my procmailrc file, and miss this 
mail).

The new patch, with title "[PATCH] ib umem: bugfix: mixed put_pid()s in
ib_umem_get()", has been sent.

On Thu, Dec 22, 2016 at 10:00:57AM +0200, Mark Bloch wrote:
> Date: Thu, 22 Dec 2016 10:00:57 +0200
> From: Mark Bloch <ma...@mellanox.com>
> To: Kenneth Lee <liguo...@hisilicon.com>, dledf...@redhat.com,
>  sean.he...@intel.com, hal.rosenst...@gmail.com
> CC: robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, linux-r...@vger.kernel.org,
>  linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ib umem: bug: put pid back before return from error
>  path
> User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101
>  Thunderbird/45.5.1
> Message-ID: <e470fbab-a330-b9c4-f2b3-65fb45f24...@mellanox.com>
> 
> Hi,
> 
> You have two bugs here:
> 1) When using ODP, ib_umem_release() checks for umem->odp_data != NULL 
>calls ib_umem_odp_release() and returns immediately without calling 
> put_pid().
>This one isn't in the error path so the title doesn't fit.
> 
> 2) In case the allocation failed, we return in -ENOMEM without calling 
> put_pid().
> 
> Can you please resend this with proper fixes line and a better description of 
> what is going on.
> 
> On 22/12/2016 09:11, Kenneth Lee wrote:
> > I catched this bug when reading the code. I'm sorry I have no hardware to 
> > test
> > it. But it is abviously a bug.
> > 
> > Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
> > ---
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >  
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > 
> 
> Mark.

-- 
-Kenneth(Hisilicon)


Re: [PATCH] ib umem: bug: put pid back before return from error path

2016-12-28 Thread Kenneth Lee
Hi, 

Sorry for the delay (I'd got some problem in my procmailrc file, and miss this 
mail).

The new patch, with title "[PATCH] ib umem: bugfix: mixed put_pid()s in
ib_umem_get()", has been sent.

On Thu, Dec 22, 2016 at 10:00:57AM +0200, Mark Bloch wrote:
> Date: Thu, 22 Dec 2016 10:00:57 +0200
> From: Mark Bloch 
> To: Kenneth Lee , dledf...@redhat.com,
>  sean.he...@intel.com, hal.rosenst...@gmail.com
> CC: robin.mur...@arm.com, jroe...@suse.de, egtv...@samfundet.no,
>  vgu...@synopsys.com, dave.han...@linux.intel.com, lstoa...@gmail.com,
>  k...@kernel.org, seb...@linux.vnet.ibm.com, linux-r...@vger.kernel.org,
>  linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ib umem: bug: put pid back before return from error
>  path
> User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101
>  Thunderbird/45.5.1
> Message-ID: 
> 
> Hi,
> 
> You have two bugs here:
> 1) When using ODP, ib_umem_release() checks for umem->odp_data != NULL 
>calls ib_umem_odp_release() and returns immediately without calling 
> put_pid().
>This one isn't in the error path so the title doesn't fit.
> 
> 2) In case the allocation failed, we return in -ENOMEM without calling 
> put_pid().
> 
> Can you please resend this with proper fixes line and a better description of 
> what is going on.
> 
> On 22/12/2016 09:11, Kenneth Lee wrote:
> > I catched this bug when reading the code. I'm sorry I have no hardware to 
> > test
> > it. But it is abviously a bug.
> > 
> > Signed-off-by: Kenneth Lee 
> > ---
> >  drivers/infiniband/core/umem.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 1e62a5f..4609b92 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >  
> > if (access & IB_ACCESS_ON_DEMAND) {
> > +   put_pid(umem->pid);
> > ret = ib_umem_odp_get(context, umem);
> > if (ret) {
> > kfree(umem);
> > @@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext 
> > *context, unsigned long addr,
> >  
> > page_list = (struct page **) __get_free_page(GFP_KERNEL);
> > if (!page_list) {
> > +   put_pid(umem->pid);
> > kfree(umem);
> > return ERR_PTR(-ENOMEM);
> > }
> > 
> 
> Mark.

-- 
-Kenneth(Hisilicon)


[PATCH] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-28 Thread Kenneth Lee
There are two bugfixes in this patch:

1. When the execution go to the ib_umem_odp_get() path, pid should be put
   back.
2. When the memory allocation fail, the pid also should be put back before
   exit.

Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
---
 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH] ib umem: bugfix: mixed put_pid()s in ib_umem_get()

2016-12-28 Thread Kenneth Lee
There are two bugfixes in this patch:

1. When the execution go to the ib_umem_odp_get() path, pid should be put
   back.
2. When the memory allocation fail, the pid also should be put back before
   exit.

Signed-off-by: Kenneth Lee 
---
 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH] ib umem: bug: put pid back before return from error path

2016-12-21 Thread Kenneth Lee
I catched this bug when reading the code. I'm sorry I have no hardware to test
it. But it is abviously a bug.

Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
---
 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH] ib umem: bug: put pid back before return from error path

2016-12-21 Thread Kenneth Lee
I catched this bug when reading the code. I'm sorry I have no hardware to test
it. But it is abviously a bug.

Signed-off-by: Kenneth Lee 
---
 drivers/infiniband/core/umem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 1e62a5f..4609b92 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -134,6 +134,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
 
if (access & IB_ACCESS_ON_DEMAND) {
+   put_pid(umem->pid);
ret = ib_umem_odp_get(context, umem);
if (ret) {
kfree(umem);
@@ -149,6 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
 
page_list = (struct page **) __get_free_page(GFP_KERNEL);
if (!page_list) {
+   put_pid(umem->pid);
kfree(umem);
return ERR_PTR(-ENOMEM);
}
-- 
1.9.1



[PATCH] [bugfix] replace unnessary ldax with common ldr

2016-08-30 Thread Kenneth Lee
(add comment for the previous mail, sorry for the duplication)

There is no store_ex pairing with this load_ex. It is not necessary and
gave wrong hint to the cache system.

Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
---
 arch/arm64/include/asm/spinlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/spinlock.h 
b/arch/arm64/include/asm/spinlock.h
index c85e96d..3334c4f 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -63,7 +63,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
 */
 "  sevl\n"
 "2:wfe\n"
-"  ldaxrh  %w2, %4\n"
+"  ldrh%w2, %4\n"
 "  eor %w1, %w2, %w0, lsr #16\n"
 "  cbnz%w1, 2b\n"
/* We got the lock. Critical section starts here. */
-- 
1.9.1



[PATCH] [bugfix] replace unnessary ldax with common ldr

2016-08-30 Thread Kenneth Lee
(add comment for the previous mail, sorry for the duplication)

There is no store_ex pairing with this load_ex. It is not necessary and
gave wrong hint to the cache system.

Signed-off-by: Kenneth Lee 
---
 arch/arm64/include/asm/spinlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/spinlock.h 
b/arch/arm64/include/asm/spinlock.h
index c85e96d..3334c4f 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -63,7 +63,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
 */
 "  sevl\n"
 "2:wfe\n"
-"  ldaxrh  %w2, %4\n"
+"  ldrh%w2, %4\n"
 "  eor %w1, %w2, %w0, lsr #16\n"
 "  cbnz%w1, 2b\n"
/* We got the lock. Critical section starts here. */
-- 
1.9.1



[PATCH] [bugfix] replace unnessary ldax with common ldr

2016-08-29 Thread Kenneth Lee
Signed-off-by: Kenneth Lee <liguo...@hisilicon.com>
---
 arch/arm64/include/asm/spinlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/spinlock.h 
b/arch/arm64/include/asm/spinlock.h
index c85e96d..3334c4f 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -63,7 +63,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
 */
 "  sevl\n"
 "2:wfe\n"
-"  ldaxrh  %w2, %4\n"
+"  ldrh%w2, %4\n"
 "  eor %w1, %w2, %w0, lsr #16\n"
 "  cbnz%w1, 2b\n"
/* We got the lock. Critical section starts here. */
-- 
1.9.1



[PATCH] [bugfix] replace unnessary ldax with common ldr

2016-08-29 Thread Kenneth Lee
Signed-off-by: Kenneth Lee 
---
 arch/arm64/include/asm/spinlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/spinlock.h 
b/arch/arm64/include/asm/spinlock.h
index c85e96d..3334c4f 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -63,7 +63,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
 */
 "  sevl\n"
 "2:wfe\n"
-"  ldaxrh  %w2, %4\n"
+"  ldrh%w2, %4\n"
 "  eor %w1, %w2, %w0, lsr #16\n"
 "  cbnz%w1, 2b\n"
/* We got the lock. Critical section starts here. */
-- 
1.9.1



Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once HANDEL_TX_MSG macro

2015-10-15 Thread Kenneth Lee
On Tue, Oct 13, 2015 at 04:18:23PM +0200, Arnd Bergmann wrote:
> Date: Tue, 13 Oct 2015 16:18:23 +0200
> From: Arnd Bergmann 
> To: Kenneth Lee 
> Cc: da...@davemloft.net, j...@perches.com, liguo...@hisilicon.com,
>  yisen.zhu...@huawei.com, net...@vger.kernel.org, linux...@huawei.com,
>  salil.me...@huawei.com, kenneth-lee-2...@foxmail.com,
>  xuw...@hisilicon.com, lisheng...@huawei.com, linux-kernel@vger.kernel.org,
>  huangdaode 
> Subject: Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the
>  once HANDEL_TX_MSG macro
> Message-ID: <6914069.XLdT9Eli48@wuerfel>
> 
> On Tuesday 13 October 2015 21:27:12 Kenneth Lee wrote:
> > 
> > Hi, Arnd,
> > 
> > Thank you for the comment. Yes, the io_base is a security problem, we
> > will fix it in coming patch soon.
> > 
> > But can we keep the sysfs? The interface from hnae is not used only by
> > ethernet driver but also by Open Data Plane driver. If we more it to
> > upper layers. Both drivers will have the same logic. 
> > 
> > So how about we just add documents to Documention/ABI?
> 
> Hi Kenneth,
> 
> In the end this is up to David Miller of course, but I'd say we are
> better off not introducing any ABIs for ODP prematurely.
> 
> We are talking about very generic statistics data, and you should
> already provide them for the ethernet driver using the standard
> interfaces.
> 
> I have not seen any discussion about adding an ODP subsystem for
> the Linux kernel, or what the API will be, but I think we should
> not export any interfaces from a particular device driver directly
> but always go through a common layer here and use an extensible
> interface that can be implemented by everyone.
> 
> The API has not been part of a release yet, so I'd say we should
> remove it for now. Once we have a net/odp/ directory, we can
> add a driver-independent implementation there and call it from
> the hisi driver.
> 
>   Arnd

Hi, Arnd,

Agree. We will remove the interface for the time being. Thank you.

-- 
-Kenneth Lee (Hisilicon)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once HANDEL_TX_MSG macro

2015-10-15 Thread Kenneth Lee
On Tue, Oct 13, 2015 at 04:18:23PM +0200, Arnd Bergmann wrote:
> Date: Tue, 13 Oct 2015 16:18:23 +0200
> From: Arnd Bergmann <a...@arndb.de>
> To: Kenneth Lee <kenneth-lee-2...@foxmail.com>
> Cc: da...@davemloft.net, j...@perches.com, liguo...@hisilicon.com,
>  yisen.zhu...@huawei.com, net...@vger.kernel.org, linux...@huawei.com,
>  salil.me...@huawei.com, kenneth-lee-2...@foxmail.com,
>  xuw...@hisilicon.com, lisheng...@huawei.com, linux-kernel@vger.kernel.org,
>  huangdaode <huangda...@hisilicon.com>
> Subject: Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the
>  once HANDEL_TX_MSG macro
> Message-ID: <6914069.XLdT9Eli48@wuerfel>
> 
> On Tuesday 13 October 2015 21:27:12 Kenneth Lee wrote:
> > 
> > Hi, Arnd,
> > 
> > Thank you for the comment. Yes, the io_base is a security problem, we
> > will fix it in coming patch soon.
> > 
> > But can we keep the sysfs? The interface from hnae is not used only by
> > ethernet driver but also by Open Data Plane driver. If we more it to
> > upper layers. Both drivers will have the same logic. 
> > 
> > So how about we just add documents to Documention/ABI?
> 
> Hi Kenneth,
> 
> In the end this is up to David Miller of course, but I'd say we are
> better off not introducing any ABIs for ODP prematurely.
> 
> We are talking about very generic statistics data, and you should
> already provide them for the ethernet driver using the standard
> interfaces.
> 
> I have not seen any discussion about adding an ODP subsystem for
> the Linux kernel, or what the API will be, but I think we should
> not export any interfaces from a particular device driver directly
> but always go through a common layer here and use an extensible
> interface that can be implemented by everyone.
> 
> The API has not been part of a release yet, so I'd say we should
> remove it for now. Once we have a net/odp/ directory, we can
> add a driver-independent implementation there and call it from
> the hisi driver.
> 
>   Arnd

Hi, Arnd,

Agree. We will remove the interface for the time being. Thank you.

-- 
-Kenneth Lee (Hisilicon)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once HANDEL_TX_MSG macro

2015-10-13 Thread Kenneth Lee
On Tue, Oct 13, 2015 at 03:06:21PM +0800, huangdaode wrote:
> Date: Tue, 13 Oct 2015 15:06:21 +0800
> From: huangdaode 
> To: Kenneth Lee 
> Subject: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once
>  HANDEL_TX_MSG macro
> Message-ID: <561cad6d.2060...@hisilicon.com>
> 
> Forwarded Message 
> 
> Subject: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once  
>  HANDEL_TX_MSG macro  
>Date: Mon, 12 Oct 2015 13:59:39 +0200  
>From: Arnd Bergmann 
>  To: huangdaode 
>  CC: da...@davemloft.net, j...@perches.com, liguo...@hisilicon.com,   
>  
>  yisen.zhu...@huawei.com, net...@vger.kernel.org, 
>  linux...@huawei.com, salil.me...@huawei.com, 
>  kenneth-lee-2...@foxmail.com, xuw...@hisilicon.com,  
>  lisheng...@huawei.com, linux-kernel@vger.kernel.org  
> 
>  On Monday 12 October 2015 11:23:44 huangdaode wrote:
>  > +   s += sprintf(s,
>  > +   "\t\ttx_ring on 
> %p:%u,%u,%u,%u,%u,%llu,%llu\n",
>  > +   h->qs[i]->tx_ring.io_base,
>  > +   h->qs[i]->tx_ring.buf_size,
>  > +   h->qs[i]->tx_ring.desc_num,
>  > +   h->qs[i]->tx_ring.max_desc_num_per_pkt,
>  > +   
> h->qs[i]->tx_ring.max_raw_data_sz_per_desc,
>  > +   h->qs[i]->tx_ring.max_pkt_size,
>  > +   h->qs[i]->tx_ring.stats.sw_err_cnt,
>  > +   h->qs[i]->tx_ring.stats.io_err_cnt);
> 
>  There is actually a more significant problem with this code, which I
>  failed to notice when doing the original bugfix:
> 
>  You have a sysfs interface here that exports internal data of the
>  device that should not be visible like this. One problem is that
>  the io_base is a kernel pointer that must not be visible to non-root
>  users (so we don't easily create an attack surface for exploits).
>  Another problem is that the format is not documented in Documentation/ABI/
>  and that you have multiple values in one sysfs file here.
> 
>  It would probably be better to completely remove that sysfs interface, and
>  to use the ethtool netlink interface to export them.
> 
>  Arnd
> 
>  .

Hi, Arnd,

Thank you for the comment. Yes, the io_base is a security problem, we
will fix it in coming patch soon.

But can we keep the sysfs? The interface from hnae is not used only by
ethernet driver but also by Open Data Plane driver. If we more it to
upper layers. Both drivers will have the same logic. 

So how about we just add documents to Documention/ABI?

Thanks
-- 
-Kenneth Lee (Hisilicon)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once HANDEL_TX_MSG macro

2015-10-13 Thread Kenneth Lee
On Tue, Oct 13, 2015 at 03:06:21PM +0800, huangdaode wrote:
> Date: Tue, 13 Oct 2015 15:06:21 +0800
> From: huangdaode <huangda...@hisilicon.com>
> To: Kenneth Lee <kenneth_lee_2...@126.com>
> Subject: Fwd: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once
>  HANDEL_TX_MSG macro
> Message-ID: <561cad6d.2060...@hisilicon.com>
> 
> Forwarded Message 
> 
> Subject: Re: [PATCH net-next v2 1/2] hisilicon net: removes the once  
>  HANDEL_TX_MSG macro  
>Date: Mon, 12 Oct 2015 13:59:39 +0200  
>From: Arnd Bergmann <a...@arndb.de>
>  To: huangdaode <huangda...@hisilicon.com>
>  CC: da...@davemloft.net, j...@perches.com, liguo...@hisilicon.com,   
>  
>  yisen.zhu...@huawei.com, net...@vger.kernel.org, 
>      linux...@huawei.com, salil.me...@huawei.com, 
>  kenneth-lee-2...@foxmail.com, xuw...@hisilicon.com,  
>  lisheng...@huawei.com, linux-kernel@vger.kernel.org  
> 
>  On Monday 12 October 2015 11:23:44 huangdaode wrote:
>  > +   s += sprintf(s,
>  > +   "\t\ttx_ring on 
> %p:%u,%u,%u,%u,%u,%llu,%llu\n",
>  > +   h->qs[i]->tx_ring.io_base,
>  > +   h->qs[i]->tx_ring.buf_size,
>  > +   h->qs[i]->tx_ring.desc_num,
>  > +   h->qs[i]->tx_ring.max_desc_num_per_pkt,
>  > +   
> h->qs[i]->tx_ring.max_raw_data_sz_per_desc,
>  > +   h->qs[i]->tx_ring.max_pkt_size,
>  > +   h->qs[i]->tx_ring.stats.sw_err_cnt,
>  > +   h->qs[i]->tx_ring.stats.io_err_cnt);
> 
>  There is actually a more significant problem with this code, which I
>  failed to notice when doing the original bugfix:
> 
>  You have a sysfs interface here that exports internal data of the
>  device that should not be visible like this. One problem is that
>  the io_base is a kernel pointer that must not be visible to non-root
>  users (so we don't easily create an attack surface for exploits).
>  Another problem is that the format is not documented in Documentation/ABI/
>  and that you have multiple values in one sysfs file here.
> 
>  It would probably be better to completely remove that sysfs interface, and
>  to use the ethtool netlink interface to export them.
> 
>  Arnd
> 
>  .

Hi, Arnd,

Thank you for the comment. Yes, the io_base is a security problem, we
will fix it in coming patch soon.

But can we keep the sysfs? The interface from hnae is not used only by
ethernet driver but also by Open Data Plane driver. If we more it to
upper layers. Both drivers will have the same logic. 

So how about we just add documents to Documention/ABI?

Thanks
-- 
-Kenneth Lee (Hisilicon)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 答复: [PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-27 Thread Kenneth Lee
On Fri, Aug 21, 2015 at 04:00:35PM +0200, Arnd Bergmann wrote:
> Date: Fri, 21 Aug 2015 16:00:35 +0200
> From: Arnd Bergmann 
> To: "Liguozhu (Kenneth)" 
> CC: "mark.rutl...@arm.com" ,
>  "devicet...@vger.kernel.org" ,
>  "pawel.m...@arm.com" , "ijc+devicet...@hellion.org.uk"
>  , "catalin.mari...@arm.com"
>  , "will.dea...@arm.com" ,
>  "linux-kernel@vger.kernel.org" , Linuxarm
>  , "paul.gortma...@windriver.com"
>  , "robh...@kernel.org" ,
>  "ga...@codeaurora.org" , "zhangfei@linaro.org"
>  , "net...@vger.kernel.org"
>  , "da...@davemloft.net" ,
>  "linux-arm-ker...@lists.infradead.org"
>  
> Subject: Re: 答复: [PATCH 1/5] net: add Hisilicon Network Subsystem
>  support (config and documents)
> User-Agent: KMail/4.11.5 (Linux/3.16.0-10-generic; KDE/4.11.5; x86_64; ; )
> Message-ID: <2543796.7JthO5WCfI@wuerfel>
> 
> On Monday 17 August 2015 01:28:07 Liguozhu wrote:
> > Thanks, Arnd. 
> > 
> > Regarding the ae-name: it is the name of the Acceleration Engine. It is 
> > provided
> > by the BIOS according to the position and the feature enabled of the IP.
> > So "soc0" means it is on SoC No. 0, while "n4" means it is running on 
> >"Non-dsaf mode 4". Ideally, we should setup the rule to name it. But as I
> > said in the patchset, the IP is original designed for a bare metal solution,
> > it is worthless to export all modes and we are planning to add more mode
> > for Linux itself in the IP in future version. So I think the better way is
> > to leave it as a "name" but add more meaning in the future.
> 
> The name property is a bit awkward. The position is normally implied by
> the location of the parent device in the DT, so you should not need that
> at all and instead derive it elsewhere. You can also add strings to the
> compatible property instead of this, to signify differences in the programming
> that are based on how the IP block is used.
>  
> > Regarding the ae-opts: it is the initial value for the AE's runtime options,
> > Currently, we have only "port number" (there are 6XGE+2GE port for a DSAF 
> > AE)
> > as option. But for future version, we will add other options such as "enable
> > Spanning Tree Protocol algorithm)" and so on. 
> 
> I think these can easily be converted into an index property and boolean
> flags (present if true, absent otherwise) for additional features.
>  
> > Should I add these background to somewhere?
> 
> The binding document needs to list all supported configurations, if you
> have a string property, describe specifically what strings are allowed
> and what they mean, but better try to avoid strings altogether.
> 
>   Arnd
> ___
> linuxarm mailing list
> linux...@huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

Dear Arnd,

We are working on the new PatchSet. I describe the new design here so in case
you can tell us if we make something wrong.

So now we will keep some attributes in enthernet node like this:

ethernet@0{
compatible = "hisilicon,hns-nic";
ae-name = "dsaf1";
port-id = <0>;
};

ae-name is simply a name referring to the name of dsa_name in SAF node.

port-id is the index of port provided by DSAF (the accelerator). DSAF can
connect to 8 PHYs. Port 0 to 1 are both used for adminstration purpose. They
are called debug ports. 

The remaining 6 PHYs are taken according to the mode of DSAF.

In NIC mode of DSAF, all 6 PHYs are taken as ethernet ports to the CPU. The
port-id can be 2 to 7. Here is the diagram:

+-+---+
|CPU  |
+-+-+-+---+-+-+-+-+-+-+
  | | | | | | | |
 debug   service 
 port port
 (0,1)   (2-7)

In Switch mode of DSAF, all 6 PHYs are taken as physical ports connect to a
LAN Switch while the CPU side assume itself have one single NIC connect to
this switch. In this case, the port-id will be 2 only.

+-+---+
|CPU  |
+-+-+-+---+-+-+-+-+-+-+
  | | | service port(2)
 debug   ++
 port|   switch   |
 (0,1)   +-+-+-+-+-+-++
   | | | | | |
external
 port


-- 
-Kenneth(Hisilicon)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 答复: [PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-27 Thread Kenneth Lee
On Fri, Aug 21, 2015 at 04:00:35PM +0200, Arnd Bergmann wrote:
 Date: Fri, 21 Aug 2015 16:00:35 +0200
 From: Arnd Bergmann a...@arndb.de
 To: Liguozhu (Kenneth) liguo...@hisilicon.com
 CC: mark.rutl...@arm.com mark.rutl...@arm.com,
  devicet...@vger.kernel.org devicet...@vger.kernel.org,
  pawel.m...@arm.com pawel.m...@arm.com, ijc+devicet...@hellion.org.uk
  ijc+devicet...@hellion.org.uk, catalin.mari...@arm.com
  catalin.mari...@arm.com, will.dea...@arm.com will.dea...@arm.com,
  linux-kernel@vger.kernel.org linux-kernel@vger.kernel.org, Linuxarm
  linux...@huawei.com, paul.gortma...@windriver.com
  paul.gortma...@windriver.com, robh...@kernel.org robh...@kernel.org,
  ga...@codeaurora.org ga...@codeaurora.org, zhangfei@linaro.org
  zhangfei@linaro.org, net...@vger.kernel.org
  net...@vger.kernel.org, da...@davemloft.net da...@davemloft.net,
  linux-arm-ker...@lists.infradead.org
  linux-arm-ker...@lists.infradead.org
 Subject: Re: 答复: [PATCH 1/5] net: add Hisilicon Network Subsystem
  support (config and documents)
 User-Agent: KMail/4.11.5 (Linux/3.16.0-10-generic; KDE/4.11.5; x86_64; ; )
 Message-ID: 2543796.7JthO5WCfI@wuerfel
 
 On Monday 17 August 2015 01:28:07 Liguozhu wrote:
  Thanks, Arnd. 
  
  Regarding the ae-name: it is the name of the Acceleration Engine. It is 
  provided
  by the BIOS according to the position and the feature enabled of the IP.
  So soc0 means it is on SoC No. 0, while n4 means it is running on 
 Non-dsaf mode 4. Ideally, we should setup the rule to name it. But as I
  said in the patchset, the IP is original designed for a bare metal solution,
  it is worthless to export all modes and we are planning to add more mode
  for Linux itself in the IP in future version. So I think the better way is
  to leave it as a name but add more meaning in the future.
 
 The name property is a bit awkward. The position is normally implied by
 the location of the parent device in the DT, so you should not need that
 at all and instead derive it elsewhere. You can also add strings to the
 compatible property instead of this, to signify differences in the programming
 that are based on how the IP block is used.
  
  Regarding the ae-opts: it is the initial value for the AE's runtime options,
  Currently, we have only port number (there are 6XGE+2GE port for a DSAF 
  AE)
  as option. But for future version, we will add other options such as enable
  Spanning Tree Protocol algorithm) and so on. 
 
 I think these can easily be converted into an index property and boolean
 flags (present if true, absent otherwise) for additional features.
  
  Should I add these background to somewhere?
 
 The binding document needs to list all supported configurations, if you
 have a string property, describe specifically what strings are allowed
 and what they mean, but better try to avoid strings altogether.
 
   Arnd
 ___
 linuxarm mailing list
 linux...@huawei.com
 http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm

Dear Arnd,

We are working on the new PatchSet. I describe the new design here so in case
you can tell us if we make something wrong.

So now we will keep some attributes in enthernet node like this:

ethernet@0{
compatible = hisilicon,hns-nic;
ae-name = dsaf1;
port-id = 0;
};

ae-name is simply a name referring to the name of dsa_name in SAF node.

port-id is the index of port provided by DSAF (the accelerator). DSAF can
connect to 8 PHYs. Port 0 to 1 are both used for adminstration purpose. They
are called debug ports. 

The remaining 6 PHYs are taken according to the mode of DSAF.

In NIC mode of DSAF, all 6 PHYs are taken as ethernet ports to the CPU. The
port-id can be 2 to 7. Here is the diagram:

+-+---+
|CPU  |
+-+-+-+---+-+-+-+-+-+-+
  | | | | | | | |
 debug   service 
 port port
 (0,1)   (2-7)

In Switch mode of DSAF, all 6 PHYs are taken as physical ports connect to a
LAN Switch while the CPU side assume itself have one single NIC connect to
this switch. In this case, the port-id will be 2 only.

+-+---+
|CPU  |
+-+-+-+---+-+-+-+-+-+-+
  | | | service port(2)
 debug   ++
 port|   switch   |
 (0,1)   +-+-+-+-+-+-++
   | | | | | |
external
 port


-- 
-Kenneth(Hisilicon)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support

2015-08-21 Thread Kenneth Lee
Thanks, Klimov,

You are right. I will fix it in next patches.

On Tue, Aug 18, 2015 at 03:12:02AM +0300, Alexey Klimov wrote:
> Date: Tue, 18 Aug 2015 03:12:02 +0300
> From: Alexey Klimov 
> To: Kenneth Lee 
> CC: robh...@kernel.org, pawel.m...@arm.com, Mark Rutland
>  , ijc+devicet...@hellion.org.uk, Kumar Gala
>  , Catalin Marinas , Will
>  Deacon , yisen.zhu...@huawei.com, "David S. Miller"
>  , paul.gortma...@windriver.com,
>  dingtianh...@huawei.com, zhangfei@linaro.org,
>  devicet...@vger.kernel.org, Linux Kernel Mailing List
>  , linux-arm-ker...@lists.infradead.org,
>  net...@vger.kernel.org, linux...@huawei.com, salil.me...@huawei.com,
>  huangda...@hisilicon.com, Kenneth Lee , Yury Norov
>  
> Subject: Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae
>  framework support
> Message-ID: 
> 
> 
> Hi Kenneth,
> 
> just small minor question.
> 
> On Fri, Aug 14, 2015 at 1:30 PM, Kenneth Lee  wrote:
> > HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
> > unified ring buffer interface for Hisilicon Network Acceleration Engines.
> >
> > With the interface, upper layer can work as ethernet driver, ODP driver or
> > other service driver on purpose.
> >
> > Signed-off-by: Kenneth Lee 
> > Signed-off-by: Yisen Zhuang 
> > ---
> >  drivers/net/ethernet/hisilicon/Kconfig  |  33 +-
> >  drivers/net/ethernet/hisilicon/Makefile |   1 +
> >  drivers/net/ethernet/hisilicon/hns/Makefile |  15 +
> >  drivers/net/ethernet/hisilicon/hns/hnae.c   | 494 +++
> >  drivers/net/ethernet/hisilicon/hns/hnae.h   | 582 
> > 
> >  5 files changed, 1124 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h
> >
> > diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
> > b/drivers/net/ethernet/hisilicon/Kconfig
> > index dead17b..1e4f5a7 100644
> > --- a/drivers/net/ethernet/hisilicon/Kconfig
> > +++ b/drivers/net/ethernet/hisilicon/Kconfig
> > @@ -5,7 +5,7 @@
> >  config NET_VENDOR_HISILICON
> > bool "Hisilicon devices"
> > default y
> > -   depends on ARM
> > +   depends on ARM || ARM64
> > ---help---
> >   If you have a network (Ethernet) card belonging to this class, 
> > say Y.
> >
> > @@ -31,4 +31,35 @@ config HIP04_ETH
> >   If you wish to compile a kernel for a hardware with hisilicon p04 
> > SoC and
> >   want to use the internal ethernet then you should answer Y to 
> > this.
> >
> > +config HNS
> > +   tristate "Hisilicon Network Subsystem Support (Framework)"
> > +   ---help---
> > + This selects the framework support for Hisilicon Network 
> > Subsystem. It
> > + is needed by any driver which provides HNS acceleration engine or 
> > make
> > + use of the engine
> > +
> > +config HNS_DSAF
> > +   tristate "Hisilicon HNS DSAF device Support"
> > +   select HNS
> > +   select HNS_MDIO
> > +   ---help---
> > + This selects the DSAF (Distributed System Area Frabric) network
> > + acceleration engine support. The engine is used in Hisilicon P660,
> > + Hi1610 and further ICT SoC
> > +
> > +config HNS_MDIO
> > +   tristate "Hisilicon HNS MDIO device Support"
> > +   select MDIO
> > +   ---help---
> > + This selects the HNS MDIO support. It is needed by HNS_DSAF to 
> > access
> > + the PHY
> > +
> > +config HNS_ENET
> > +   tristate "Hisilicon HNS Ethernet Device Support"
> > +   select PHYLIB
> > +   select HNS
> > +   ---help---
> > + This selects the general ethernet driver for HNS.  This module 
> > make
> > + use of any HNS AE driver, such as HNS_DSAF
> > +
> >  endif # NET_VENDOR_HISILICON
> > diff --git a/drivers/net/ethernet/hisilicon/Makefile 
> > b/drivers/net/ethernet/hisilicon/Makefile
> > index 6c14540..2503a9b 100644
> > --- a/drivers/net/ethernet/hisilicon/Makefile
> > +++ b/drivers/net/ethernet/hisilicon/Makefile
> > @@ -4,3 +4,4 @@
> >
> >  obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
> >  obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
> > +obj-$(CONFIG_HNS) += hns/
> > diff --git a/drivers/net/ethernet/

Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support

2015-08-21 Thread Kenneth Lee
Thanks, Klimov,

You are right. I will fix it in next patches.

On Tue, Aug 18, 2015 at 03:12:02AM +0300, Alexey Klimov wrote:
 Date: Tue, 18 Aug 2015 03:12:02 +0300
 From: Alexey Klimov klimov.li...@gmail.com
 To: Kenneth Lee liguo...@hisilicon.com
 CC: robh...@kernel.org, pawel.m...@arm.com, Mark Rutland
  mark.rutl...@arm.com, ijc+devicet...@hellion.org.uk, Kumar Gala
  ga...@codeaurora.org, Catalin Marinas catalin.mari...@arm.com, Will
  Deacon will.dea...@arm.com, yisen.zhu...@huawei.com, David S. Miller
  da...@davemloft.net, paul.gortma...@windriver.com,
  dingtianh...@huawei.com, zhangfei@linaro.org,
  devicet...@vger.kernel.org, Linux Kernel Mailing List
  linux-kernel@vger.kernel.org, linux-arm-ker...@lists.infradead.org,
  net...@vger.kernel.org, linux...@huawei.com, salil.me...@huawei.com,
  huangda...@hisilicon.com, Kenneth Lee liguo...@huawei.com, Yury Norov
  yury.no...@gmail.com
 Subject: Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae
  framework support
 Message-ID: 
 CALW4P+J8LkLshu5TuRT+8c__KRwJ8XAdMV4yA0KEnrfUg=m...@mail.gmail.com
 
 Hi Kenneth,
 
 just small minor question.
 
 On Fri, Aug 14, 2015 at 1:30 PM, Kenneth Lee liguo...@hisilicon.com wrote:
  HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
  unified ring buffer interface for Hisilicon Network Acceleration Engines.
 
  With the interface, upper layer can work as ethernet driver, ODP driver or
  other service driver on purpose.
 
  Signed-off-by: Kenneth Lee liguo...@huawei.com
  Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
  ---
   drivers/net/ethernet/hisilicon/Kconfig  |  33 +-
   drivers/net/ethernet/hisilicon/Makefile |   1 +
   drivers/net/ethernet/hisilicon/hns/Makefile |  15 +
   drivers/net/ethernet/hisilicon/hns/hnae.c   | 494 +++
   drivers/net/ethernet/hisilicon/hns/hnae.h   | 582 
  
   5 files changed, 1124 insertions(+), 1 deletion(-)
   create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
   create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
   create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h
 
  diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
  b/drivers/net/ethernet/hisilicon/Kconfig
  index dead17b..1e4f5a7 100644
  --- a/drivers/net/ethernet/hisilicon/Kconfig
  +++ b/drivers/net/ethernet/hisilicon/Kconfig
  @@ -5,7 +5,7 @@
   config NET_VENDOR_HISILICON
  bool Hisilicon devices
  default y
  -   depends on ARM
  +   depends on ARM || ARM64
  ---help---
If you have a network (Ethernet) card belonging to this class, 
  say Y.
 
  @@ -31,4 +31,35 @@ config HIP04_ETH
If you wish to compile a kernel for a hardware with hisilicon p04 
  SoC and
want to use the internal ethernet then you should answer Y to 
  this.
 
  +config HNS
  +   tristate Hisilicon Network Subsystem Support (Framework)
  +   ---help---
  + This selects the framework support for Hisilicon Network 
  Subsystem. It
  + is needed by any driver which provides HNS acceleration engine or 
  make
  + use of the engine
  +
  +config HNS_DSAF
  +   tristate Hisilicon HNS DSAF device Support
  +   select HNS
  +   select HNS_MDIO
  +   ---help---
  + This selects the DSAF (Distributed System Area Frabric) network
  + acceleration engine support. The engine is used in Hisilicon P660,
  + Hi1610 and further ICT SoC
  +
  +config HNS_MDIO
  +   tristate Hisilicon HNS MDIO device Support
  +   select MDIO
  +   ---help---
  + This selects the HNS MDIO support. It is needed by HNS_DSAF to 
  access
  + the PHY
  +
  +config HNS_ENET
  +   tristate Hisilicon HNS Ethernet Device Support
  +   select PHYLIB
  +   select HNS
  +   ---help---
  + This selects the general ethernet driver for HNS.  This module 
  make
  + use of any HNS AE driver, such as HNS_DSAF
  +
   endif # NET_VENDOR_HISILICON
  diff --git a/drivers/net/ethernet/hisilicon/Makefile 
  b/drivers/net/ethernet/hisilicon/Makefile
  index 6c14540..2503a9b 100644
  --- a/drivers/net/ethernet/hisilicon/Makefile
  +++ b/drivers/net/ethernet/hisilicon/Makefile
  @@ -4,3 +4,4 @@
 
   obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
   obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
  +obj-$(CONFIG_HNS) += hns/
  diff --git a/drivers/net/ethernet/hisilicon/hns/Makefile 
  b/drivers/net/ethernet/hisilicon/hns/Makefile
  new file mode 100644
  index 000..6680602
  --- /dev/null
  +++ b/drivers/net/ethernet/hisilicon/hns/Makefile
  @@ -0,0 +1,15 @@
  +#
  +# Makefile for the HISILICON network device drivers.
  +#
  +
  +obj-$(CONFIG_HNS) += hnae.o
  +
  +obj-$(CONFIG_HNS_DSAF) += hns_dsaf.o
  +hns_dsaf-objs = hns_ae_adapt.o hns_dsaf_gmac.o hns_dsaf_mac.o 
  hns_dsaf_misc.o \
  +   hns_dsaf_main.o hns_dsaf_ppe.o hns_dsaf_rcb.o hns_dsaf_xgmac.o
  +
  +obj

Re: [PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-17 Thread Kenneth Lee
Thanks, Arnd, 

You are right. This is the same IP as hip04_mdio.c. We just mis-understand the
hardware design. We will merge them and re-submit the patches.

On Fri, Aug 14, 2015 at 10:57:28PM +0200, Arnd Bergmann wrote:
> On Friday 14 August 2015 18:30:20 Kenneth Lee wrote:
> 
> > +#define MDIO_BASE_ADDR 0x403C
> 
> Does not belong in here (and is not used)
> 
> > +#define MDIO_COMMAND_REG   0x0
> > +#define MDIO_ADDR_REG  0x4
> > +#define MDIO_WDATA_REG 0x8
> > +#define MDIO_RDATA_REG 0xc
> > +#define MDIO_STA_REG   0x10
> 
> These look suspiciously similar to definitions from
> drivers/net/ethernet/hisilicon/hip04_mdio.c.
> 
> Could the hardware be related? If so, please try to share
> the common parts.
> 
> > +static inline void mdio_write_reg(void *base, u32 reg, u32 value)
> > +{
> > +   u8 __iomem *reg_addr = ACCESS_ONCE(base);
> > +
> > +   writel(value, reg_addr + reg);
> > +}
> > +
> > +#define MDIO_WRITE_REG(a, reg, value) \
> > +   mdio_write_reg((a)->vbase, (reg), (value))
> > 
> 
> Something seems wrong here: why do you have an ACCESS_ONCE() on a
> local variable? Doesn't this just make the code less efficient
> without providing lockless access to shared variables?
> 
> The types are inconsistent here, you should get a warning from
> running this through 'make C=1' because of the missing __iomem
> annotation of the pointer.
> 
> Also, why both a macro and an inline function? Just use an inline
> function.
> 
>   Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-17 Thread Kenneth Lee
Thanks, Arnd, 

You are right. This is the same IP as hip04_mdio.c. We just mis-understand the
hardware design. We will merge them and re-submit the patches.

On Fri, Aug 14, 2015 at 10:57:28PM +0200, Arnd Bergmann wrote:
 On Friday 14 August 2015 18:30:20 Kenneth Lee wrote:
 
  +#define MDIO_BASE_ADDR 0x403C
 
 Does not belong in here (and is not used)
 
  +#define MDIO_COMMAND_REG   0x0
  +#define MDIO_ADDR_REG  0x4
  +#define MDIO_WDATA_REG 0x8
  +#define MDIO_RDATA_REG 0xc
  +#define MDIO_STA_REG   0x10
 
 These look suspiciously similar to definitions from
 drivers/net/ethernet/hisilicon/hip04_mdio.c.
 
 Could the hardware be related? If so, please try to share
 the common parts.
 
  +static inline void mdio_write_reg(void *base, u32 reg, u32 value)
  +{
  +   u8 __iomem *reg_addr = ACCESS_ONCE(base);
  +
  +   writel(value, reg_addr + reg);
  +}
  +
  +#define MDIO_WRITE_REG(a, reg, value) \
  +   mdio_write_reg((a)-vbase, (reg), (value))
  
 
 Something seems wrong here: why do you have an ACCESS_ONCE() on a
 local variable? Doesn't this just make the code less efficient
 without providing lockless access to shared variables?
 
 The types are inconsistent here, you should get a warning from
 running this through 'make C=1' because of the missing __iomem
 annotation of the pointer.
 
 Also, why both a macro and an inline function? Just use an inline
 function.
 
   Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/5] net: add Hisilicon Network Subsystem basic ethernet support

2015-08-14 Thread Kenneth Lee
This is to add basic ethernet support for HNS. It is one of the way to use
the HNS acceleration engine. But most of the decoding/encoding capability
of the AE cannot be used in this way.

This submit contains the basic feature as a ethernet driver. More will be
added later.

Signed-off-by: Kenneth Lee 
Signed-off-by: Yisen Zhuang 
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c| 1552 ++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h|   81 ++
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 1174 
 3 files changed, 2807 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
new file mode 100644
index 000..b58d5ab
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -0,0 +1,1552 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hnae.h"
+#include "hns_enet.h"
+
+#define NIC_MAX_Q_PER_VF 16
+#define HNS_NIC_TX_TIMEOUT (5 * HZ)
+
+#define SERVICE_TIMER_HZ (1 * HZ)
+
+#define NIC_TX_CLEAN_MAX_NUM 256
+#define NIC_RX_CLEAN_MAX_NUM 64
+
+#define RCB_ERR_PRINT_CYCLE 1000
+
+static inline void fill_desc(struct hnae_ring *ring, void *priv,
+int size, dma_addr_t dma, int frag_end,
+int buf_num, enum hns_desc_type type)
+{
+   struct hnae_desc *desc = >desc[ring->next_to_use];
+   struct hnae_desc_cb *desc_cb = >desc_cb[ring->next_to_use];
+   struct sk_buff *skb;
+   __be16 protocol;
+   u32 ip_offset;
+   u32 asid_bufnum_pid = 0;
+   u32 flag_ipoffset = 0;
+
+   desc_cb->priv = priv;
+   desc_cb->length = size;
+   desc_cb->dma = dma;
+   desc_cb->type = type;
+
+   desc->addr = cpu_to_le64(dma);
+   desc->tx.send_size = cpu_to_le16((u16)size);
+
+   /*config bd buffer end */
+   flag_ipoffset |= 1 << HNS_TXD_VLD_B;
+
+   asid_bufnum_pid |= buf_num << HNS_TXD_BUFNUM_S;
+
+   if (type == DESC_TYPE_SKB) {
+   skb = (struct sk_buff *)priv;
+
+   if (skb->ip_summed == CHECKSUM_PARTIAL) {
+   protocol = skb->protocol;
+   ip_offset = ETH_HLEN;
+
+   /*if it is a SW VLAN check the next protocol*/
+   if (protocol == htons(ETH_P_8021Q)) {
+   ip_offset += VLAN_HLEN;
+   protocol = vlan_get_protocol(skb);
+   skb->protocol = protocol;
+   }
+
+   if (skb->protocol == ntohs(ETH_P_IP)) {
+   flag_ipoffset |= 1 << HNS_TXD_L3CS_B;
+   /* check for tcp/udp header */
+   flag_ipoffset |= 1 << HNS_TXD_L4CS_B;
+
+   } else if (skb->protocol == ntohs(ETH_P_IPV6)) {
+   /* ipv6 has not l3 cs, check for L4 header */
+   flag_ipoffset |= 1 << HNS_TXD_L4CS_B;
+   }
+
+   flag_ipoffset |= ip_offset << HNS_TXD_IPOFFSET_S;
+   }
+   }
+
+   flag_ipoffset |= frag_end << HNS_TXD_FE_B;
+
+   desc->tx.asid_bufnum_pid = cpu_to_le16(asid_bufnum_pid);
+   desc->tx.flag_ipoffset = cpu_to_le32(flag_ipoffset);
+
+   ring_ptr_move_fw(ring, next_to_use);
+}
+
+static inline void unfill_desc(struct hnae_ring *ring)
+{
+   ring_ptr_move_bw(ring, next_to_use);
+}
+
+int hns_nic_net_xmit_hw(struct net_device *ndev,
+   struct sk_buff *skb,
+   struct hns_nic_ring_data *ring_data)
+{
+   struct hns_nic_priv *priv = netdev_priv(ndev);
+   struct device *dev = priv->dev;
+   struct hnae_ring *ring = ring_data->ring;
+   struct netdev_queue *dev_queue;
+   struct skb_frag_struct *frag;
+   int buf_num;
+   dma_addr_t dma;
+   int size, next_to_use;
+   int i, j;
+   struct sk_buff *new_skb;
+
+   assert(ring->max_desc_num_per_pkt <= ring->desc_num);
+
+   /* no. of segments (plus a header) */
+   buf_num = skb_shinfo(skb)->nr_frags + 1;
+
+   if (unlikely(buf_num > ring->max_desc_num_per_pkt)) {
+   

[PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-14 Thread Kenneth Lee
The MDIO support for Hisilicon Network Subsystem. It is used in Hislicon
P660 and Hi1610 SoC to control the external PHY

Signed-off-by: Yisen Zhuang 
Signed-off-by: Kenneth Lee 
---
 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c | 597 +
 1 file changed, 597 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c 
b/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c
new file mode 100644
index 000..7113fa8
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c
@@ -0,0 +1,597 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MDIO_DRV_NAME "hi-mdio"
+#define MDIO_BUS_NAME "Hisilicon MII Bus"
+#define MDIO_DRV_VERSION "1.1.0"
+#define MDIO_COPYRIGHT "Copyright(c) 2015 Huawei Corporation."
+#define MDIO_DRV_STRING MDIO_BUS_NAME
+#define MDIO_DEFAULT_DEVICE_DESCR MDIO_BUS_NAME
+
+#define MDIO_CTL_DEV_ADDR(x)   (x & 0x1f)
+#define MDIO_CTL_PORT_ADDR(x)  ((x & 0x1f) << 5)
+
+#define MDIO_BASE_ADDR 0x403C
+#define MDIO_REG_ADDR_LEN  0x1000
+#define MDIO_PHY_GRP_LEN   0x100
+#define MDIO_REG_LEN   0x10
+#define MDIO_PHY_ADDR_NUM  5
+#define MDIO_MAX_PHY_ADDR  0x1F
+#define MDIO_MAX_PHY_REG_ADDR  0x
+
+#define MDIO_TIMEOUT   100
+
+struct hns_mdio_device {
+   struct device *dev;
+   void *vbase;/* mdio reg base address */
+   u8 phy_class[PHY_MAX_ADDR];
+   u8 index;
+   u8 chip_id;
+   u8 gidx;/* global index */
+};
+
+#define MDIO_COMMAND_REG   0x0
+#define MDIO_ADDR_REG  0x4
+#define MDIO_WDATA_REG 0x8
+#define MDIO_RDATA_REG 0xc
+#define MDIO_STA_REG   0x10
+
+#define MDIO_CMD_DEVAD_M   0x1f
+#define MDIO_CMD_DEVAD_S   0
+#define MDIO_CMD_PRTAD_M   0x1f
+#define MDIO_CMD_PRTAD_S   5
+#define MDIO_CMD_OP_M  0x3
+#define MDIO_CMD_OP_S  10
+#define MDIO_CMD_ST_M  0x3
+#define MDIO_CMD_ST_S  12
+#define MDIO_CMD_START_B   14
+
+#define MDIO_ADDR_DATA_M   0x
+#define MDIO_ADDR_DATA_S   0
+
+#define MDIO_WDATA_DATA_M  0x
+#define MDIO_WDATA_DATA_S  0
+
+#define MDIO_RDATA_DATA_M  0x
+#define MDIO_RDATA_DATA_S  0
+
+#define MDIO_STATE_STA_B   0
+
+enum mdio_st_clause {
+   MDIO_ST_CLAUSE_45 = 0,
+   MDIO_ST_CLAUSE_22
+};
+
+enum mdio_c22_op_seq {
+   MDIO_C22_WRITE = 1,
+   MDIO_C22_READ = 2
+};
+
+enum mdio_c45_op_seq {
+   MDIO_C45_WRITE_ADDR = 0,
+   MDIO_C45_WRITE_DATA,
+   MDIO_C45_READ_INCREMENT,
+   MDIO_C45_READ
+};
+
+static inline void mdio_write_reg(void *base, u32 reg, u32 value)
+{
+   u8 __iomem *reg_addr = ACCESS_ONCE(base);
+
+   writel(value, reg_addr + reg);
+}
+
+#define MDIO_WRITE_REG(a, reg, value) \
+   mdio_write_reg((a)->vbase, (reg), (value))
+
+static inline u32 mdio_read_reg(void *base, u32 reg)
+{
+   u8 __iomem *reg_addr = ACCESS_ONCE(base);
+
+   return readl(reg_addr + reg);
+}
+
+#define MDIO_READ_REG(a, reg) \
+   mdio_read_reg((a)->vbase, (reg))
+
+#define mdio_set_field(origin, mask, shift, val) \
+   do { \
+   (origin) &= (~((mask) << (shift))); \
+   (origin) |= (((val) & (mask)) << (shift)); \
+   } while (0)
+
+#define mdio_get_field(origin, mask, shift) (((origin) >> (shift)) & (mask))
+
+static void mdio_set_reg_field(void *base, u32 reg, u32 mask, u32 shift,
+  u32 val)
+{
+   u32 origin = mdio_read_reg(base, reg);
+
+   mdio_set_field(origin, mask, shift, val);
+   mdio_write_reg(base, reg, origin);
+}
+
+#define MDIO_SET_REG_FIELD(dev, reg, mask, shift, val) \
+   mdio_set_reg_field((dev)->vbase, (reg), (mask), (shift), (val))
+
+static u32 mdio_get_reg_field(void *base, u32 reg, u32 mask, u32 shift)
+{
+   u32 origin;
+
+   origin = mdio_read_reg(base, reg);
+   return mdio_get_field(origin, mask, shift);
+}
+
+#define MDIO_GET_REG_FIELD(dev, reg, mask, shift) \
+   mdio_get_reg_field((dev)->vbase, (reg), (mask), (shift))
+
+#define MDIO_SET_REG_BIT(dev, reg, bit, val) \
+   mdio_set_reg_field((dev)->vbase, (reg), 0x1ull, (bit), (val))
+
+#define MDIO_GET_REG_BIT(dev, reg, bit) \
+   mdio_get_reg_fiel

[PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support

2015-08-14 Thread Kenneth Lee
HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
unified ring buffer interface for Hisilicon Network Acceleration Engines.

With the interface, upper layer can work as ethernet driver, ODP driver or
other service driver on purpose.

Signed-off-by: Kenneth Lee 
Signed-off-by: Yisen Zhuang 
---
 drivers/net/ethernet/hisilicon/Kconfig  |  33 +-
 drivers/net/ethernet/hisilicon/Makefile |   1 +
 drivers/net/ethernet/hisilicon/hns/Makefile |  15 +
 drivers/net/ethernet/hisilicon/hns/hnae.c   | 494 +++
 drivers/net/ethernet/hisilicon/hns/hnae.h   | 582 
 5 files changed, 1124 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h

diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
b/drivers/net/ethernet/hisilicon/Kconfig
index dead17b..1e4f5a7 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -5,7 +5,7 @@
 config NET_VENDOR_HISILICON
bool "Hisilicon devices"
default y
-   depends on ARM
+   depends on ARM || ARM64
---help---
  If you have a network (Ethernet) card belonging to this class, say Y.
 
@@ -31,4 +31,35 @@ config HIP04_ETH
  If you wish to compile a kernel for a hardware with hisilicon p04 SoC 
and
  want to use the internal ethernet then you should answer Y to this.
 
+config HNS
+   tristate "Hisilicon Network Subsystem Support (Framework)"
+   ---help---
+ This selects the framework support for Hisilicon Network Subsystem. It
+ is needed by any driver which provides HNS acceleration engine or make
+ use of the engine
+
+config HNS_DSAF
+   tristate "Hisilicon HNS DSAF device Support"
+   select HNS
+   select HNS_MDIO
+   ---help---
+ This selects the DSAF (Distributed System Area Frabric) network
+ acceleration engine support. The engine is used in Hisilicon P660,
+ Hi1610 and further ICT SoC
+
+config HNS_MDIO
+   tristate "Hisilicon HNS MDIO device Support"
+   select MDIO
+   ---help---
+ This selects the HNS MDIO support. It is needed by HNS_DSAF to access
+ the PHY
+
+config HNS_ENET
+   tristate "Hisilicon HNS Ethernet Device Support"
+   select PHYLIB
+   select HNS
+   ---help---
+ This selects the general ethernet driver for HNS.  This module make
+ use of any HNS AE driver, such as HNS_DSAF
+
 endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile 
b/drivers/net/ethernet/hisilicon/Makefile
index 6c14540..2503a9b 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
 obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
+obj-$(CONFIG_HNS) += hns/
diff --git a/drivers/net/ethernet/hisilicon/hns/Makefile 
b/drivers/net/ethernet/hisilicon/hns/Makefile
new file mode 100644
index 000..6680602
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+obj-$(CONFIG_HNS) += hnae.o
+
+obj-$(CONFIG_HNS_DSAF) += hns_dsaf.o
+hns_dsaf-objs = hns_ae_adapt.o hns_dsaf_gmac.o hns_dsaf_mac.o hns_dsaf_misc.o \
+   hns_dsaf_main.o hns_dsaf_ppe.o hns_dsaf_rcb.o hns_dsaf_xgmac.o
+
+obj-$(CONFIG_HNS_MDIO) += hns_mdio.o
+hns_mdio-objs = hns_mdio_main.o
+
+obj-$(CONFIG_HNS_ENET) += hns_enet_drv.o
+hns_enet_drv-objs = hns_enet.o hns_ethtool.o
diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c 
b/drivers/net/ethernet/hisilicon/hns/hnae.c
new file mode 100644
index 000..fd09768
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -0,0 +1,494 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "hnae.h"
+
+#define cls_to_ae_dev(dev) container_of(dev, struct hnae_ae_dev, cls_dev)
+
+static struct class *hnae_class;
+
+static inline void hnae_list_add(spinlock_t *lock, struct list_head *node,
+struct list_head *head)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(lock, flags);
+   list_add_tail_rcu(node, head);
+   spin_unlock_irqrestore(lock, flags);
+}
+
+static inline void hnae_list_del(spinlock_t *lock, struct list_head *node)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(lock, flags);
+   list_del_rcu(node);
+   spin_unlock_irqrestore(lock, flags

[PATCH 0/5] net: Hisilicon Network Subsystem support

2015-08-14 Thread Kenneth Lee
This patchset add Hisilicon Network Subsystem support. The subsystem
provides a long term developing network accelerate engine with ring buffer
interface. The network interface can be used as standard ethernet network
interface card or be made use by a network application with decoded L2 to
L4 data.

The patchset is porting from some internal-use drivers, it is tested and
working fine with the hardware. But some detail design is not that good.
But we want to know if the community can accept the structure/arch before
refining it. Thank you.

Kenneth Lee (5):
  net: add Hisilicon Network Subsystem support (config and documents)
  net: add Hisilicon Network Subsystem hnae framework support
  net: add Hisilicon Network Subsystem MDIO support
  net: add Hisilicon Network Subsystem DSAF support
  net: add Hisilicon Network Subsystem basic ethernet support

 .../devicetree/bindings/net/hisilicon-hns-dsaf.txt |   40 +
 .../devicetree/bindings/net/hisilicon-hns-mdio.txt |   22 +
 .../devicetree/bindings/net/hisilicon-hns-nic.txt  |   14 +
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi   |  197 ++
 drivers/net/ethernet/hisilicon/Kconfig |   33 +-
 drivers/net/ethernet/hisilicon/Makefile|1 +
 drivers/net/ethernet/hisilicon/hns/Makefile|   15 +
 drivers/net/ethernet/hisilicon/hns/hnae.c  |  494 
 drivers/net/ethernet/hisilicon/hns/hnae.h  |  582 +
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c  |  766 ++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c |  705 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.h |   45 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  |  942 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h  |  462 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 2681 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h |  438 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c |  311 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h |   45 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c  |  582 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.h  |  105 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c  |  972 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h  |  136 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h  |  958 +++
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c|  826 ++
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.h|   15 +
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  | 1552 +++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h  |   81 +
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c   | 1174 +
 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c |  597 +
 29 files changed, 14790 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 create mode 100644 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
 create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-14 Thread Kenneth Lee
The Hisilicon Network Subsystem is a long term evolution IP which is
supposed to be used in Hisilicon ICT SoC. The IP, which is called hns for
short, is a TCP/IP acceleration engine, which can directly decode TCP/IP
stream and distribute them to different ring buffers.

HNS can be configured to work on different mode for different scenario.
This patch make use only some of the mode to make it as standard ethernet
NIC.  The other mode will be added soon.

The whole function has 4 kernel sub-modules:

hnae: the HNS acceleration engine framework. It provides a abstract
interface between the engine and the upper layers which make use of the
engine by ring buffer.

hns_enet_drv: a standard ethernet driver that base on the ring buffer.

hns_dsaf: one of the implementation of HNS acceleration engine, which is
applied on Hililicon P660, Hi1610 and other later-on SoCs

hns_mdio: the mdio control to the PHY, used by acceleration engine

This submit add basic config and documents

Signed-off-by: Kenneth Lee 
Signed-off-by: Yisen Zhuang 
---
 .../devicetree/bindings/net/hisilicon-hns-dsaf.txt |  40 +
 .../devicetree/bindings/net/hisilicon-hns-mdio.txt |  22 +++
 .../devicetree/bindings/net/hisilicon-hns-nic.txt  |  14 ++
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi   | 197 +
 4 files changed, 273 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 create mode 100644 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi

diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
new file mode 100644
index 000..038c03d
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
@@ -0,0 +1,40 @@
+Hisilicon DSA Fabric device controller
+
+Required properties:
+- compatible: should be "hisilicon,dsaf".
+- dsa-name: dsa fabric name who provide this interface
+- interrupt-parent: the interrupt parent of this device.
+- interrupts: should contain the DSA Fabric and rcb interrupt.
+- reg: specifies base physical address(es) and size of the device registers.
+  The first region is external interface control register base and size.
+  The second region is SerDes base register and size.
+  The third region is the PPE register base and size.
+  The fourth region is dsa fabric base register and size.
+- phy-handle: phy handle of physicl port, 0 if not any phy device. see 
ethernet.txt [1].
+- buf-size: rx buffer size, should be 16-1024.
+- desc-num: number of description in TX and RX queue, should be 512, 1024, 
2048 or 4096.
+
+[1] Documentation/devicetree/bindings/net/phy.txt
+
+Example:
+
+dsa: dsa@c700 {
+   compatible = "hisilicon,dsaf";
+   dsa_name = "soc0-n4";
+   interrupt-parent = <_dsa>;
+   reg = <0x0 0xC000 0x0 0x42
+  0x0 0xC200 0x0 0x30
+  0x0 0xc500 0x0 0x89
+  0x0 0xc700 0x0 0x6>;
+   phy-handle = <0 0 0 0 _phy4 _phy5 0 0>;
+   interrupts = <131 4>,<132 4>, <133 4>,<134 4>,
+<135 4>,<136 4>, <137 4>,<138 4>,
+<139 4>,<140 4>, <141 4>,<142 4>,
+<143 4>,<144 4>, <145 4>,<146 4>,
+<147 4>,<148 4>, <384 1>,<385 1>,
+<386 1>,<387 1>, <388 1>,<389 1>,
+<390 1>,<391 1>,
+   buf-size = <4096>;
+   desc-num = <1024>;
+   dma-coherent;
+};
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
new file mode 100644
index 000..205e803
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
@@ -0,0 +1,22 @@
+Hisilicon MDIO bus controller
+
+Properties:
+- compatible: "hisilicon,mdio"
+- reg: The base address of the MDIO bus controller register bank.
+- #address-cells: Must be <1>.
+- #size-cells: Must be <0>.  MDIO addresses have no size component.
+
+Typically an MDIO bus might have several children.
+
+Example:
+ mdio@803c {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   compatible = "hisilicon,mdio";
+   reg = <0x0 0x803c 0x0 0x1>;
+
+   ethernet-phy@0 {
+...
+reg = <0>;
+   };
+ };
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
ne

[PATCH 5/5] net: add Hisilicon Network Subsystem basic ethernet support

2015-08-14 Thread Kenneth Lee
This is to add basic ethernet support for HNS. It is one of the way to use
the HNS acceleration engine. But most of the decoding/encoding capability
of the AE cannot be used in this way.

This submit contains the basic feature as a ethernet driver. More will be
added later.

Signed-off-by: Kenneth Lee liguo...@huawei.com
Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c| 1552 ++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h|   81 ++
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 1174 
 3 files changed, 2807 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
new file mode 100644
index 000..b58d5ab
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -0,0 +1,1552 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include linux/module.h
+#include linux/interrupt.h
+#include linux/etherdevice.h
+#include linux/platform_device.h
+#include linux/clk.h
+#include linux/skbuff.h
+#include linux/phy.h
+#include linux/io.h
+#include linux/ip.h
+#include linux/ipv6.h
+#include linux/if_vlan.h
+#include hnae.h
+#include hns_enet.h
+
+#define NIC_MAX_Q_PER_VF 16
+#define HNS_NIC_TX_TIMEOUT (5 * HZ)
+
+#define SERVICE_TIMER_HZ (1 * HZ)
+
+#define NIC_TX_CLEAN_MAX_NUM 256
+#define NIC_RX_CLEAN_MAX_NUM 64
+
+#define RCB_ERR_PRINT_CYCLE 1000
+
+static inline void fill_desc(struct hnae_ring *ring, void *priv,
+int size, dma_addr_t dma, int frag_end,
+int buf_num, enum hns_desc_type type)
+{
+   struct hnae_desc *desc = ring-desc[ring-next_to_use];
+   struct hnae_desc_cb *desc_cb = ring-desc_cb[ring-next_to_use];
+   struct sk_buff *skb;
+   __be16 protocol;
+   u32 ip_offset;
+   u32 asid_bufnum_pid = 0;
+   u32 flag_ipoffset = 0;
+
+   desc_cb-priv = priv;
+   desc_cb-length = size;
+   desc_cb-dma = dma;
+   desc_cb-type = type;
+
+   desc-addr = cpu_to_le64(dma);
+   desc-tx.send_size = cpu_to_le16((u16)size);
+
+   /*config bd buffer end */
+   flag_ipoffset |= 1  HNS_TXD_VLD_B;
+
+   asid_bufnum_pid |= buf_num  HNS_TXD_BUFNUM_S;
+
+   if (type == DESC_TYPE_SKB) {
+   skb = (struct sk_buff *)priv;
+
+   if (skb-ip_summed == CHECKSUM_PARTIAL) {
+   protocol = skb-protocol;
+   ip_offset = ETH_HLEN;
+
+   /*if it is a SW VLAN check the next protocol*/
+   if (protocol == htons(ETH_P_8021Q)) {
+   ip_offset += VLAN_HLEN;
+   protocol = vlan_get_protocol(skb);
+   skb-protocol = protocol;
+   }
+
+   if (skb-protocol == ntohs(ETH_P_IP)) {
+   flag_ipoffset |= 1  HNS_TXD_L3CS_B;
+   /* check for tcp/udp header */
+   flag_ipoffset |= 1  HNS_TXD_L4CS_B;
+
+   } else if (skb-protocol == ntohs(ETH_P_IPV6)) {
+   /* ipv6 has not l3 cs, check for L4 header */
+   flag_ipoffset |= 1  HNS_TXD_L4CS_B;
+   }
+
+   flag_ipoffset |= ip_offset  HNS_TXD_IPOFFSET_S;
+   }
+   }
+
+   flag_ipoffset |= frag_end  HNS_TXD_FE_B;
+
+   desc-tx.asid_bufnum_pid = cpu_to_le16(asid_bufnum_pid);
+   desc-tx.flag_ipoffset = cpu_to_le32(flag_ipoffset);
+
+   ring_ptr_move_fw(ring, next_to_use);
+}
+
+static inline void unfill_desc(struct hnae_ring *ring)
+{
+   ring_ptr_move_bw(ring, next_to_use);
+}
+
+int hns_nic_net_xmit_hw(struct net_device *ndev,
+   struct sk_buff *skb,
+   struct hns_nic_ring_data *ring_data)
+{
+   struct hns_nic_priv *priv = netdev_priv(ndev);
+   struct device *dev = priv-dev;
+   struct hnae_ring *ring = ring_data-ring;
+   struct netdev_queue *dev_queue;
+   struct skb_frag_struct *frag;
+   int buf_num;
+   dma_addr_t dma;
+   int size, next_to_use;
+   int i, j;
+   struct sk_buff *new_skb;
+
+   assert(ring-max_desc_num_per_pkt = ring-desc_num);
+
+   /* no. of segments (plus a header) */
+   buf_num = skb_shinfo(skb)-nr_frags + 1;
+
+   if (unlikely(buf_num  ring-max_desc_num_per_pkt

[PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-14 Thread Kenneth Lee
The Hisilicon Network Subsystem is a long term evolution IP which is
supposed to be used in Hisilicon ICT SoC. The IP, which is called hns for
short, is a TCP/IP acceleration engine, which can directly decode TCP/IP
stream and distribute them to different ring buffers.

HNS can be configured to work on different mode for different scenario.
This patch make use only some of the mode to make it as standard ethernet
NIC.  The other mode will be added soon.

The whole function has 4 kernel sub-modules:

hnae: the HNS acceleration engine framework. It provides a abstract
interface between the engine and the upper layers which make use of the
engine by ring buffer.

hns_enet_drv: a standard ethernet driver that base on the ring buffer.

hns_dsaf: one of the implementation of HNS acceleration engine, which is
applied on Hililicon P660, Hi1610 and other later-on SoCs

hns_mdio: the mdio control to the PHY, used by acceleration engine

This submit add basic config and documents

Signed-off-by: Kenneth Lee liguo...@huawei.com
Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
---
 .../devicetree/bindings/net/hisilicon-hns-dsaf.txt |  40 +
 .../devicetree/bindings/net/hisilicon-hns-mdio.txt |  22 +++
 .../devicetree/bindings/net/hisilicon-hns-nic.txt  |  14 ++
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi   | 197 +
 4 files changed, 273 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 create mode 100644 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi

diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
new file mode 100644
index 000..038c03d
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
@@ -0,0 +1,40 @@
+Hisilicon DSA Fabric device controller
+
+Required properties:
+- compatible: should be hisilicon,dsaf.
+- dsa-name: dsa fabric name who provide this interface
+- interrupt-parent: the interrupt parent of this device.
+- interrupts: should contain the DSA Fabric and rcb interrupt.
+- reg: specifies base physical address(es) and size of the device registers.
+  The first region is external interface control register base and size.
+  The second region is SerDes base register and size.
+  The third region is the PPE register base and size.
+  The fourth region is dsa fabric base register and size.
+- phy-handle: phy handle of physicl port, 0 if not any phy device. see 
ethernet.txt [1].
+- buf-size: rx buffer size, should be 16-1024.
+- desc-num: number of description in TX and RX queue, should be 512, 1024, 
2048 or 4096.
+
+[1] Documentation/devicetree/bindings/net/phy.txt
+
+Example:
+
+dsa: dsa@c700 {
+   compatible = hisilicon,dsaf;
+   dsa_name = soc0-n4;
+   interrupt-parent = mbigen_dsa;
+   reg = 0x0 0xC000 0x0 0x42
+  0x0 0xC200 0x0 0x30
+  0x0 0xc500 0x0 0x89
+  0x0 0xc700 0x0 0x6;
+   phy-handle = 0 0 0 0 soc0_phy4 soc0_phy5 0 0;
+   interrupts = 131 4,132 4, 133 4,134 4,
+135 4,136 4, 137 4,138 4,
+139 4,140 4, 141 4,142 4,
+143 4,144 4, 145 4,146 4,
+147 4,148 4, 384 1,385 1,
+386 1,387 1, 388 1,389 1,
+390 1,391 1,
+   buf-size = 4096;
+   desc-num = 1024;
+   dma-coherent;
+};
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
new file mode 100644
index 000..205e803
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
@@ -0,0 +1,22 @@
+Hisilicon MDIO bus controller
+
+Properties:
+- compatible: hisilicon,mdio
+- reg: The base address of the MDIO bus controller register bank.
+- #address-cells: Must be 1.
+- #size-cells: Must be 0.  MDIO addresses have no size component.
+
+Typically an MDIO bus might have several children.
+
+Example:
+ mdio@803c {
+   #address-cells = 1;
+   #size-cells = 0;
+   compatible = hisilicon,mdio;
+   reg = 0x0 0x803c 0x0 0x1;
+
+   ethernet-phy@0 {
+...
+reg = 0;
+   };
+ };
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt 
b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
new file mode 100644
index 000..5ab6969
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
@@ -0,0 +1,14 @@
+Hisilicon Network Subsystem NIC controller
+
+Required properties:
+- compatible: hisilicon,hns-nic
+- ae-name: accelerator name who provide this interface

[PATCH 0/5] net: Hisilicon Network Subsystem support

2015-08-14 Thread Kenneth Lee
This patchset add Hisilicon Network Subsystem support. The subsystem
provides a long term developing network accelerate engine with ring buffer
interface. The network interface can be used as standard ethernet network
interface card or be made use by a network application with decoded L2 to
L4 data.

The patchset is porting from some internal-use drivers, it is tested and
working fine with the hardware. But some detail design is not that good.
But we want to know if the community can accept the structure/arch before
refining it. Thank you.

Kenneth Lee (5):
  net: add Hisilicon Network Subsystem support (config and documents)
  net: add Hisilicon Network Subsystem hnae framework support
  net: add Hisilicon Network Subsystem MDIO support
  net: add Hisilicon Network Subsystem DSAF support
  net: add Hisilicon Network Subsystem basic ethernet support

 .../devicetree/bindings/net/hisilicon-hns-dsaf.txt |   40 +
 .../devicetree/bindings/net/hisilicon-hns-mdio.txt |   22 +
 .../devicetree/bindings/net/hisilicon-hns-nic.txt  |   14 +
 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi   |  197 ++
 drivers/net/ethernet/hisilicon/Kconfig |   33 +-
 drivers/net/ethernet/hisilicon/Makefile|1 +
 drivers/net/ethernet/hisilicon/hns/Makefile|   15 +
 drivers/net/ethernet/hisilicon/hns/hnae.c  |  494 
 drivers/net/ethernet/hisilicon/hns/hnae.h  |  582 +
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c  |  766 ++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c |  705 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.h |   45 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  |  942 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h  |  462 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 2681 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h |  438 
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c |  311 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h |   45 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c  |  582 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.h  |  105 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c  |  972 +++
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h  |  136 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h  |  958 +++
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c|  826 ++
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.h|   15 +
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  | 1552 +++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h  |   81 +
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c   | 1174 +
 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c |  597 +
 29 files changed, 14790 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
 create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 create mode 100644 arch/arm64/boot/dts/hisilicon/hip05_hns.dtsi
 create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_dsaf_xgmac.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_enet.h
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c

-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support

2015-08-14 Thread Kenneth Lee
HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
unified ring buffer interface for Hisilicon Network Acceleration Engines.

With the interface, upper layer can work as ethernet driver, ODP driver or
other service driver on purpose.

Signed-off-by: Kenneth Lee liguo...@huawei.com
Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
---
 drivers/net/ethernet/hisilicon/Kconfig  |  33 +-
 drivers/net/ethernet/hisilicon/Makefile |   1 +
 drivers/net/ethernet/hisilicon/hns/Makefile |  15 +
 drivers/net/ethernet/hisilicon/hns/hnae.c   | 494 +++
 drivers/net/ethernet/hisilicon/hns/hnae.h   | 582 
 5 files changed, 1124 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h

diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
b/drivers/net/ethernet/hisilicon/Kconfig
index dead17b..1e4f5a7 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -5,7 +5,7 @@
 config NET_VENDOR_HISILICON
bool Hisilicon devices
default y
-   depends on ARM
+   depends on ARM || ARM64
---help---
  If you have a network (Ethernet) card belonging to this class, say Y.
 
@@ -31,4 +31,35 @@ config HIP04_ETH
  If you wish to compile a kernel for a hardware with hisilicon p04 SoC 
and
  want to use the internal ethernet then you should answer Y to this.
 
+config HNS
+   tristate Hisilicon Network Subsystem Support (Framework)
+   ---help---
+ This selects the framework support for Hisilicon Network Subsystem. It
+ is needed by any driver which provides HNS acceleration engine or make
+ use of the engine
+
+config HNS_DSAF
+   tristate Hisilicon HNS DSAF device Support
+   select HNS
+   select HNS_MDIO
+   ---help---
+ This selects the DSAF (Distributed System Area Frabric) network
+ acceleration engine support. The engine is used in Hisilicon P660,
+ Hi1610 and further ICT SoC
+
+config HNS_MDIO
+   tristate Hisilicon HNS MDIO device Support
+   select MDIO
+   ---help---
+ This selects the HNS MDIO support. It is needed by HNS_DSAF to access
+ the PHY
+
+config HNS_ENET
+   tristate Hisilicon HNS Ethernet Device Support
+   select PHYLIB
+   select HNS
+   ---help---
+ This selects the general ethernet driver for HNS.  This module make
+ use of any HNS AE driver, such as HNS_DSAF
+
 endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile 
b/drivers/net/ethernet/hisilicon/Makefile
index 6c14540..2503a9b 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
 obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
+obj-$(CONFIG_HNS) += hns/
diff --git a/drivers/net/ethernet/hisilicon/hns/Makefile 
b/drivers/net/ethernet/hisilicon/hns/Makefile
new file mode 100644
index 000..6680602
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/Makefile
@@ -0,0 +1,15 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+obj-$(CONFIG_HNS) += hnae.o
+
+obj-$(CONFIG_HNS_DSAF) += hns_dsaf.o
+hns_dsaf-objs = hns_ae_adapt.o hns_dsaf_gmac.o hns_dsaf_mac.o hns_dsaf_misc.o \
+   hns_dsaf_main.o hns_dsaf_ppe.o hns_dsaf_rcb.o hns_dsaf_xgmac.o
+
+obj-$(CONFIG_HNS_MDIO) += hns_mdio.o
+hns_mdio-objs = hns_mdio_main.o
+
+obj-$(CONFIG_HNS_ENET) += hns_enet_drv.o
+hns_enet_drv-objs = hns_enet.o hns_ethtool.o
diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c 
b/drivers/net/ethernet/hisilicon/hns/hnae.c
new file mode 100644
index 000..fd09768
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -0,0 +1,494 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include linux/dma-mapping.h
+#include linux/interrupt.h
+#include linux/skbuff.h
+#include linux/slab.h
+
+#include hnae.h
+
+#define cls_to_ae_dev(dev) container_of(dev, struct hnae_ae_dev, cls_dev)
+
+static struct class *hnae_class;
+
+static inline void hnae_list_add(spinlock_t *lock, struct list_head *node,
+struct list_head *head)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(lock, flags);
+   list_add_tail_rcu(node, head);
+   spin_unlock_irqrestore(lock, flags);
+}
+
+static inline void hnae_list_del(spinlock_t *lock, struct list_head *node)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(lock, flags);
+   list_del_rcu(node

[PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-14 Thread Kenneth Lee
The MDIO support for Hisilicon Network Subsystem. It is used in Hislicon
P660 and Hi1610 SoC to control the external PHY

Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
Signed-off-by: Kenneth Lee liguo...@huawei.com
---
 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c | 597 +
 1 file changed, 597 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c 
b/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c
new file mode 100644
index 000..7113fa8
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns/hns_mdio_main.c
@@ -0,0 +1,597 @@
+/*
+ * Copyright (c) 2014-2015 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include linux/errno.h
+#include linux/etherdevice.h
+#include linux/init.h
+#include linux/kernel.h
+#include linux/module.h
+#include linux/mutex.h
+#include linux/netdevice.h
+#include linux/of_address.h
+#include linux/of.h
+#include linux/of_mdio.h
+#include linux/of_platform.h
+#include linux/phy.h
+#include linux/platform_device.h
+#include linux/spinlock_types.h
+
+#define MDIO_DRV_NAME hi-mdio
+#define MDIO_BUS_NAME Hisilicon MII Bus
+#define MDIO_DRV_VERSION 1.1.0
+#define MDIO_COPYRIGHT Copyright(c) 2015 Huawei Corporation.
+#define MDIO_DRV_STRING MDIO_BUS_NAME
+#define MDIO_DEFAULT_DEVICE_DESCR MDIO_BUS_NAME
+
+#define MDIO_CTL_DEV_ADDR(x)   (x  0x1f)
+#define MDIO_CTL_PORT_ADDR(x)  ((x  0x1f)  5)
+
+#define MDIO_BASE_ADDR 0x403C
+#define MDIO_REG_ADDR_LEN  0x1000
+#define MDIO_PHY_GRP_LEN   0x100
+#define MDIO_REG_LEN   0x10
+#define MDIO_PHY_ADDR_NUM  5
+#define MDIO_MAX_PHY_ADDR  0x1F
+#define MDIO_MAX_PHY_REG_ADDR  0x
+
+#define MDIO_TIMEOUT   100
+
+struct hns_mdio_device {
+   struct device *dev;
+   void *vbase;/* mdio reg base address */
+   u8 phy_class[PHY_MAX_ADDR];
+   u8 index;
+   u8 chip_id;
+   u8 gidx;/* global index */
+};
+
+#define MDIO_COMMAND_REG   0x0
+#define MDIO_ADDR_REG  0x4
+#define MDIO_WDATA_REG 0x8
+#define MDIO_RDATA_REG 0xc
+#define MDIO_STA_REG   0x10
+
+#define MDIO_CMD_DEVAD_M   0x1f
+#define MDIO_CMD_DEVAD_S   0
+#define MDIO_CMD_PRTAD_M   0x1f
+#define MDIO_CMD_PRTAD_S   5
+#define MDIO_CMD_OP_M  0x3
+#define MDIO_CMD_OP_S  10
+#define MDIO_CMD_ST_M  0x3
+#define MDIO_CMD_ST_S  12
+#define MDIO_CMD_START_B   14
+
+#define MDIO_ADDR_DATA_M   0x
+#define MDIO_ADDR_DATA_S   0
+
+#define MDIO_WDATA_DATA_M  0x
+#define MDIO_WDATA_DATA_S  0
+
+#define MDIO_RDATA_DATA_M  0x
+#define MDIO_RDATA_DATA_S  0
+
+#define MDIO_STATE_STA_B   0
+
+enum mdio_st_clause {
+   MDIO_ST_CLAUSE_45 = 0,
+   MDIO_ST_CLAUSE_22
+};
+
+enum mdio_c22_op_seq {
+   MDIO_C22_WRITE = 1,
+   MDIO_C22_READ = 2
+};
+
+enum mdio_c45_op_seq {
+   MDIO_C45_WRITE_ADDR = 0,
+   MDIO_C45_WRITE_DATA,
+   MDIO_C45_READ_INCREMENT,
+   MDIO_C45_READ
+};
+
+static inline void mdio_write_reg(void *base, u32 reg, u32 value)
+{
+   u8 __iomem *reg_addr = ACCESS_ONCE(base);
+
+   writel(value, reg_addr + reg);
+}
+
+#define MDIO_WRITE_REG(a, reg, value) \
+   mdio_write_reg((a)-vbase, (reg), (value))
+
+static inline u32 mdio_read_reg(void *base, u32 reg)
+{
+   u8 __iomem *reg_addr = ACCESS_ONCE(base);
+
+   return readl(reg_addr + reg);
+}
+
+#define MDIO_READ_REG(a, reg) \
+   mdio_read_reg((a)-vbase, (reg))
+
+#define mdio_set_field(origin, mask, shift, val) \
+   do { \
+   (origin) = (~((mask)  (shift))); \
+   (origin) |= (((val)  (mask))  (shift)); \
+   } while (0)
+
+#define mdio_get_field(origin, mask, shift) (((origin)  (shift))  (mask))
+
+static void mdio_set_reg_field(void *base, u32 reg, u32 mask, u32 shift,
+  u32 val)
+{
+   u32 origin = mdio_read_reg(base, reg);
+
+   mdio_set_field(origin, mask, shift, val);
+   mdio_write_reg(base, reg, origin);
+}
+
+#define MDIO_SET_REG_FIELD(dev, reg, mask, shift, val) \
+   mdio_set_reg_field((dev)-vbase, (reg), (mask), (shift), (val))
+
+static u32 mdio_get_reg_field(void *base, u32 reg, u32 mask, u32 shift)
+{
+   u32 origin;
+
+   origin = mdio_read_reg(base, reg);
+   return mdio_get_field(origin, mask, shift);
+}
+
+#define MDIO_GET_REG_FIELD(dev, reg, mask, shift) \
+   mdio_get_reg_field((dev)-vbase, (reg), (mask), (shift))
+
+#define MDIO_SET_REG_BIT(dev, reg, bit, val

Re: An small ftrace enhancement idea

2013-10-31 Thread Kenneth Lee
Thank you, Steve, 

Yes, with a separated instance, I can measure the latency for a stimulation 
while capture the other schedule events which I am interesting in. This is a 
better solution. I don’t know this “instance” stuff before. I don’t need to 
create another axe. I am sorry for my ignorance. 

Thanks and regards.

--Kenneth

在 2013年10月31日,下午1:50,Steven Rostedt  写道:

> On Wed, 30 Oct 2013 15:39:50 -0700
> Kenneth Lee  wrote:
> 
>> Dear Steven,
>> 
>> I want to add a new function to ftrace subsystem. Sometimes, we will face 
>> such a problem: system do not response to the input on time one to two times 
>> everyday. It is not easy to capture because it rarely happens. So I want to 
>> add a function to the kernel. If I have such problem, I insert a kernel 
>> module, who add a hook to the position that receive the input and another to 
>> the position that response to the input (with a session id if necessary). 
>> And I can compare the time between them and if the period is longer then a 
>> pre-set threshold, I can give a signal to a user helper application (maybe a 
>> script waiting on the file), which then can save the trace event to a file 
>> for later inspection.
> 
> I'm a little confused in what you want.
> 
> 
>> 
>> 
>> 
>> The user helper script may look like this:
>> 
>> 
>> 
>> #/bin/sh
>> 
>> 
>> 
>> echo ‘sched:*’ > /sys/kernel/debug/tracing/set_event
>> 
>> modprobe delay_inspector.ko threshold=500
>> 
>> cat /sys/kernel/debug/tracing/waiter #wait for signal
>> 
>> cp /sys/kernel/debug/tracing/trace /var/log/delay_infomation
>> 
>> 
>> 
>> 
>> 
>> It looks like a standalone function. But I don’t have place to put it. Do 
>> you think I can implement it in ftrace? And do you think if there are better 
>> solution?
>> 
> 
> You want something to wake up if it takes too long before an event
> happens?
> 
> If so, why not just use a select() on the trace_pipe and if it times
> out, then dump the trace. You can even set up a separate instance.
> 
> (this is waiting for a schedule switch to pid 1)
> 
> cd /sys/kernel/debug/tracing
> mkdir instances/mine
> echo 'next_pid == 1' > instances/mine/events/sched/sched_switch/filter
> echo 1 > instances/mine/events/sched/sched_switch/enable
> 
> 
> The in a userspace program, I open "instances/mine/events/trace_pipe"
> and run a select() on that file descriptor with a given timeout. If the
> event does not happen within the expected time frame, the select
> returns zero, and this userspace program can deal with it.
> 
> Is that the functionality you are trying to achieve?
> 
> -- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: An small ftrace enhancement idea

2013-10-31 Thread Kenneth Lee
Thank you, Steve, 

Yes, with a separated instance, I can measure the latency for a stimulation 
while capture the other schedule events which I am interesting in. This is a 
better solution. I don’t know this “instance” stuff before. I don’t need to 
create another axe. I am sorry for my ignorance. 

Thanks and regards.

--Kenneth

在 2013年10月31日,下午1:50,Steven Rostedt rost...@goodmis.org 写道:

 On Wed, 30 Oct 2013 15:39:50 -0700
 Kenneth Lee nek.in...@gmail.com wrote:
 
 Dear Steven,
 
 I want to add a new function to ftrace subsystem. Sometimes, we will face 
 such a problem: system do not response to the input on time one to two times 
 everyday. It is not easy to capture because it rarely happens. So I want to 
 add a function to the kernel. If I have such problem, I insert a kernel 
 module, who add a hook to the position that receive the input and another to 
 the position that response to the input (with a session id if necessary). 
 And I can compare the time between them and if the period is longer then a 
 pre-set threshold, I can give a signal to a user helper application (maybe a 
 script waiting on the file), which then can save the trace event to a file 
 for later inspection.
 
 I'm a little confused in what you want.
 
 
 
 
 
 The user helper script may look like this:
 
 
 
 #/bin/sh
 
 
 
 echo ‘sched:*’  /sys/kernel/debug/tracing/set_event
 
 modprobe delay_inspector.ko threshold=500
 
 cat /sys/kernel/debug/tracing/waiter #wait for signal
 
 cp /sys/kernel/debug/tracing/trace /var/log/delay_infomation
 
 
 
 
 
 It looks like a standalone function. But I don’t have place to put it. Do 
 you think I can implement it in ftrace? And do you think if there are better 
 solution?
 
 
 You want something to wake up if it takes too long before an event
 happens?
 
 If so, why not just use a select() on the trace_pipe and if it times
 out, then dump the trace. You can even set up a separate instance.
 
 (this is waiting for a schedule switch to pid 1)
 
 cd /sys/kernel/debug/tracing
 mkdir instances/mine
 echo 'next_pid == 1'  instances/mine/events/sched/sched_switch/filter
 echo 1  instances/mine/events/sched/sched_switch/enable
 
 
 The in a userspace program, I open instances/mine/events/trace_pipe
 and run a select() on that file descriptor with a given timeout. If the
 event does not happen within the expected time frame, the select
 returns zero, and this userspace program can deal with it.
 
 Is that the functionality you are trying to achieve?
 
 -- Steve

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


An small ftrace enhancement idea

2013-10-30 Thread Kenneth Lee
Dear Steven,

I want to add a new function to ftrace subsystem. Sometimes, we will face such 
a problem: system do not response to the input on time one to two times 
everyday. It is not easy to capture because it rarely happens. So I want to add 
a function to the kernel. If I have such problem, I insert a kernel module, who 
add a hook to the position that receive the input and another to the position 
that response to the input (with a session id if necessary). And I can compare 
the time between them and if the period is longer then a pre-set threshold, I 
can give a signal to a user helper application (maybe a script waiting on the 
file), which then can save the trace event to a file for later inspection.

 

The user helper script may look like this:

 

#/bin/sh

 

echo ‘sched:*’ > /sys/kernel/debug/tracing/set_event

modprobe delay_inspector.ko threshold=500

cat /sys/kernel/debug/tracing/waiter #wait for signal

cp /sys/kernel/debug/tracing/trace /var/log/delay_infomation

 

 

It looks like a standalone function. But I don’t have place to put it. Do you 
think I can implement it in ftrace? And do you think if there are better 
solution?

 

Thank you.

 

Kenneth Lee--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


An small ftrace enhancement idea

2013-10-30 Thread Kenneth Lee
Dear Steven,

I want to add a new function to ftrace subsystem. Sometimes, we will face such 
a problem: system do not response to the input on time one to two times 
everyday. It is not easy to capture because it rarely happens. So I want to add 
a function to the kernel. If I have such problem, I insert a kernel module, who 
add a hook to the position that receive the input and another to the position 
that response to the input (with a session id if necessary). And I can compare 
the time between them and if the period is longer then a pre-set threshold, I 
can give a signal to a user helper application (maybe a script waiting on the 
file), which then can save the trace event to a file for later inspection.

 

The user helper script may look like this:

 

#/bin/sh

 

echo ‘sched:*’  /sys/kernel/debug/tracing/set_event

modprobe delay_inspector.ko threshold=500

cat /sys/kernel/debug/tracing/waiter #wait for signal

cp /sys/kernel/debug/tracing/trace /var/log/delay_infomation

 

 

It looks like a standalone function. But I don’t have place to put it. Do you 
think I can implement it in ftrace? And do you think if there are better 
solution?

 

Thank you.

 

Kenneth Lee--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/