Re: [PATCH 0/4] bug fixes in ntb_hw_amd and ntb_perf

2019-02-15 Thread Allen Hubbe
On Fri, Feb 15, 2019 at 4:17 AM Mehta, Sanju  wrote:
>
> From: Sanjay R Mehta 
>
> Add bug fix for ntb_perf and ntb_hw_amd
>
> Sanjay R Mehta (4):
>   NTB: ntb_perf: Increased the number of message retries to 1000
>   NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers
>   NTB: ntb_perf: Clear stale values in doorbell and command SPAD
> register
>   NTB: ntb_hw_amd: set peer limit register

This series,
Acked-by: Allen Hubbe 

>
>  drivers/ntb/hw/amd/ntb_hw_amd.c |  8 
>  drivers/ntb/test/ntb_perf.c | 14 +++---
>  2 files changed, 15 insertions(+), 7 deletions(-)
>
> --
> 2.7.4
>


Re: [PATCH] ntb: ntb_transport: Mark expected switch fall-throughs

2018-10-05 Thread Allen Hubbe
On Fri, Oct 5, 2018 at 3:12 AM Gustavo A. R. Silva
 wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
>
> Addresses-Coverity-ID: 1373888 ("Missing break in switch")
> Addresses-Coverity-ID: 1373889 ("Missing break in switch")
> Signed-off-by: Gustavo A. R. Silva 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/ntb_transport.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index 9398959..c643b9c 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -1278,6 +1278,7 @@ static void ntb_rx_copy_callback(void *data,
> case DMA_TRANS_READ_FAILED:
> case DMA_TRANS_WRITE_FAILED:
> entry->errors++;
> +   /* fall through */
> case DMA_TRANS_ABORTED:
> {
> struct ntb_transport_qp *qp = entry->qp;
> @@ -1533,6 +1534,7 @@ static void ntb_tx_copy_callback(void *data,
> case DMA_TRANS_READ_FAILED:
> case DMA_TRANS_WRITE_FAILED:
> entry->errors++;
> +   /* fall through */
> case DMA_TRANS_ABORTED:
> {
> void __iomem *offset =
> --
> 2.7.4
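
For readers unfamiliar with the annotation, here is a minimal standalone sketch of the pattern the patch marks.  The enum, struct and function names are made up for illustration and are not the ntb_transport definitions; only the shape of the switch mirrors the quoted hunks.

```c
#include <assert.h>

/* Hypothetical result codes, standing in for the DMA result enum. */
enum xfer_result {
	XFER_OK,
	XFER_READ_FAILED,
	XFER_WRITE_FAILED,
	XFER_ABORTED,
};

/* Hypothetical counters, for illustration only. */
struct xfer_stats {
	unsigned int errors;
	unsigned int fallbacks;
};

/*
 * Failed transfers count an error and then share the fallback path used by
 * aborted transfers, so the missing break is intentional and is marked.
 */
static void account_result(struct xfer_stats *st, enum xfer_result res)
{
	assert(st);

	switch (res) {
	case XFER_READ_FAILED:
	case XFER_WRITE_FAILED:
		st->errors++;
		/* fall through */
	case XFER_ABORTED:
		st->fallbacks++;
		break;
	case XFER_OK:
		break;
	}
}
```

With the comment in place, a compiler run with -Wimplicit-fallthrough should stay quiet about the marked case and still warn about any unmarked one.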


Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

2018-07-24 Thread Allen Hubbe
On Tue, Jul 24, 2018 at 1:37 PM Logan Gunthorpe  wrote:
> Not really. Given that we know there are only two peers, we always use
> the other side's doorbell register. You'd only use the nearby doorbell
> register if you wanted to trigger your own interrupt -- that would be
> weird and we don't really have the API sophistication to do that.
>
> If we wanted to support multiple peers with some number in crosslink
> then we'd need to revamp things _significantly_. In this case we'd have
> multiple doorbell registers which each apply to a different subset of
> peers. This gets _very_ complicated and hurts my head.

...huh, looks like peer index was omitted from ntb_peer_db_set and
friends.  Adding peer index there would make the interface consistent
with other ntb_peer functions.  Peer index would allow the hw driver
to select which doorbell register to use for each peer.  Adding a
ntb_peer_db_valid_bits to that would allow a subset of bits in the
shared register to be associated with each peer.

I think that's all that would need to change, not significantly more,
to support multiple doorbell registers associated with different
subsets of peers.  The complication would at least be hidden in the hw
driver, where it would need to maintain some mapping from peer index
to the right set of registers.
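
A rough sketch of what that mapping could look like inside a hw driver, written as standalone C so it can be compiled outside the kernel.  Everything here is hypothetical: struct hw_dev, struct peer_db_map, peer_db_valid_bits() and peer_db_set() are illustrative names, not the existing NTB API, which (as noted above) does not take a peer index for doorbells.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PEERS 4

/* Hypothetical per-peer doorbell routing kept inside a hw driver. */
struct peer_db_map {
	volatile uint64_t *db_reg;	/* doorbell register that reaches this peer */
	uint64_t valid_bits;		/* bits of that register owned by this peer */
};

struct hw_dev {
	int peer_count;
	struct peer_db_map peer_db[MAX_PEERS];
};

/* Sketch of a per-peer valid-bits query; returns 0 for a bad index. */
static uint64_t peer_db_valid_bits(const struct hw_dev *dev, int pidx)
{
	assert(dev);

	if (pidx < 0 || pidx >= dev->peer_count)
		return 0;
	return dev->peer_db[pidx].valid_bits;
}

/* Sketch of a doorbell set that takes a peer index. */
static int peer_db_set(struct hw_dev *dev, int pidx, uint64_t db_bits)
{
	assert(dev);

	if (pidx < 0 || pidx >= dev->peer_count)
		return -1;
	/* Reject bits that do not belong to this peer. */
	if (db_bits & ~dev->peer_db[pidx].valid_bits)
		return -1;
	*dev->peer_db[pidx].db_reg |= db_bits;
	return 0;
}
```

The client only ever passes a peer index and a bit mask; which register is written stays a private detail of the driver.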

> But as I said,
> I'm not trying to add new functionality for multi-peer crosslink or
> anything like that. I'm just trying to fix the 2 crosslink peer case so
> it works like it did when it was originally merged.

I thought for sure ntb_peer_db_set already had peer index, and I was
wrong.  Go ahead with the change as in your patch.  I won't force the
issue or ask you to do that extra work and touch all the drivers
again for this.  It can be addressed when there is renewed interest in
making things work with more than one peer.

This patch, and the others in this series:
Acked-by: Allen Hubbe 


Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

2018-07-24 Thread Allen Hubbe
On Mon, Jul 23, 2018 at 12:08 PM Logan Gunthorpe  wrote:
> I don't think you'll ever have a case where two peers have the same
> index, as the index is really an abstract concept the hardware doesn't
> really know about.

That is the point of index, that there should never be two peers with
the same index, and also that the range of index values is bounded.
Port numbers are problematic, so I'm worried about the change to use
port number in the client drivers instead of using index.  For
example, this change assumes that the index value is < bits per long
long, because the value is used in BIT_ULL(port number).
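
To make the concern concrete, a small standalone sketch follows.  The helper names db_bit_for_port() and db_mask_for_ports() are hypothetical; the point is only that a port number has to be range-checked before it is used as a bit position in a 64-bit doorbell mask, because shifting a 64-bit value by 64 or more is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

#define DB_BITS 64	/* width of an unsigned long long doorbell mask */

/*
 * Returns the doorbell bit for a port, or 0 if the port number cannot be
 * represented as a bit position.  The range check must come before the
 * shift.
 */
static uint64_t db_bit_for_port(int port)
{
	if (port < 0 || port >= DB_BITS)
		return 0;
	return UINT64_C(1) << port;
}

/* ORs together the bits of all representable ports in the list. */
static uint64_t db_mask_for_ports(const int *ports, int count)
{
	uint64_t mask = 0;
	int i;

	assert(ports || count == 0);

	for (i = 0; i < count; i++)
		mask |= db_bit_for_port(ports[i]);
	return mask;
}
```

A peer index is bounded by the peer count, so it never hits the rejected branch; an arbitrary port number gives no such guarantee.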

Maybe I'm missing something...  In the crosslink case, there is
another doorbell register on the other side of the crosslink.  Whether
to use the nearby or via-crosslink doorbell depends on the peer
index... making assumptions about the hw driver, but is that about
right?  Then you are selecting bits in the doorbell register based on
port number, ok, that must be how the bits of the shared db register
are associated with ports in your configuration.  Maybe what's
actually needed is a ntb_peer_db_valid_mask(peer index), and if only
the port-numbered db bit (or any other combination) is valid for that
peer, so be it, that can be an implementation choice of the hw driver
and below.


Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

2018-07-23 Thread Allen Hubbe
On Fri, Jul 20, 2018 at 2:00 PM Logan Gunthorpe  wrote:
>
> This commit fixes pingpong support for existing drivers that do not
> implement ntb_default_port_number() and ntb_default_peer_port_number().
> This is required for hardware (like the crosslink topology of
> switchtec) which cannot assign reasonable port numbers to each port due
> to its perfect symmetry.
>
> Instead of picking the doorbell to use based on the index of the
> peer, we use the peer's port number. This is a bit clearer and easier
> to understand.

Does this solve the issue where two of the port numbers are the
same, because of symmetry over a crosslink?  I think the two ports
with the "same" number should be identified as different peer index,
even if the port numbers are the same.

Maybe the port of any peer connected over the crosslink should be the
local switch's crosslink port.  The local port number might be needed
to configure translation tables in the local switch.  If a globally
unique port number is needed, maybe encode a chip number in some high
bits of the port number?  If a locally unique port number is needed,
maybe encode a path, that could be useful for configuring address
translations across multiple crosslinks.  Encoding a path, then each
port will have a different number, depending on the perspective of the
source port, which could be confusing (already, peer index is local
perspective, so can cause the same kind of confusion).  IMO port
number can be anything useful for specific ntb driver and devices, or
maybe just be informative, but peer index should be useful for ntb
client drivers.
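
As an illustration of the "chip number in some high bits" idea, here is a minimal sketch.  The 8-bit split and the names port_encode(), port_chip() and port_local() are assumptions chosen for the example, not anything defined by the NTB API or by switchtec.

```c
#include <assert.h>

#define PORT_BITS 8				/* assumed width of the local port field */
#define PORT_MASK ((1u << PORT_BITS) - 1)

/* Combine a chip number and a chip-local port into one unique number. */
static unsigned int port_encode(unsigned int chip, unsigned int local_port)
{
	assert(local_port <= PORT_MASK);
	return (chip << PORT_BITS) | local_port;
}

/* Recover the chip number from an encoded port. */
static unsigned int port_chip(unsigned int port)
{
	return port >> PORT_BITS;
}

/* Recover the chip-local port from an encoded port. */
static unsigned int port_local(unsigned int port)
{
	return port & PORT_MASK;
}
```

Two ports that are both "port 3" on their own chips then get different numbers.  A number encoded this way would generally be too large to use as a doorbell bit position, which fits the view above that port number is for the hw driver and peer index is for client drivers.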

> Fixes: c7aeb0afdcc2 ("NTB: ntb_pp: Add full multi-port NTB API support")
> Signed-off-by: Logan Gunthorpe 
> ---
>  drivers/ntb/test/ntb_pingpong.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
> index 65865e460ab8..18d00eec7b02 100644
> --- a/drivers/ntb/test/ntb_pingpong.c
> +++ b/drivers/ntb/test/ntb_pingpong.c
> @@ -121,15 +121,14 @@ static int pp_find_next_peer(struct pp_ctx *pp)
> link = ntb_link_is_up(pp->ntb, NULL, NULL);
>
> /* Find next available peer */
> -   if (link & pp->nmask) {
> +   if (link & pp->nmask)
> pidx = __ffs64(link & pp->nmask);
> -   out_db = BIT_ULL(pidx + 1);

Without +1 here, does this ring the same bell again?

> -   } else if (link & pp->pmask) {
> +   else if (link & pp->pmask)
> pidx = __ffs64(link & pp->pmask);
> -   out_db = BIT_ULL(pidx);
> -   } else {
> +   else
> return -ENODEV;
> -   }
> +
> +   out_db = BIT_ULL(ntb_peer_port_number(pp->ntb, pidx));

Can it not be made to work with peer index?

>
> spin_lock(&pp->lock);
> pp->out_pidx = pidx;
> @@ -303,7 +302,7 @@ static void pp_init_flds(struct pp_ctx *pp)
> break;
> }
>
> -   pp->in_db = BIT_ULL(pidx);
> +   pp->in_db = BIT_ULL(lport);
> pp->pmask = GENMASK_ULL(pidx, 0) >> 1;
> pp->nmask = GENMASK_ULL(pcnt - 1, pidx);
>
> @@ -435,4 +434,3 @@ static void __exit pp_exit(void)
> debugfs_remove_recursive(pp_dbgfs_topdir);
>  }
>  module_exit(pp_exit);
> -
> --
> 2.11.0


Re: [PATCH 0/8] Fix breakage caused by the NTB multi-port patchset

2018-06-09 Thread Allen Hubbe
On Fri, Jun 8, 2018 at 8:08 PM, Logan Gunthorpe  wrote:
> Hey,
>
> Here are all the fixes required to get ntb_test on switchtec working
> again after the multi-port test patches were merged.
>
> I'd appreciate it if future changes can be a) more careful about
> not breaking things, b) communicated more clearly so that better
> review can be done, and c) not merged until sufficient review actually
> is done.
>
> Note, I sent the first patch in this series earlier; please disregard
> the earlier one.
>
> Thanks,
>
> Logan
>
> Logan Gunthorpe (8):
>   NTB: ntb_tool: reading the link file should not end in a NULL byte
>   NTB: Setup the DMA mask globally for all drivers
>   NTB: Fix the default port and peer numbers for legacy drivers
>   NTB: ntb_pingpong: Choose doorbells based on port number
>   NTB: perf: Don't require one more memory window than number of peers
>   NTB: perf: Fix support for hardware that doesn't have port numbers
>   NTB: perf: Fix race condition when run with ntb_test
>   NTB: ntb_test: Fix bug when counting remote files

Thanks Logan.

Series:
Acked-by: Allen Hubbe 

>
>  drivers/ntb/hw/amd/ntb_hw_amd.c |  4 
>  drivers/ntb/hw/idt/ntb_hw_idt.c |  6 --
>  drivers/ntb/hw/intel/ntb_hw_intel.c |  4 
>  drivers/ntb/ntb.c   | 22 ++
>  drivers/ntb/test/ntb_perf.c | 22 +++---
>  drivers/ntb/test/ntb_pingpong.c | 14 ++
>  drivers/ntb/test/ntb_tool.c |  3 +--
>  tools/testing/selftests/ntb/ntb_test.sh |  2 +-
>  8 files changed, 41 insertions(+), 36 deletions(-)
>
> --
> 2.11.0


[PATCH] MAINTAINERS: NTB: Update contact info

2017-12-12 Thread Allen Hubbe
I am no longer employed by Dell EMC.  For the purposes of NTB driver
development and maintenance, please contact me via my personal email.

Signed-off-by: Allen Hubbe 
Signed-off-by: Allen Hubbe 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 82ad0eabce4f..cb7344e37fc5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9712,7 +9712,7 @@ F:drivers/ntb/hw/amd/
 NTB DRIVER CORE
 M: Jon Mason 
 M: Dave Jiang 
-M: Allen Hubbe 
+M: Allen Hubbe 
 L: linux-...@googlegroups.com
 S: Supported
 W: https://github.com/jonmason/ntb/wiki
-- 
2.14.1



RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

2017-12-11 Thread Allen Hubbe
From: Logan Gunthorpe
> On 11/12/17 12:17 PM, Allen Hubbe wrote:
> >> mw_get_align doesn't communicate the fact that the buffer has to be
> >> aligned by its size.
> >
> > Is that not the purpose of the addr_align out parameter of 
> > ntb_mw_get_align()?
> 
> addr_align provides the minimum alignment required by the device but it
> has no idea how big a buffer the caller is trying to create so it can't
> express that it needs to be aligned by its size.
> 
> To be clear, the minimum alignment the Switchtec device requires is 4KB
> so it will return 4k in addr_align. Thus, if you have a 4KB buffer it
> may be aligned to 4KB. But if you have a 1MB buffer it must be aligned
> to the nearest 1M.

In switchtec_ntb_mw_get_align, for the lut case it seems to require alignment 
the same as Intel, aligned to mw size, but for the non-lut case you are saying 
that SZ_4K is not necessarily correct.  The SZ_4K is the minimum, but the 
actual alignment restriction depends on the size of the buffer actually 
translated.  Right?

Also, for the lut case, it looks like the size also has to be the same size as 
the mw.  So, a client can't allocate a smaller buffer, assume we can get one 
that is aligned, point the start of the mw at it, and limit the size of the mw?

For the non-lut case I wonder, with the restriction that addr needs to be 
aligned to the size of the buffer, does the size of the buffer also need to be 
some power of two?  That would make sense, if it determines the alignment.  If 
so, SZ_4K wouldn't be correct for size_align, either.

Do you need the intended buffer size passed in as another parameter to 
ntb_mw_get_align?  The point of ntb_mw_get_align is to figure out all the 
alignment restrictions before allocating memory.
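
A standalone sketch of the check being discussed, under the assumption raised in the question above that the buffer size must be a power of two for "aligned to its size" to be well defined.  The function names are hypothetical, and min_align stands in for the SZ_4K minimum mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static bool is_pow2(uint64_t x)
{
	return x && !(x & (x - 1));
}

/*
 * True if a buffer at addr of the given size could back a window on
 * hardware that only replaces the low address bits: the size must be a
 * power of two of at least min_align, and addr must be aligned to size.
 */
static bool buf_ok_for_window(uint64_t addr, uint64_t size, uint64_t min_align)
{
	assert(is_pow2(min_align));

	if (!is_pow2(size) || size < min_align)
		return false;
	return (addr & (size - 1)) == 0;
}
```

This shows why a single addr_align value cannot express the rule: the same address 0x101000 passes for a 4K buffer and fails for a 1M buffer.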

> >> It may also be that all hardware does not have this
> >> restriction (ie. if the hardware adds to the base address instead of
> >> just replacing the lower bits).
> >>
> >> There is definitely a need to print this error somewhere as I hit this
> >> case and it caused very weird behavior. It was a huge pain to debug.
> >> Also, it's a security issue and huge bug if we end up mapping the memory
> >> we didn't think we were mapping.
> >
> > Of course the driver should validate its parameters and not allow bad
> > mappings.  I was only commenting on the dev_err() message to the console.
> 
> Ok. I still feel like it would be difficult to debug if ntb_transport
> simply was unable to establish a connection without some message in
> dmesg telling the user why.
> 
> Also, keep in mind this is a somewhat unusual occurrence. In most cases
> dma_alloc_coherent() always provides a buffer that is aligned to its
> size. It's just that the CMA (if used) provides a tunable config option
> which allows for larger buffers to not be aligned to their size.
> 
> Logan



RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

2017-12-11 Thread Allen Hubbe
From: Logan Gunthorpe

> mw_get_align doesn't communicate the fact that the buffer has to be
> aligned by its size.

Is that not the purpose of the addr_align out parameter of ntb_mw_get_align()?

> It may also be that all hardware does not have this
> restriction (ie. if the hardware adds to the base address instead of
> just replacing the lower bits).
> 
> There is definitely a need to print this error somewhere as I hit this
> case and it caused very weird behavior. It was a huge pain to debug.
> Also, it's a security issue and huge bug if we end up mapping the memory
> we didn't think we were mapping.

Of course the driver should validate its parameters and not allow bad
mappings.  I was only commenting on the dev_err() message to the console.

What makes best sense for client drivers with regards to ntb api changes is a 
fair argument.  Let's see what others say.

> I don't think it's a good idea for us
> to require clients to check this as that requires a number of checks and
> a client author may forget to add it to their driver. I'd maybe go with
> a check in ntb_mw_set_trans before calling the driver, but that only
> makes sense if all hardware has the same requirement.
> 
> Logan



RE: [PATCH 2/2] ntb_hw_switchtec: Check for alignment of the buffer in mw_set_trans()

2017-12-11 Thread Allen Hubbe
From: Logan Gunthorpe
> With Switchtec hardware, the buffer used for a memory window must be
> aligned to its size (the hardware only replaces the lower bits). In
> certain circumstances dma_alloc_coherent() will not provide a buffer
> that adheres to this requirement like when using the CMA and
> CONFIG_CMA_ALIGNMENT is set lower than the buffer size.
> 
> When we get an unaligned buffer mw_set_trans() should return an error.
> We also log an error so we know the cause of the problem.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Jon Mason 
> ---
>  drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 
> b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> index 709f37fbe232..984b83bc7dd3 100644
> --- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> +++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
> @@ -315,6 +315,19 @@ static int switchtec_ntb_mw_set_trans(struct ntb_dev 
> *ntb, int pidx, int widx,
>   if (xlate_pos < 12)
>   return -EINVAL;
> 
> + if (addr & ((1 << xlate_pos) - 1)) {

!IS_ALIGNED(addr, BIT_ULL(xlate_pos))

> + /*
> +  * In certain circumstances we can get a buffer that is
> +  * not aligned to its size. (Most of the time
> +  * dma_alloc_coherent ensures this). This can happen when
> +  * using large buffers allocated by the CMA
> +  * (see CMA_CONFIG_ALIGNMENT)
> +  */
> + dev_err(&sndev->stdev->dev,
> + "ERROR: Memory window address is not aligned to it's 
> size!\n");

This would be the only ntb hw driver that prints an error in this situation.  
The ntb_mw_get_align() should provide enough information to client drivers to 
determine the alignment requirements before calling ntb_mw_set_trans().  IMO no 
need to print here, but let's see what others say.

> + return -EINVAL;
> + }
> +
>   rc = switchtec_ntb_part_op(sndev, ctl, NTB_CTRL_PART_OP_LOCK,
>  NTB_CTRL_PART_STATUS_LOCKED);
>   if (rc)
> --
> 2.11.0



RE: [PATCH 1/2] ntb_transport: Fix bug with max_mw_size parameter

2017-12-11 Thread Allen Hubbe
From: Logan Gunthorpe
> When using the max_mw_size parameter of ntb_transport to limit the size of
> the Memory windows, communication cannot be established and the queues
> freeze.
> 
> This is because the mw_size that's reported to the peer is correctly
> limited but the size used locally is not. So the MW is initialized
> with a buffer smaller than the window but the TX side is using the
> full window. This means the TX side will be writing to a region of the
> window that points nowhere.
> 
> This is easily fixed by applying the same limit to tx_size in
> ntb_transport_init_queue().
> 
> Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
> Signed-off-by: Logan Gunthorpe 
> Cc: Jon Mason 
> Cc: Dave Jiang 
> Cc: Allen Hubbe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/ntb_transport.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index 045e3dd4750e..9878c48826e3 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -1003,6 +1003,9 @@ static int ntb_transport_init_queue(struct 
> ntb_transport_ctx *nt,
>   mw_base = nt->mw_vec[mw_num].phys_addr;
>   mw_size = nt->mw_vec[mw_num].phys_size;
> 
> + if (max_mw_size && mw_size > max_mw_size)
> + mw_size = max_mw_size;
> +
>   tx_size = (unsigned int)mw_size / num_qps_mw;
>   qp_offset = tx_size * (qp_num / mw_count);
> 
> --
> 2.11.0
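
A small worked example of the arithmetic in the hunk above, as standalone C.  The helper tx_size_for() is hypothetical; it only reproduces the clamp and the division so the effect of the fix can be seen with concrete numbers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per-queue TX size: clamp the window to max_mw_size (0 means no limit),
 * then divide it among the queues sharing the window.
 */
static unsigned int tx_size_for(uint64_t mw_size, uint64_t max_mw_size,
				unsigned int num_qps_mw)
{
	assert(num_qps_mw > 0);

	if (max_mw_size && mw_size > max_mw_size)
		mw_size = max_mw_size;

	return (unsigned int)mw_size / num_qps_mw;
}
```

With a 4 MiB window, max_mw_size of 1 MiB and 4 queues, each queue gets 256 KiB.  Without the clamp each queue would assume 1 MiB and, as the commit message describes, write past the end of the buffer the peer actually set up.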



RE: [PATCH v2 00/15] NTB: Add full multi-port API support to the test drivers

2017-12-04 Thread Allen Hubbe
From: Serge Semin
> The multi-port NTB API was introduced in kernel 4.13 as well as the
> first driver for the true multi-port devices of IDT PCIe-switches
> series. But the test drivers still were left almost unchanged. Yes,
> they didn't fail being used with new NTB API, but they only worked
> with two-ports NTB devices. This patchset is intended to fix the
> issue, by amending the NTB test drivers and script so they would be
> fully compatible with multi-port NTB API.
> 
> Additionally I found a few NTB subsystem issues while developing the
> submitted patches. So they are also fixed in this patchset

Thanks for bringing the multiport support to the ntb tools and tests, and 
getting it all tested with help of @pallam.  I wondered about setting the dma 
mask on the ntb device object, and it is better now that you have done that.  
In addition to adding the contact information to the file comments,
MODULE_AUTHOR can also be specified more than once per module.

As Logan said, some of the renaming was not really necessary and made those
patches more noisy than they needed to be.  I am not as much bothered by it, but
it is a valid criticism.

I can't find an earlier comment I thought I had made regarding the changes in 
ntb_tool.  Maybe it was only on IRC.  I think the use of the anonymous unions 
tool_mw is confusing.  It seems to require the reader to first understand how 
local and peer mw setup works, and then see that the anonymous unions are 
accessed by the proper member names depending on the situation.  In my review, 
the use of those members appears to be correct in your code.  Since these 
drivers are also intended to be example code, I would have preferred distinct 
types instead of a common type with anonymous union members.  Distinct types 
would make type checking by the compiler more effective at catching improper 
use, and I think it would make the code more clear in its use as an example ntb 
driver.  Also, there might be some confusion in the naming of members "mw", 
since it might refer to a tool_mw or a tool_mw_wrap, and the name alone doesn't 
disclose the type; finding it requires reading and understanding the 
surrounding context in the code.
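
To illustrate the "distinct types" preference, a minimal sketch follows.  The struct and field names are invented for the example and are not the ntb_tool definitions; the only claim is the general one that two separate struct types let the compiler reject a mix-up that a shared type with union members would accept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a buffer this side allocates for the peer to write into. */
struct inbound_mw {
	void *buf;
	size_t size;
	uint64_t dma_base;
};

/* Hypothetical: a mapping of the peer's memory through a local BAR. */
struct outbound_mw {
	volatile void *io_base;
	size_t size;
	uint64_t xlat_base;
};

/*
 * Only meaningful for an inbound window.  Passing a struct outbound_mw
 * pointer here is diagnosed at compile time, where a common type with
 * anonymous union members would compile and read the wrong member.
 */
static size_t inbound_mw_free_space(const struct inbound_mw *mw, size_t used)
{
	assert(mw && used <= mw->size);
	return mw->size - used;
}
```

The cost is some duplication of the shared fields, which seems acceptable in code that is also meant to serve as an example.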

I am satisfied with these patches.  This is important for getting the multiport 
support established, and other than some comments about style and presentation 
of the patch set, nothing looks obviously wrong.  You should probably also seek 
Dave's ack on at least ntb_perf.

Acked-by: Allen Hubbe 

> Serge Semin (15):
>   NTB: Rename NTB messaging API methods
>   NTB: Set dma mask and dma coherent mask to NTB devices
>   NTB: Fix UB/bug in ntb_mw_get_align()
>   NTB: ntb_pp: Add full multi-port NTB API support
>   NTB: ntb_tool: Add full multi-port NTB API support
>   NTB: ntb_perf: Add full multi-port NTB API support
>   NTB: ntb_test: Safely use paths with whitespace
>   NTB: ntb_test: Add ntb_tool port tests
>   NTB: ntb_test: Update ntb_tool link tests
>   NTB: ntb_test: Update ntb_tool DB tests
>   NTB: ntb_test: Update ntb_tool Scratchpad tests
>   NTB: ntb_test: Add ntb_tool Message tests
>   NTB: ntb_test: Update ntb_tool MW tests
>   NTB: ntb_test: Update ntb_perf tests
>   NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology
> 
>  drivers/ntb/hw/amd/ntb_hw_amd.c |4 +
>  drivers/ntb/hw/idt/ntb_hw_idt.c |   37 +-
>  drivers/ntb/hw/intel/ntb_hw_intel.c |4 +
>  drivers/ntb/ntb.c   |1 -
>  drivers/ntb/test/ntb_perf.c | 1826 
> +--
>  drivers/ntb/test/ntb_pingpong.c |  450 +---
>  drivers/ntb/test/ntb_tool.c | 1805 --
>  include/linux/ntb.h |   36 +-
>  tools/testing/selftests/ntb/ntb_test.sh |  307 --
>  9 files changed, 3013 insertions(+), 1457 deletions(-)
> 
> --
> 2.12.0



RE: [PATCH v3 14/16] switchtec_ntb: implement scratchpad registers

2017-08-02 Thread Allen Hubbe
From: Logan Gunthorpe
> On 01/08/17 01:10 PM, Jon Mason wrote:
> > It would probably be better if I remarked about the SPADs in the actual
> > patch about the SPADS :)
> >
> > The whole point of using the SPADs in the NTB driver was to workaround
> > the problems establishing a connection between the two sides of the
> > NTB and where everything lives.  So, using a MW to get around the
> > SPADs is sort of backwards (and slightly funny).  I realize you are
> > trying to use the existing transport with minimal changes to enable
> > your hardware, and thus this makes logical sense to you.  However, if
> > the SPADs are not really needed, then we should either remove them
> > from the transport (or use them for something else).
> >
> > Per my comment in the other patch, I'm amenable to take this series
> > as-is, assuming you are willing to address this design issue in the
> > near future.  Thoughts?
> 
> Yes, I agree. I'd be willing to help but it seems the clients are
> written the way they are for the other drivers, so it's their needs
> (which I'm not fully aware of) that have to be considered.

The proposed change, removing use of spads from transport, would not affect 
ntrdma.

> I've also made all the other changes you sent as well as the file rename
> Dave requested. Once I see the bug fix patch you were going to pull hit
> ntb-next I'll rebase, test and resubmit.
> 
> Thanks,
> 
> Logan



RE: [PATCH v3 00/16] Switchtec NTB Support

2017-07-26 Thread Allen Hubbe
From: Logan Gunthorpe
> Changes since v2:
> 
> - Reordered the ntb_test link patch per Allen
> - Removed an extra call to switchtec_ntb_init_mw
> - Fixed a typo in the switchtec.txt documentation.

Patches 5..16 (also 5 [was 6], and 14, objections notwithstanding):

Acked-by: Allen Hubbe 

> --
> 
> Changes since v1:
> 
> - Rebased onto latest ntb-next branch (with v4.13-rc1)
> - Reworked ntb_mw_count() function so that it can be called all the
>   time (per discussion with Allen)
> - Various spelling and formatting cleanups from Bjorn
> - Added request_module() call such that the NTB module is automatically
>   loaded when appropriate hardware exists.
> 
> --
> 
> Changes since the rfc:
> 
> - Rebased on ntb-next
> - Switched ntb_part_op to use sleep instead of delay
> - Dropped a number of useless dbg __func__ prints
> - Went back to the dynamic instead of the static class
> - Swapped the notifier block for a simple callback
> - Modified the new ntb api so that a couple functions with pidx
>   now must be called after link up. Per our discussion on the list.
> 
> --
> 
> This patchset implements Non-Transparent Bridge (NTB) support for the
> Microsemi Switchtec series of switches. We're looking for some
> review from the community at this point but hope to get it upstreamed
> for v4.14.
> 
> Switchtec NTB support is configured over the same function and bar
> as the management endpoint. Thus, the new driver hooks into the
> management driver which we had merged in v4.12. We use the class
> interface API to register an NTB device for every switchtec device
> which supports NTB (not all do).
> 
> The Switchtec hardware supports doorbells, memory windows and messages.
> Seeing there is no native scratchpad support, 128 spads are emulated
> through the use of a pre-setup memory window. The switch has 64
> doorbells which are shared between the two partitions and a
> configurable set of memory windows. While the hardware supports more
> than 2 partitions, this driver only supports the first two seeing
> the current NTB API only supports two hosts.
> 
> The driver has been tested with ntb_netdev and fully passes the
> ntb_test script.
> 
> This patchset is based off of ntb-next and can be found in this
> git repo:
> 
> https://github.com/sbates130272/linux-p2pmem.git switchtec_ntb_v3
> 
> Logan Gunthorpe (16):
>   switchtec: move structure definitions into a common header
>   switchtec: export class symbol for use in upper layer driver
>   switchtec: add NTB hardware register definitions
>   switchtec: add link event notifier callback
>   ntb: ntb_test: ensure the link is up before trying to configure the
> mws
>   ntb: ensure ntb_mw_get_align() is only called when the link is up
>   ntb: add check and comment for link up to mw_count() and
> mw_get_align()
>   switchtec_ntb: introduce initial NTB driver
>   switchtec_ntb: initialize hardware for memory windows
>   switchtec_ntb: initialize hardware for doorbells and messages
>   switchtec_ntb: add skeleton NTB driver
>   switchtec_ntb: add link management
>   switchtec_ntb: implement doorbell registers
>   switchtec_ntb: implement scratchpad registers
>   switchtec_ntb: add memory window support
>   switchtec_ntb: update switchtec documentation with notes for NTB
> 
>  Documentation/switchtec.txt |   12 +
>  MAINTAINERS |2 +
>  drivers/ntb/hw/Kconfig  |1 +
>  drivers/ntb/hw/Makefile |1 +
>  drivers/ntb/hw/mscc/Kconfig |9 +
>  drivers/ntb/hw/mscc/Makefile|1 +
>  drivers/ntb/hw/mscc/switchtec_ntb.c | 1211 
> +++
>  drivers/ntb/ntb_transport.c |   20 +-
>  drivers/ntb/test/ntb_perf.c |   18 +-
>  drivers/ntb/test/ntb_tool.c |6 +-
>  drivers/pci/switch/switchtec.c  |  316 ++--
>  include/linux/ntb.h |   11 +-
>  include/linux/switchtec.h   |  373 ++
>  tools/testing/selftests/ntb/ntb_test.sh |4 +
>  14 files changed, 1702 insertions(+), 283 deletions(-)
>  create mode 100644 drivers/ntb/hw/mscc/Kconfig
>  create mode 100644 drivers/ntb/hw/mscc/Makefile
>  create mode 100644 drivers/ntb/hw/mscc/switchtec_ntb.c
>  create mode 100644 include/linux/switchtec.h
> 
> --
> 2.11.0



RE: [PATCH v2 00/16] Switchtec NTB Support

2017-07-18 Thread Allen Hubbe
From: Logan Gunthorpe
> Changes since v1:
> 
> - Rebased onto latest ntb-next branch (with v4.13-rc1)
> - Reworked ntb_mw_count() function so that it can be called all the
>   time (per discussion with Allen)
> - Various spelling and formatting cleanups from Bjorn
> - Added request_module() call such that the NTB module is automatically
>   loaded when appropriate hardware exists.

Patches 5..16:

Acked-by: Allen Hubbe 

Should the order of 6 and 7 be swapped?



While I still have some objections to this series, we have already been over 
them, and I won't let these stand in the way:

6 - I think just the comment is best.  Rather than prohibit the use of 
functionality for hardware that does support the calls, in my opinion it should 
be left to specific hardware drivers to return an error.

14 - I would prefer that a non-hardware-supported implementation of spads via a 
memory window should be common code, not reinvented in each specific hardware 
driver that lacks hardware spads.  You pointed out that the spads implemented 
here use the shared_mw construct, which is specific to this driver.  I am 
concerned that spads in the shared_mw (really anything in shared_mw, including 
this driver's indication of the peer link state) limits the applicability of 
this driver to just configurations of two ntb ports.  You stated this 
limitation upfront.
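
A sketch of what a common, hardware-independent spad emulation might look like, reduced to standalone C.  The names struct emu_spads, emu_spad_read() and emu_peer_spad_write() are hypothetical; the count of 128 matches the cover letter, and the two pointers stand in for a local buffer and the peer's buffer as seen through a memory window.

```c
#include <assert.h>
#include <stdint.h>

#define EMU_SPAD_COUNT 128

/* Hypothetical emulated scratchpads for one side of a two-port link. */
struct emu_spads {
	volatile uint32_t *local;	/* region the peer writes and we read */
	volatile uint32_t *peer;	/* region we write through the window */
};

/* Read one of our own scratchpads; returns 0 for a bad index. */
static uint32_t emu_spad_read(const struct emu_spads *s, int idx)
{
	assert(s);

	if (idx < 0 || idx >= EMU_SPAD_COUNT)
		return 0;
	return s->local[idx];
}

/* Write one of the peer's scratchpads; returns -1 for a bad index. */
static int emu_peer_spad_write(struct emu_spads *s, int idx, uint32_t val)
{
	assert(s);

	if (idx < 0 || idx >= EMU_SPAD_COUNT)
		return -1;
	s->peer[idx] = val;
	return 0;
}
```

Note that a single peer pointer bakes in the two-port assumption, which is the limitation described above; a multi-port version would need one such region per peer.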

> 
> --
> 
> Changes since the rfc:
> 
> - Rebased on ntb-next
> - Switched ntb_part_op to use sleep instead of delay
> - Dropped a number of useless dbg __func__ prints
> - Went back to the dynamic instead of the static class
> - Swapped the notifier block for a simple callback
> - Modified the new ntb api so that a couple functions with pidx
>   now must be called after link up. Per our discussion on the list.
> 
> --
> 
> This patchset implements Non-Transparent Bridge (NTB) support for the
> Microsemi Switchtec series of switches. We're looking for some
> review from the community at this point but hope to get it upstreamed
> for v4.14.
> 
> Switchtec NTB support is configured over the same function and bar
> as the management endpoint. Thus, the new driver hooks into the
> management driver which we had merged in v4.12. We use the class
> interface API to register an NTB device for every switchtec device
> which supports NTB (not all do).
> 
> The Switchtec hardware supports doorbells, memory windows and messages.
> Seeing there is no native scratchpad support, 128 spads are emulated
> through the use of a pre-setup memory window. The switch has 64
> doorbells which are shared between the two partitions and a
> configurable set of memory windows. While the hardware supports more
> than 2 partitions, this driver only supports the first two seeing
> the current NTB API only supports two hosts.
> 
> The driver has been tested with ntb_netdev and fully passes the
> ntb_test script.
> 
> This patchset is based off of ntb-next and can be found in this
> git repo:
> 
> https://github.com/sbates130272/linux-p2pmem.git switchtec_ntb_v2
> 
> Logan Gunthorpe (16):
>   switchtec: move structure definitions into a common header
>   switchtec: export class symbol for use in upper layer driver
>   switchtec: add NTB hardware register definitions
>   switchtec: add link event notifier callback
>   ntb: ensure ntb_mw_get_align() is only called when the link is up
>   ntb: add check and comment for link up to mw_count() and
> mw_get_align()
>   ntb: ntb_test: ensure the link is up before trying to configure the
> mws
>   switchtec_ntb: introduce initial NTB driver
>   switchtec_ntb: initialize hardware for memory windows
>   switchtec_ntb: initialize hardware for doorbells and messages
>   switchtec_ntb: add skeleton NTB driver
>   switchtec_ntb: add link management
>   switchtec_ntb: implement doorbell registers
>   switchtec_ntb: implement scratchpad registers
>   switchtec_ntb: add memory window support
>   switchtec_ntb: update switchtec documentation with notes for NTB
> 
>  Documentation/switchtec.txt |   12 +
>  MAINTAINERS |2 +
>  drivers/ntb/hw/Kconfig  |1 +
>  drivers/ntb/hw/Makefile |1 +
>  drivers/ntb/hw/mscc/Kconfig |9 +
>  drivers/ntb/hw/mscc/Makefile|1 +
>  drivers/ntb/hw/mscc/switchtec_ntb.c | 1213 
> +++
>  drivers/ntb/ntb_transport.c |   20 +-
>  drivers/ntb/test/ntb_perf.c |   18 +-
>  drivers/ntb/test/ntb_tool.c |6 +-
>  drivers/pci/switch/switchtec.c  |  316 ++--
>  include/linux/ntb.h |   11 +-
>  include/linux/switchtec.h   |  373 ++
>  tools

RE: [PATCH v4 4/5] ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks

2017-07-18 Thread Allen Hubbe
From: Logan Gunthorpe
> Now that ioread64 and iowrite64 are available in io-64-nonatomic,
> we can remove the hack at the top of ntb_hw_intel.c and replace it
> with an include.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Jon Mason 
> Cc: Allen Hubbe 
> Acked-by: Dave Jiang 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 30 +-
>  1 file changed, 1 insertion(+), 29 deletions(-)
> 
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 2557e2c05b90..606c90f59d4b 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -59,6 +59,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "ntb_hw_intel.h"
> 
> @@ -155,35 +156,6 @@ MODULE_PARM_DESC(xeon_b2b_dsd_bar5_addr32,
>  static inline enum ntb_topo xeon_ppd_topo(struct intel_ntb_dev *ndev, u8 
> ppd);
>  static int xeon_init_isr(struct intel_ntb_dev *ndev);
> 
> -#ifndef ioread64
> -#ifdef readq
> -#define ioread64 readq
> -#else
> -#define ioread64 _ioread64
> -static inline u64 _ioread64(void __iomem *mmio)
> -{
> - u64 low, high;
> -
> - low = ioread32(mmio);
> - high = ioread32(mmio + sizeof(u32));
> - return low | (high << 32);
> -}
> -#endif
> -#endif
> -
> -#ifndef iowrite64
> -#ifdef writeq
> -#define iowrite64 writeq
> -#else
> -#define iowrite64 _iowrite64
> -static inline void _iowrite64(u64 val, void __iomem *mmio)
> -{
> - iowrite32(val, mmio);
> - iowrite32(val >> 32, mmio + sizeof(u32));
> -}
> -#endif
> -#endif
> -
>  static inline int pdev_is_atom(struct pci_dev *pdev)
>  {
>   switch (pdev->device) {
> --
> 2.11.0
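
For readers who have not seen the removed fallback before, here is a rough
user-space sketch of the same idea: a 64-bit access built from two 32-bit
accesses, low word first.  The sketch_* names are hypothetical and the
accessors touch plain memory instead of MMIO, so this only illustrates the
ordering and the shift, not the kernel helpers themselves.  The access is not
atomic; another agent can observe the two halves separately.

```c
#include <stdint.h>

/* Hypothetical stand-ins for 32-bit MMIO accessors; here they touch RAM. */
static uint32_t sketch_read32(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

static void sketch_write32(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Read the low 32 bits first, then the high 32 bits (not atomic). */
static uint64_t sketch_read64_lo_hi(const volatile void *addr)
{
	const volatile uint8_t *p = addr;
	uint64_t low = sketch_read32(p);
	uint64_t high = sketch_read32(p + sizeof(uint32_t));

	return low | (high << 32);
}

/* Write the low 32 bits first, then the high 32 bits (not atomic). */
static void sketch_write64_lo_hi(uint64_t val, volatile void *addr)
{
	volatile uint8_t *p = addr;

	sketch_write32((uint32_t)val, p);
	sketch_write32((uint32_t)(val >> 32), p + sizeof(uint32_t));
}
```

Widening both halves to 64 bits before the shift is what keeps
`high << 32` well defined.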



RE: [PATCH 06/16] ntb: add check and comment for link up to mw_count and mw_get_align

2017-06-29 Thread Allen Hubbe
Let me try that again...

From: Hubbe, Allen
> From: Logan Gunthorpe
> > On 6/29/2017 2:13 PM, Allen Hubbe wrote:
> > > Unfortunately, it is to work around hardware errata.  That is not so 
> > > trivial to fix.
> >
> > Can you describe more what the work around is doing? Can you share the
> > code? It seems odd that a workaround is based on the alignment
> > restrictions of the mws.
> 
> Sure, while not making any claim that this is ready for upstream.
> 
> It is not a workaround for alignment restrictions of the mws.  It is a
> restriction to avoid the use of doorbells and scratchpads.  Memory windows
> are used exclusively.
> 
> Read msi-x local  and send that to the peer:

https://github.com/ntrdma/ntrdma/blob/master/drivers/ntc/ntc_ntb_msi.c#L583

> Transform peer's addr to the memory window region:

https://github.com/ntrdma/ntrdma/blob/master/drivers/ntc/ntc_ntb_msi.c#L603

> Append a dma immediate value operation after other operations, to write the 
> data at addr:

https://github.com/ntrdma/ntrdma/blob/master/drivers/ntc/ntc_ntb_msi.c#L1195

> 
> Above describes the workaround.
> 
> Used in the context of a rdma-over-ntb driver here:
> https://github.com/ntrdma/ntrdma/blob/master/drivers/infiniband/hw/ntrdma/ntrdma_qp.c#L1585
> 
> >
> > Logan



RE: [PATCH 06/16] ntb: add check and comment for link up to mw_count and mw_get_align

2017-06-29 Thread Allen Hubbe
From: Logan Gunthorpe
> On 6/29/2017 12:11 PM, Allen Hubbe wrote:
> > Nak.  This breaks a work around for stability issues on some hardware.  I
> > am ok with changing the comment to say, this is only supported when called
> > after link up.  I would still like to allow these to be called at any time.
> > Specific hardware drivers like Switchtec may return an error.  Upstream
> > drivers, of course, should call these after link up: patch 5/16 part of
> > this set looks good.
> 
> If absolutely necessary I can leave this out. But in terms of interface
> design it's _so_ much better to have it in. This change would bring the
> score from a 3 to a 5 on Rusty Russell's Hard to Misuse ranking[1]. To
> quote Rusty:
> 
> "3. Read the documentation and you'll get it right.
> People only read instructions after they've already tied themselves into
> a knot. Then they skim them for keywords and don't read your warnings. I
> don't give an example of this; if this is the best an interface can get
> do, it's in trouble."
> 
> Can someone not just fix the out-of-tree code? And since when is
> out-of-tree code reasonable justification for what's done in upstream?

Unfortunately, it is to work around hardware errata.  That is not so trivial to 
fix.

The workaround in software is also not acceptable upstream, for doing things 
like writing directly to the peer's interrupt controller registers, bypassing 
the ntb doorbells.

> 
> Logan
> 
> [1]http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html



RE: [PATCH 06/16] ntb: add check and comment for link up to mw_count and mw_get_align

2017-06-29 Thread Allen Hubbe
From: Logan Gunthorpe
> This patch adds a comment and a check to the ntb_mw_get_align and
> ntb_mw_count functions so that they always fail if the function is
> called before the link is up.
> 
> This is to prevent accidental mis-use in clients that are testing
> on hardware that this doesn't matter for.
> 
> Signed-off-by: Logan Gunthorpe 

Nak.  This breaks a work around for stability issues on some hardware.  I am ok 
with changing the comment to say, this is only supported when called after link 
up.  I would still like to allow these to be called at any time.  Specific 
hardware drivers like Switchtec may return an error.  Upstream drivers, of 
course, should call these after link up: patch 5/16 part of this set looks good.
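
To make the alternative concrete, here is a small user-space sketch of what I
mean by leaving the check to the hardware driver.  Everything named sketch_*
is hypothetical and only mimics the shape of the ops table; it is not the ntb
api.  The common wrapper forwards the call unconditionally, one driver answers
at any time, and another returns -ENOTCONN until its link is up.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_dev;

struct sketch_ops {
	int (*mw_get_align)(struct sketch_dev *dev, int pidx,
			    size_t *addr_align);
};

struct sketch_dev {
	const struct sketch_ops *ops;
	unsigned long link_mask;	/* bit N set: link to peer N is up */
};

/* Common wrapper: no link check here, the call is always forwarded. */
static int sketch_mw_get_align(struct sketch_dev *dev, int pidx,
			       size_t *addr_align)
{
	return dev->ops->mw_get_align(dev, pidx, addr_align);
}

/* Driver whose alignment is a local property: answers at any time. */
static int sketch_local_get_align(struct sketch_dev *dev, int pidx,
				  size_t *addr_align)
{
	(void)dev;
	(void)pidx;
	if (addr_align)
		*addr_align = 4096;
	return 0;
}

/* Driver that depends on the peer: refuses until the link is up. */
static int sketch_peer_dep_get_align(struct sketch_dev *dev, int pidx,
				     size_t *addr_align)
{
	if (!(dev->link_mask & (1UL << pidx)))
		return -ENOTCONN;
	if (addr_align)
		*addr_align = 65536;
	return 0;
}

static const struct sketch_ops sketch_local_ops = {
	.mw_get_align = sketch_local_get_align,
};

static const struct sketch_ops sketch_peer_dep_ops = {
	.mw_get_align = sketch_peer_dep_get_align,
};
```

With this shape, a client that calls before link up still gets a clear error
on hardware that cannot answer, and hardware that can answer is not blocked.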

> ---
>  include/linux/ntb.h | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 609e232c00da..de4f800c82b6 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -730,12 +730,16 @@ static inline int ntb_link_disable(struct ntb_dev *ntb)
>   * Hardware and topology may support a different number of memory windows.
>   * Moreover different peer devices can support different number of memory
>   * windows. Simply speaking this method returns the number of possible 
> inbound
> - * memory windows to share with specified peer device.
> + * memory windows to share with specified peer device. Note: this must only 
> be
> + * called when the link is up.
>   *
>   * Return: the number of memory windows.
>   */
>  static inline int ntb_mw_count(struct ntb_dev *ntb, int pidx)
>  {
> + if (!(ntb_link_is_up(ntb, NULL, NULL) & (1 << pidx)))
> + return 0;
> +
>   return ntb->ops->mw_count(ntb, pidx);
>  }
> 
> @@ -751,7 +755,7 @@ static inline int ntb_mw_count(struct ntb_dev *ntb, int 
> pidx)
>   * Get the alignments of an inbound memory window with specified index.
>   * NULL may be given for any output parameter if the value is not needed.
>   * The alignment and size parameters may be used for allocation of proper
> - * shared memory.
> + * shared memory. Note: this must only be called when the link is up.
>   *
>   * Return: Zero on success, otherwise a negative error number.
>   */
> @@ -760,6 +764,9 @@ static inline int ntb_mw_get_align(struct ntb_dev *ntb, 
> int pidx, int widx,
>  resource_size_t *size_align,
>  resource_size_t *size_max)
>  {
> + if (!(ntb_link_is_up(ntb, NULL, NULL) & (1 << pidx)))
> + return -ENOTCONN;
> +
>   return ntb->ops->mw_get_align(ntb, pidx, widx, addr_align, size_align,
> size_max);
>  }
> --
> 2.11.0



RE: [PATCH 05/16] ntb: ensure ntb_mw_get_align is only called when the link is up

2017-06-29 Thread Allen Hubbe
From: Logan Gunthorpe
> With switchtec hardware it's impossible to get the alignment parameters
> for a peer's memory window until the peer's driver has configured its
> windows. Strictly speaking, the link doesn't have to be up for this,
> but the link being up is the only way the client can tell that
> the other side has been configured.
> 
> This patch converts ntb_transport and ntb_perf to use this function after
> the link goes up. This simplifies these clients slightly because they
> no longer have to store the alignment parameters. It also tweaks
> ntb_tool so that peer_mw_trans will print zero if it is run before
> the link goes up.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/ntb_transport.c | 20 ++--
>  drivers/ntb/test/ntb_perf.c | 18 +-
>  drivers/ntb/test/ntb_tool.c |  6 +++---
>  3 files changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index b29558ddfe95..7ed745aaa213 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -191,8 +191,6 @@ struct ntb_transport_qp {
>  struct ntb_transport_mw {
>   phys_addr_t phys_addr;
>   resource_size_t phys_size;
> - resource_size_t xlat_align;
> - resource_size_t xlat_align_size;
>   void __iomem *vbase;
>   size_t xlat_size;
>   size_t buff_size;
> @@ -687,13 +685,20 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int 
> num_mw,
>   struct ntb_transport_mw *mw = &nt->mw_vec[num_mw];
>   struct pci_dev *pdev = nt->ndev->pdev;
>   size_t xlat_size, buff_size;
> + resource_size_t xlat_align;
> + resource_size_t xlat_align_size;
>   int rc;
> 
>   if (!size)
>   return -EINVAL;
> 
> - xlat_size = round_up(size, mw->xlat_align_size);
> - buff_size = round_up(size, mw->xlat_align);
> + rc = ntb_mw_get_align(nt->ndev, PIDX, num_mw, &xlat_align,
> +   &xlat_align_size, NULL);
> + if (rc)
> + return rc;
> +
> + xlat_size = round_up(size, xlat_align_size);
> + buff_size = round_up(size, xlat_align);
> 
>   /* No need to re-setup */
>   if (mw->xlat_size == xlat_size)
> @@ -722,7 +727,7 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int 
> num_mw,
>* is a requirement of the hardware. It is recommended to setup CMA
>* for BAR sizes equal or greater than 4MB.
>*/
> - if (!IS_ALIGNED(mw->dma_addr, mw->xlat_align)) {
> + if (!IS_ALIGNED(mw->dma_addr, xlat_align)) {
>   dev_err(&pdev->dev, "DMA memory %pad is not aligned\n",
>   &mw->dma_addr);
>   ntb_free_mw(nt, num_mw);
> @@ -1106,11 +,6 @@ static int ntb_transport_probe(struct ntb_client 
> *self, struct ntb_dev *ndev)
>   for (i = 0; i < mw_count; i++) {
>   mw = &nt->mw_vec[i];
> 
> - rc = ntb_mw_get_align(ndev, PIDX, i, &mw->xlat_align,
> -   &mw->xlat_align_size, NULL);
> - if (rc)
> - goto err1;
> -
>   rc = ntb_peer_mw_get_addr(ndev, i, &mw->phys_addr,
> &mw->phys_size);
>   if (rc)
> diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
> index 759f772fa00c..427112cf101a 100644
> --- a/drivers/ntb/test/ntb_perf.c
> +++ b/drivers/ntb/test/ntb_perf.c
> @@ -108,8 +108,6 @@ MODULE_PARM_DESC(on_node, "Run threads only on NTB device 
> node (default: true)")
>  struct perf_mw {
>   phys_addr_t phys_addr;
>   resource_size_t phys_size;
> - resource_size_t xlat_align;
> - resource_size_t xlat_align_size;
>   void __iomem*vbase;
>   size_t  xlat_size;
>   size_t  buf_size;
> @@ -472,13 +470,20 @@ static int perf_set_mw(struct perf_ctx *perf, 
> resource_size_t size)
>  {
>   struct perf_mw *mw = &perf->mw;
>   size_t xlat_size, buf_size;
> + resource_size_t xlat_align;
> + resource_size_t xlat_align_size;
>   int rc;
> 
>   if (!size)
>   return -EINVAL;
> 
> - xlat_size = round_up(size, mw->xlat_align_size);
> - buf_size = round_up(size, mw->xlat_align);
> + rc = ntb_mw_get_align(perf->ntb, PIDX, 0, &xlat_align,
> +   &xlat_align_size, NULL);
> + if (rc)
> + return rc;
> +
> + xlat_size = round_up(size, xlat_align_si

RE: [PATCH 12/16] switchtec_ntb: add link management

2017-06-29 Thread Allen Hubbe
From: Logan Gunthorpe
> switchtec_ntb checks for a link by looking at the shared memory
> window. If the magic number is correct and the other side indicates
> their link is enabled then we take the link to be up.
> 
> Whenever we change our local link status we send a msg to the
> other side to check whether it's up and change their status.
> 
> The current status is maintained in a flag so ntb_link_is_up
> can return quickly.
> 
> We utilize switchtec's link status notifier to also check link changes
> when the switch notices a port changes state.
> 
> Signed-off-by: Logan Gunthorpe 
> Reviewed-by: Stephen Bates 
> Reviewed-by: Kurt Schwemmer 
> ---
>  drivers/ntb/hw/mscc/switchtec_ntb.c | 130 
> +++-
>  1 file changed, 129 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/hw/mscc/switchtec_ntb.c b/drivers/ntb/hw/mscc/switchtec_ntb.c
> index 0587b2380bcc..8ef84d45bda6 100644
> --- a/drivers/ntb/hw/mscc/switchtec_ntb.c
> +++ b/drivers/ntb/hw/mscc/switchtec_ntb.c
> @@ -64,6 +64,7 @@ static inline void _iowrite64(u64 val, void __iomem *mmio)
> 
>  struct shared_mw {
>   u32 magic;
> + u32 link_sta;
>   u32 partition_id;
>   u16 nr_direct_mw;
>   u16 nr_lut_mw;
> @@ -102,8 +103,17 @@ struct switchtec_ntb {
>   int nr_direct_mw;
>   int nr_lut_mw;
>   int direct_mw_to_bar[MAX_DIRECT_MW];
> +
> + bool link_is_up;
> + enum ntb_speed link_speed;
> + enum ntb_width link_width;
>  };
> 
> +static struct switchtec_ntb *ntb_sndev(struct ntb_dev *ntb)
> +{
> + return container_of(ntb, struct switchtec_ntb, ntb);
> +}
> +
>  static int switchtec_ntb_part_op(struct switchtec_ntb *sndev,
>struct ntb_ctrl_regs __iomem *ctl,
>u32 op, int wait_status)
> @@ -161,6 +171,17 @@ static int switchtec_ntb_part_op(struct switchtec_ntb 
> *sndev,
>   return -EIO;
>  }
> 
> +static int switchtec_ntb_send_msg(struct switchtec_ntb *sndev, int idx,
> +   u32 val)
> +{
> + if (idx < 0 || idx >= ARRAY_SIZE(sndev->mmio_self_dbmsg->omsg))
> + return -EINVAL;
> +
> + iowrite32(val, &sndev->mmio_self_dbmsg->omsg[idx].msg);
> +
> + return 0;
> +}
> +
>  static int switchtec_ntb_mw_count(struct ntb_dev *ntb, int pidx)
>  {
>   return 0;
> @@ -192,22 +213,124 @@ static int switchtec_ntb_peer_mw_get_addr(struct 
> ntb_dev *ntb, int idx,
>   return 0;
>  }
> 
> +static void switchtec_ntb_part_link_speed(struct switchtec_ntb *sndev,
> +   int partition,
> +   enum ntb_speed *speed,
> +   enum ntb_width *width)
> +{
> + struct switchtec_dev *stdev = sndev->stdev;
> +
> + u32 pff = ioread32(&stdev->mmio_part_cfg[partition].vep_pff_inst_id);
> + u32 linksta = ioread32(&stdev->mmio_pff_csr[pff].pci_cap_region[13]);
> +
> + if (speed)
> + *speed = (linksta >> 16) & 0xF;
> +
> + if (width)
> + *width = (linksta >> 20) & 0x3F;
> +}
> +
> +static void switchtec_ntb_set_link_speed(struct switchtec_ntb *sndev)
> +{
> + enum ntb_speed self_speed, peer_speed;
> + enum ntb_width self_width, peer_width;
> +
> + if (!sndev->link_is_up) {
> + sndev->link_speed = NTB_SPEED_NONE;
> + sndev->link_width = NTB_WIDTH_NONE;
> + return;
> + }
> +
> + switchtec_ntb_part_link_speed(sndev, sndev->self_partition,
> +   &self_speed, &self_width);
> + switchtec_ntb_part_link_speed(sndev, sndev->peer_partition,
> +   &peer_speed, &peer_width);

Should we only set self_partition?  I think each peer should be able to set 
preferred speed, and negotiate down.  As written here, the last peer to set the 
speed overrides the setting on the peer, and even that is not atomic if they 
race.

> +
> + sndev->link_speed = min(self_speed, peer_speed);
> + sndev->link_width = min(self_width, peer_width);
> +}
> +
> +enum {
> + LINK_MESSAGE = 0,
> + MSG_LINK_UP = 1,
> + MSG_LINK_DOWN = 2,
> + MSG_CHECK_LINK = 3,
> +};
> +
> +static void switchtec_ntb_check_link(struct switchtec_ntb *sndev)
> +{
> + int link_sta;
> + int old = sndev->link_is_up;
> +
> + link_sta = sndev->self_shared->link_sta;
> + if (link_sta) {
> + u64 peer = ioread64(&sndev->peer_shared->magic);
> +
> + if ((peer & 0x) == SWITCHTEC_NTB_MAGIC)
> + link_sta = peer >> 32;
> + else
> + link_sta = 0;
> + }
> +
> + sndev->link_is_up = link_sta;
> + switchtec_ntb_set_link_speed(sndev);
> +
> + if (link_sta != old) {
> + switchtec_ntb_send_msg(sndev, LINK_MESSAGE, MSG_CHECK_LINK);
> + ntb_link_event(&sndev->ntb);
> + dev_info(&sndev->stdev->dev, "ntb link %s",
> +   

RE: [PATCH 14/16] switchtec_ntb: implement scratchpad registers

2017-06-29 Thread Allen Hubbe
From: Logan Gunthorpe
> Seeing there is no dedicated hardware for this, we simply add
> these as entries in the shared memory window. Thus, we could support
> any number of them but 128 seems like enough, for now.

Scratchpads are not natively supported by this hardware,
 - but the hardware does natively support more than two ntb ports,
 - but this software substitute for scratchpads looks like it only works with 
two ntb ports:

> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;

This could get in the way of letting the driver support more than two ports 
later on.  Is there already a plan to change this to support more than two 
ports?

This is also not the only hardware to lack scratchpads, but does have memory 
windows.  Wouldn't it be better for a software substitute like this to be done 
in a way that it is not tied to a specific hardware driver?
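
As a rough sketch of what a hardware-independent version could look like
(user space, hypothetical sketch_* names, plain memory standing in for a
mapped memory window): the helpers take only a pointer to the shared region,
so nothing in them is specific to one driver.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_SPAD_COUNT 128

/* Hypothetical layout: scratchpads live in a region shared with a peer. */
struct sketch_shared {
	uint32_t spad[SKETCH_SPAD_COUNT];
};

/* Write one emulated scratchpad; -EIO if the region is not mapped yet. */
static int sketch_spad_write(struct sketch_shared *sh, int idx, uint32_t val)
{
	if (!sh)
		return -EIO;
	if (idx < 0 || idx >= SKETCH_SPAD_COUNT)
		return -EINVAL;

	sh->spad[idx] = val;
	return 0;
}

/* Read one emulated scratchpad; reads of an invalid index return 0. */
static uint32_t sketch_spad_read(const struct sketch_shared *sh, int idx)
{
	if (!sh || idx < 0 || idx >= SKETCH_SPAD_COUNT)
		return 0;

	return sh->spad[idx];
}
```

Supporting more than two ports would then be a matter of keeping one such
region per peer and selecting it by peer index, instead of rejecting every
index other than the default.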

> 
> Signed-off-by: Logan Gunthorpe 
> Reviewed-by: Stephen Bates 
> Reviewed-by: Kurt Schwemmer 
> ---
>  drivers/ntb/hw/mscc/switchtec_ntb.c | 75 
> -
>  1 file changed, 73 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ntb/hw/mscc/switchtec_ntb.c b/drivers/ntb/hw/mscc/switchtec_ntb.c
> index 980acf2e5b42..a42e80742b52 100644
> --- a/drivers/ntb/hw/mscc/switchtec_ntb.c
> +++ b/drivers/ntb/hw/mscc/switchtec_ntb.c
> @@ -69,6 +69,7 @@ struct shared_mw {
>   u16 nr_direct_mw;
>   u16 nr_lut_mw;
>   u64 mw_sizes[MAX_MWS];
> + u32 spad[128];
>  };
> 
>  #define MAX_DIRECT_MW ARRAY_SIZE(((struct ntb_ctrl_regs *)(0))->bar_entry)
> @@ -453,22 +454,90 @@ static int switchtec_ntb_peer_db_set(struct ntb_dev 
> *ntb, u64 db_bits)
> 
>  static int switchtec_ntb_spad_count(struct ntb_dev *ntb)
>  {
> - return 0;
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> +
> + return ARRAY_SIZE(sndev->self_shared->spad);
>  }
> 
>  static u32 switchtec_ntb_spad_read(struct ntb_dev *ntb, int idx)
>  {
> - return 0;
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> +
> + if (idx < 0 || idx >= ARRAY_SIZE(sndev->self_shared->spad))
> + return 0;
> +
> + if (!sndev->self_shared)
> + return 0;
> +
> + return sndev->self_shared->spad[idx];
>  }
> 
>  static int switchtec_ntb_spad_write(struct ntb_dev *ntb, int idx, u32 val)
>  {
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> +
> + if (idx < 0 || idx >= ARRAY_SIZE(sndev->self_shared->spad))
> + return -EINVAL;
> +
> + if (!sndev->self_shared)
> + return -EIO;
> +
> + sndev->self_shared->spad[idx] = val;
> +
>   return 0;
>  }
> 
> +static u32 switchtec_ntb_peer_spad_read(struct ntb_dev *ntb, int pidx,
> + int sidx)
> +{
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> +
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
> + if (sidx < 0 || sidx >= ARRAY_SIZE(sndev->peer_shared->spad))
> + return 0;
> +
> + if (!sndev->peer_shared)
> + return 0;
> +
> + return ioread32(&sndev->peer_shared->spad[sidx]);
> +}
> +
>  static int switchtec_ntb_peer_spad_write(struct ntb_dev *ntb, int pidx,
>int sidx, u32 val)
>  {
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> +
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
> + if (sidx < 0 || sidx >= ARRAY_SIZE(sndev->peer_shared->spad))
> + return -EINVAL;
> +
> + if (!sndev->peer_shared)
> + return -EIO;
> +
> + iowrite32(val, &sndev->peer_shared->spad[sidx]);
> +
> + return 0;
> +}
> +
> +static int switchtec_ntb_peer_spad_addr(struct ntb_dev *ntb, int pidx,
> + int sidx, phys_addr_t *spad_addr)
> +{
> + struct switchtec_ntb *sndev = ntb_sndev(ntb);
> + unsigned long offset;
> +
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
> + offset = (unsigned long)&sndev->peer_shared->spad[sidx] -
> + (unsigned long)sndev->stdev->mmio;
> +
> + if (spad_addr)
> + *spad_addr = pci_resource_start(ntb->pdev, 0) + offset;
> +
>   return 0;
>  }
> 
> @@ -494,7 +563,9 @@ static const struct ntb_dev_ops switchtec_ntb_ops = {
>   .spad_count = switchtec_ntb_spad_count,
>   .spad_read  = switchtec_ntb_spad_read,
>   .spad_write = switchtec_ntb_spad_write,
> + .peer_spad_read = switchtec_ntb_peer_spad_read,
>   .peer_spad_write= switchtec_ntb_peer_spad_write,
> + .peer_spad_addr = switchtec_ntb_peer_spad_addr,
>  };
> 
>  static void switchtec_ntb_init_sndev(struct switchtec_ntb *sndev)
> --
> 2.11.0



RE: [PATCH] ntb: use correct mw_count function in ntb_tool and ntb_transport

2017-06-27 Thread Allen Hubbe
From: Logan Gunthorpe
> After converting to the new API, both ntb_tool and ntb_transport are
> using ntb_mw_count to iterate through ntb_peer_mw_get_addr when they
> should be using ntb_peer_mw_count.
> 
> This probably isn't an issue with the Intel and AMD drivers but
> this will matter for any future driver with asymmetric memory window
> counts.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Jon Mason 
> Cc: Dave Jiang 
> Cc: Allen Hubbe 
> Cc: Serge Semin 

Acked-by: Allen Hubbe 
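
A toy illustration of why the loop bound matters once the counts differ
(user space, hypothetical sketch_* names, not the ntb api): the loop that
walks the outbound windows has to be bounded by the outbound count, or it
either skips windows or walks past the end.

```c
#include <stddef.h>

#define SKETCH_MAX_MWS 8

struct sketch_port {
	int inbound_mw_count;	/* windows the peer writes into on this side */
	int outbound_mw_count;	/* windows this side uses to write to the peer */
	unsigned long long outbound_addr[SKETCH_MAX_MWS];
};

/*
 * Collect outbound window addresses into out[], at most max entries.
 * The bound is the outbound count, never the inbound count.
 */
static int sketch_collect_outbound(const struct sketch_port *p,
				   unsigned long long *out, int max)
{
	int n = p->outbound_mw_count < max ? p->outbound_mw_count : max;
	int i;

	for (i = 0; i < n; i++)
		out[i] = p->outbound_addr[i];

	return n;
}
```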

> ---
> 
> Hi Guys,
> 
> I caught this issue while finishing up the switchtec rebase. (Which is
> now working again and I'll send the updated series after doing a bit more
> testing.)
> 
> However, seeing this is a bug in the new api patches I feel it should
> be applied to ntb-next tree before it's merged upstream.
> 
> Thanks,
> 
> Logan
> 
>  drivers/ntb/ntb_transport.c | 2 +-
>  drivers/ntb/test/ntb_tool.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index 9a03c5871efe..b29558ddfe95 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -1059,7 +1059,7 @@ static int ntb_transport_probe(struct ntb_client *self, 
> struct ntb_dev *ndev)
>   int node;
>   int rc, i;
> 
> - mw_count = ntb_mw_count(ndev, PIDX);
> + mw_count = ntb_peer_mw_count(ndev);
> 
>   if (!ndev->ops->mw_set_trans) {
>   dev_err(&ndev->dev, "Inbound MW based NTB API is required\n");
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index f002bf48a08d..a69815c45ce6 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -959,7 +959,7 @@ static int tool_probe(struct ntb_client *self, struct 
> ntb_dev *ntb)
>   tc->ntb = ntb;
>   init_waitqueue_head(&tc->link_wq);
> 
> - tc->mw_count = min(ntb_mw_count(tc->ntb, PIDX), MAX_MWS);
> + tc->mw_count = min(ntb_peer_mw_count(tc->ntb), MAX_MWS);
>   for (i = 0; i < tc->mw_count; i++) {
>   rc = tool_init_mw(tc, i);
>   if (rc)
> --
> 2.11.0



RE: New NTB API Issue

2017-06-23 Thread Allen Hubbe
From: Logan Gunthorpe
> But any translation can be
> programmed by any peer.

That doesn't seem safe.  Even though it can be done as you say, would it not be 
better to have each specific translation under the control of exactly one 
driver?

If drivers can reach across and set the translation of any peer bar, they would 
still need to negotiate among N peers which one sets which other's translation.



RE: New NTB API Issue

2017-06-23 Thread Allen Hubbe
From: Logan Gunthorpe
> Hey,
> 
> Thanks Serge for the detailed explanation. This is pretty much exactly
> as I had thought it should be interpreted. My only problem remains that
> my hardware can't provide ntb_mw_get_align until the port it is asking
> about has configured itself. The easiest way to solve this is to only
> allow access when the link to that port is up. It's not a complicated
> change and would actually simplify ntb_transport and ntb_perf a little.
> 
> Is this ok with you Allen? If it is, I can include a patch in my next
> switchtec submission.

Ok.

> 
> Thanks,
> 
> Logan



RE: New NTB API Issue

2017-06-23 Thread Allen Hubbe
From: Logan Gunthorpe
> On 23/06/17 07:18 AM, Allen Hubbe wrote:
> > By "remote" do you mean the source or destination of a write?
> 
> Look at how these calls are used in ntb_transport and ntb_perf:
> 
> They both call ntb_peer_mw_get_addr to get the size of the BAR. The size
> is sent via spads to the other side. The other side then uses
> ntb_mw_get_align and applies align_size to the received size.
> 
> > Yes, clients should transfer the address and size information to the peer.
> 
> But then they also need to technically transfer the alignment
> information as well. Which neither of the clients do.

The clients haven't been fully ported to the multi-port api yet.  They were 
only minimally changed to call the new api, but so far other than that they 
have only been made to work as they had before.

> > Maybe this is the confusion.  None of these api calls are to reach across
> > to the peer port, as to get the size of the peer's bar.  They are to get
> > information from the local port, or to configure the local port.
> 
> I like the rule that these api calls are not to reach across the port.
> But then API looks very wrong. Why are we calling one peer_mw_get addr
> and the other mw_get_align? And why does mw_get_align have a peer index?

I regret that the term "peer" is used to distinguish the mw api.  Better names 
perhaps should be ntb_outbound_mw_foo, ntb_inbound_mw_foo; or ntb_src_mw_foo, 
ntb_dest_mw_foo.  I like outbound/inbound, although the names are longer, maybe 
out/in would be ok.

> And why does mw_get_align have a peer index?

Good question.  Serge?

For that matter, why do we not also have peer mw idx in the set of parameters.  
Working through the example below, it looks like we are lacking a way to say 
Outbound MW1 on A corresponds with Inbound MW0 on B.  It looks like we can only 
indicate that Port A (not which Outbound MW of Port A) corresponds with Inbound 
MW0 on B.

> > Some snippets of code would help me understand your interpretation of the
> > api semantics more exactly.
> 
> I'm not sure how best to show this in code, but let me try
> describing an example situation:
> 
> Let's say we have the following mws on each side (this is something that
> is possible with Switchtec hardware):
> 
> Host A BARs:
> mwA0: 2MB size, aligned to 4k, size aligned to 4k
> mwA1: 4MB size, aligned to 4k, size aligned to 4k
> mwA2: 64k size, aligned to 64k, size aligned to 64k
> 
> Host B BARs:
> mwB0: 2MB size, aligned to 4k, size aligned to 4k
> mwB1: 64k size, aligned to 64k, size aligned to 64k

If those are BARs, that corresponds to "outbound", writing something to the BAR 
at mwA0.

A more complete picture might be:

Host A BARs (aka "outbound" or "peer" memory windows):
peer_mwA0: resource at 0xA - 0xA0020 (2MB)
peer_mwA1: resource at 0xA1000 - 0xA1040 (4MB)
peer_mwA2: resource at 0xA2000 - 0xa2001 (64k)

Host A MWs (aka "inbound" memory windows):
mwA0: 64k max size, aligned to 64k, size aligned to 64k
mwA1: 2MB max size, aligned to 4k, size aligned to 4k

Host A sees Host B on port index 1


Host B BARs (aka "outbound" or "peer" memory windows):
peer_mwB0: resource at 0xB - 0xB0020 (2MB)
peer_mwB1: resource at 0xB1000 - 0xB1001 (64k)

Host B MWs (aka "inbound" memory windows):
mwB0: 1MB size, aligned to 4k, size aligned to 4k
mwB1: 2MB size, aligned to 4k, size aligned to 4k

Host B sees Host A on port index 4


Outbound memory windows (aka "peer mw") come with a pci resource.  We can get 
the size of the resource, its physical address, and set up outbound 
translation if the hardware has that (IDT).

Inbound memory windows (aka "mw") are only used to set up inbound translation, 
if the hardware has that (Intel, AMD).

To set up end-to-end memory window so that A can write to B, let's use 
peer_mwA1 and mwB0.

A: ntb_peer_mw_get_addr(peer_mwA1) -> base 0xA1000, size 4MB
B: ntb_mw_get_align(port4**, mwB0) -> aligned 4k, aligned 4k, max size 1MB
** Serge: do we need port info here, why?

Side A has a resource size of 4MB, but B only supports inbound translation up 
to 1MB.  Side A can only use the first quarter of the 4MB resource.

Side B needs to allocate memory aligned to 4k (the dma address must be aligned 
to 4k after dma mapping), and a multiple of 4k in size.  B may need to set 
inbound translation so that incoming writes go into this memory.  A may also 
need to set outbound translation.
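
The arithmetic in the example above can be sketched as follows.  This is user
space C with hypothetical sketch_* helpers, and it assumes the alignments are
powers of two; it is not part of the ntb api.

```c
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Nonzero if x is a multiple of align; align must be a power of two. */
static int sketch_is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * Largest window usable end-to-end: bounded by the outbound resource size
 * on the source side and by the inbound max size on the destination side,
 * then trimmed down to a multiple of the destination's size alignment.
 */
static uint64_t sketch_usable_size(uint64_t outbound_size,
				   uint64_t inbound_max,
				   uint64_t size_align)
{
	uint64_t size = outbound_size < inbound_max ?
			outbound_size : inbound_max;

	return size & ~(size_align - 1);
}
```

For peer_mwA1 (4MB resource) and mwB0 (1MB max, 4k size alignment) this gives
1MB, which is the first quarter of the resource on side A.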

A: ntb_peer_mw_set_trans(port1**, peer_mwA1, dma_mem_addr, dma_mem_size)
B: ntb_mw_set_trans(port4**, mwB0, dma_mem_addr, dma_mem_size)
** Serge: do we also need the opposing side MW index here?

** Logan: would those changes to the api suit your n

RE: New NTB API Issue

2017-06-23 Thread Allen Hubbe
From: Logan Gunthorpe
> On 6/22/2017 4:42 PM, Allen Hubbe wrote:
> > From: Logan Gunthorpe
> >> Any thoughts on changing the semantics of mw_get_align so it must be
> >> called with the link up?
> >
> > The intention of these is that these calls return information from the
> > local port.  The calls themselves don't reach across the link to the peer,
> > but the information returned from the local port needs to be communicated
> > for setting up the translation end-to-end.
> 
> Ok, well if it's from the local port, then splitting up mw_get_range
> into peer_mw_get_addr and mw_get_align is confusing because one has the
> peer designation and the other doesn't. And all the clients apply the
> alignments to the remote bar so they'd technically need to transfer them
> across the link somehow.

By "remote" do you mean the source or destination of a write?

Yes, clients should transfer the address and size information to the peer.

> > I would like to understand why this hardware needs link up.  Are there
> > registers on the local port that are only valid after link up?
> 
> We only need the link up if we are trying to find the alignment
> requirements (and max_size) of the peer's bar. In theory, switchtec

Maybe this is the confusion.  None of these api calls are to reach across to 
the peer port, as to get the size of the peer's bar.  They are to get 
information from the local port, or to configure the local port.

Some mw configuration is done at the destination, as for Intel and AMD, and 
some configuration is done on the source side, for IDT.  The local 
configuration of the port on one side could depend on information from the 
remote port on the other side.  For example in IDT, the mw translation 
configured on the source side needs to know destination address information.  
Likewise, if there is any difference in the size of the range that can be 
translated by ports on opposing sides, that needs to be negotiated.

> could have different sizes of bars on both sides of the link and
> different alignment requirements. Though, in practice, this is probably
> unlikely.
> 
> > Can you post snippets of how ntb_mw_get_align and ntb_peer_mw_get_addr
> > might be implemented for the Switchtec?
> 
> Hmm, yes, but let's sort out my confusion on the semantics per above first.

Some snippets of code would help me understand your interpretation of the api 
semantics more exactly.

> Logan



RE: New NTB API Issue

2017-06-22 Thread Allen Hubbe
From: Logan Gunthorpe
> Any thoughts on changing the semantics of mw_get_align so it must be
> called with the link up?

The intention of these is that these calls return information from the local 
port.  The calls themselves don't reach across the link to the peer, but the 
information returned from the local port needs to be communicated for setting 
up the translation end-to-end.

I would like to understand why this hardware needs link up.  Are there 
registers on the local port that are only valid after link up?

Can you post snippets of how ntb_mw_get_align and ntb_peer_mw_get_addr might be 
implemented for the Switchtec?



RE: New NTB API Issue

2017-06-22 Thread Allen Hubbe
From: Logan Gunthorpe
> On 6/22/2017 12:32 PM, Allen Hubbe wrote:
> > From: Logan Gunthorpe
> >> 2) The changes to the Intel and AMD driver for mw_get_align sets
> >> *max_size to the local pci resource size. (Thus making the assumption
> >> that the local is the same as the peer, which is wrong). max_size isn't
> >> actually used for anything so it's not _really_ an issue, but I do think
> >> it's confusing and incorrect. I'd suggest we remove max_size until
> >> something actually needs it, or at least set it to zero in cases where
> >> the hardware doesn't support returning the size of the peer's memory
> >> window (ie. in the Intel and AMD drivers).
> >
> > You're right, and the b2b_split in the Intel driver even makes use of
> > different primary/secondary bar sizes. For Intel and AMD, it would make
> > more sense to use the secondary bar size here.  The size of the secondary
> > bar is still not necessarily valid end-to-end, because in b2b the peer's
> > primary bar size could be even smaller.
> >
> > I'm not entirely convinced that this should represent the end-to-end size
> > of local and peer memory window configurations.  I think it should
> > represent the largest size that would be valid to pass to
> > ntb_mw_set_trans().  Then, the peers should communicate their respective
> > max sizes (along with translation addresses, etc) before setting up the
> > translations, and that exchange will ensure that the size finally used is
> > valid end-to-end.
> 
> But why would the client ever need to use the max_size instead of the
> actual size of the bar as retrieved and exchanged from peer_mw_get_addr?

The resource size given by peer_mw_get_addr might be different than the 
max_size given by ntb_mw_get_align.

I am most familiar with the ntb_hw_intel driver and that type of ntb hardware.  
The peer_mw_get_addr size is of the primary bar on the side to be the source of 
the translated writes (or reads).  In b2b topology, at least, the first 
translation of that write lands it on the secondary bar of the peer ntb.  The 
size of that bar could be different than the first.  The second translation lands 
the write in memory (eg).  So, the end-to-end translation is limited by the 
first AND second sizes.

The first point is, the *max_size returned by intel_ntb_mw_get_align looks 
wrong.  That should be the size of the secondary bar, not the resource size of 
the primary bar, of that device.

The second point is, because the sizes returned by peer_mw_get_addr, and 
ntb_mw_get_align, may be different, the two sides should communicate and 
reconcile the address and size information when setting up the translations.
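
The reconciliation described above could be sketched roughly as follows, in plain userspace C rather than driver code.  The struct mw_limits and mw_reconcile_size() names are made up for illustration and are not part of the NTB API.  The sketch assumes each side has already learned its own limits from its local port and exchanged them (for example via scratchpads), and that size alignments are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of what one side learned from its local port. */
struct mw_limits {
	uint64_t size_max;   /* largest size this side can translate */
	uint64_t size_align; /* required size granularity, power of two */
};

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Pick a window size that is valid end-to-end: no larger than either
 * side allows, and a multiple of both size alignments.  Returns 0 if
 * no usable size exists.
 */
static uint64_t mw_reconcile_size(uint64_t want,
				  const struct mw_limits *src,
				  const struct mw_limits *dst)
{
	uint64_t size = want;
	/* With power-of-two alignments, the larger one satisfies both. */
	uint64_t align = src->size_align > dst->size_align ?
			 src->size_align : dst->size_align;

	if (size > src->size_max)
		size = src->size_max;
	if (size > dst->size_max)
		size = dst->size_max;

	return align_down(size, align);
}
```

With a 1M source bar aligned to 4K and a 512K destination limit aligned to 64K, a 1M request would come out as 512K, which is the "limited by the first AND second sizes" case.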



RE: New NTB API Issue

2017-06-22 Thread Allen Hubbe
From: Logan Gunthorpe
> Hey Guys,
> 
> I've run into some subtle issues with the new API:
> 
> It has to do with splitting mw_get_range into mw_get_align and
> peer_mw_get_addr.
> 
> The original mw_get_range returned the size of the /local/ memory
> window's size, address and alignment requirements. The ntb clients then
> take the local size and transmit it via spads to the peer which would
> use it in setting up the memory window. However, it made the assumption
> that the alignment restrictions were symmetric on both hosts seeing they
> were not sent across the link.
> 
> The new API makes a sensible change for this in that mw_get_align
> appears to be intended to return the alignment restrictions (and now
> size) of the peer. This helps a bit for the Switchtec driver but appears
> to be a semantic change that wasn't really reflected in the changes to
> the other NTB code. So, I see a couple of issues:
> 
> 1) With our hardware, we can't actually know anything about the peer's
> memory windows until the peer has finished its setup (ie. the link is
> up). However, all the clients call the function during probe, before the
> link is ready. There's really no good reason for this, so I think we
> should change the clients so that mw_get_align is called only when the
> link is up.
> 
> 2) The changes to the Intel and AMD driver for mw_get_align sets
> *max_size to the local pci resource size. (Thus making the assumption
> that the local is the same as the peer, which is wrong). max_size isn't
> actually used for anything so it's not _really_ an issue, but I do think
> it's confusing and incorrect. I'd suggest we remove max_size until
> something actually needs it, or at least set it to zero in cases where
> the hardware doesn't support returning the size of the peer's memory
> window (ie. in the Intel and AMD drivers).

You're right, and the b2b_split in the Intel driver even makes use of different 
primary/secondary bar sizes. For Intel and AMD, it would make more sense to use 
the secondary bar size here.  The size of the secondary bar is still not 
necessarily valid end-to-end, because in b2b the peer's primary bar size could 
be even smaller.

I'm not entirely convinced that this should represent the end-to-end size of 
local and peer memory window configurations.  I think it should represent the 
largest size that would be valid to pass to ntb_mw_set_trans().  Then, the 
peers should communicate their respective max sizes (along with translation 
addresses, etc) before setting up the translations, and that exchange will 
ensure that the size finally used is valid end-to-end.

> 
> Thoughts?
> 
> Logan



RE: [RFC PATCH 00/13] Switchtec NTB Support

2017-06-16 Thread Allen Hubbe
From: Logan Gunthorpe
> On 16/06/17 09:34 AM, Allen Hubbe wrote:
> > In code review, I really only have found minor nits.  Overall, the driver 
> > looks good.
> 
> Great, thanks for such a quick review!
> 
> > In switchtec_ntb_part_op, there is a delay of up to 50s (1000 * 50ms).
> > This looks like a thread context, so it could involve the scheduler for
> > the delay instead of spinning for up to 50s before bailing.
> 
> Good point. If I were to change this to msleep_interruptible would that
> be acceptable?

I would be satisfied.

> 
> > There are a few instances like this:
> >> +  dev_dbg(&stdev->dev, "%s\n", __func__);
> 
> > Where the printing of __func__ could be controlled by dyndbg=+pf.  The
> > debug message could be more useful.
> 
> Ok, I'll change that.
> 
> > In switchtec_ntb_db_set_mask and friends, an in-memory copy of the mask
> > bits is protected by a spinlock.  Elsewhere, you noted that the db bits
> > are shared between all ports, so the db bitset is chopped up to be shared
> > between the ports.  Is the db mask also shared, and how is the spinlock
> > sufficient for synchronizing access to the mask bits between multiple
> > ports?
> 
> Well, there are 64 doorbells that are shared between ports but each port
> has its own in and out registers for the doorbells. So triggering
> doorbell one on one port's ODB actually triggers it on every port's IDB.
> So these are shared only in the sense that each port needs to know which
> dbs it cares about. Seeing each port has their own registers they don't
> have to worry about synchronization.
> 
> The mask is only protected by a spin lock seeing multiple callers of
> db_set_mask and db_clr_mask on the same port may step on each other's
> toes. So if two processes try to mask different bits they both must get
> masked in the end and therefore some kind of synchronization must be
> involved.

Thanks for clearing that up.  Now I understand, each port has its own 
independent set of mask bits.  So, while the doorbell numbers are assigned 
globally, the registers themselves are per port.  For the mask bits, the mask 
behavior only affects the local port.

> 
> > The IDT switch also does not have hardware scratchpads.  Could the code
> > you wrote for emulated scratchpads be made into shared library code for
> > ntb drivers?  Also, some ntb clients may not need scratchpad support.  If
> > it is not natively supported by a driver, can the emulated scratchpad
> > support be an optional feature?
> 
> Hmm, interesting idea. A few pieces could possibly be made common but it
> depends mostly on hardware having the resources to make use of it.
> Switchtec has extra LUT memory windows that made this possible. Unless
> you object I'm inclined to leave it as is and I'd be happy to work with
> the IDT folks to create a common solution in the future.

Alright.  I'll leave it to you to find and reconcile common functionalities of 
the drivers.  What about making spad emulation optional?

> 
> Logan

There was a comment on irc.oftc.net #ntb wishing for patch v2 to be fewer 
patches.  Something like, 1/2: prep the existing switch driver, 2/2: introduce 
the ntb driver.



RE: [RFC PATCH 00/13] Switchtec NTB Support

2017-06-16 Thread Allen Hubbe
From: Logan Gunthorpe
> On 16/06/17 07:53 AM, Allen Hubbe wrote:
> > See what is staged in https://github.com/jonmason/ntb.git ntb-next, with
> > the addition of multi-peer support by Serge.  It would be good at this
> > stage to understand whether the api changes there would also support the
> > Switchtec driver, and what if anything must change, or be planned to
> > change, to support the Switchtec driver.
> 
> Ah, yes I had seen that patchset some time ago but I wasn't aware of
> its status or that it was queued up in ntb-next. I think it will be no
> problem to reconcile with the switchtec driver and I'll rebase onto
> ntb-next for the next posting of the patch set. However, I *may* save
> full multi-host switchtec support for a follow up submission. My initial
> impression is the new API will support the switchtec hardware well.

Alright!

In code review, I really only have found minor nits.  Overall, the driver looks 
good.

In switchtec_ntb_part_op, there is a delay of up to 50s (1000 * 50ms).  This 
looks like a thread context, so it could involve the scheduler for the delay 
instead of spinning for up to 50s before bailing.
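
As a rough userspace illustration of the difference (not the driver code itself), a polling loop that sleeps between status checks might look like the sketch below.  The names poll_sleeping() and fake_op_done() are invented here, and nanosleep() only stands in for a kernel sleep such as msleep_interruptible(); the point is just that the waiting thread gives up the CPU between polls instead of busy-waiting.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Sleep for ms milliseconds, yielding the CPU instead of spinning. */
static void sleep_ms(long ms)
{
	struct timespec ts;

	assert(ms >= 0);
	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/*
 * Poll done(ctx) up to max_tries times, sleeping delay_ms between
 * polls.  Returns 0 once done() reports completion, -1 on timeout.
 */
static int poll_sleeping(bool (*done)(void *ctx), void *ctx,
			 int max_tries, long delay_ms)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if (done(ctx))
			return 0;
		sleep_ms(delay_ms);
	}
	return -1;
}

/* Stand-in for a hardware status read: completes after *ctx polls. */
static bool fake_op_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

The worst-case wait is still max_tries * delay_ms, but other work can be scheduled during that time.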

There are a few instances like this:
> + dev_dbg(&stdev->dev, "%s\n", __func__);

Where the printing of __func__ could be controlled by dyndbg=+pf.  The debug 
message could be more useful.

In switchtec_ntb_db_set_mask and friends, an in-memory copy of the mask bits is 
protected by a spinlock.  Elsewhere, you noted that the db bits are shared 
between all ports, so the db bitset is chopped up to be shared between the 
ports.  Is the db mask also shared, and how is the spinlock sufficient for 
synchronizing access to the mask bits between multiple ports?

The IDT switch also does not have hardware scratchpads.  Could the code you 
wrote for emulated scratchpads be made into shared library code for ntb 
drivers?  Also, some ntb clients may not need scratchpad support.  If it is not 
natively supported by a driver, can the emulated scratchpad support be an 
optional feature?

> 
> Thanks,
> 
> Logan



RE: [RFC PATCH 00/13] Switchtec NTB Support

2017-06-16 Thread Allen Hubbe
From: Logan Gunthorpe
> Hi,
> 
> This patchset implements Non-Transparent Bridge (NTB) support for the
> Microsemi Switchtec series of switches. We're looking for some
> review from the community at this point but hope to get it upstreamed
> for v4.14.
> 
> Switchtec NTB support is configured over the same function and bar
> as the management endpoint. Thus, the new driver hooks into the
> management driver which we had merged in v4.12. We use the class
> interface API to register an NTB device for every switchtec device
> which supports NTB (not all do).
> 
> The Switchtec hardware supports doorbells, memory windows and messages.
> Seeing there is no native scratchpad support, 128 spads are emulated
> through the use of a pre-setup memory window. The switch has 64
> doorbells which are shared between the two partitions and a
> configurable set of memory windows. While the hardware supports more
> than 2 partitions, this driver only supports the first two seeing
> the current NTB API only supports two hosts.

See what is staged in https://github.com/jonmason/ntb.git ntb-next, with the 
addition of multi-peer support by Serge.  It would be good at this stage to 
understand whether the api changes there would also support the Switchtec 
driver, and what if anything must change, or be planned to change, to support 
the Switchtec driver.

Thanks for providing the patch set for the Switchtec driver.  My first 
impression is that it is a good patch set.  To be included, though, it needs to 
be reconciled with the api changes in ntb-next.  I will follow up with a more 
detailed review of patches in this series, but I am sending this now as I don't 
want to delay your review of ntb-next.

> 
> The driver has been tested with ntb_netdev and fully passes the
> ntb_test script.
> 
> This patchset is based off of v4.12-rc5 and can be found in this
> git repo:
> 
> https://github.com/sbates130272/linux-p2pmem.git switchtec_ntb
> 
> Thanks,
> 
> Logan
> 
> 
> Logan Gunthorpe (13):
>   switchtec: move structure definitions into a common header
>   switchtec: export class symbol for use in upper layer driver
>   switchtec: add ntb hardware register definitions
>   switchtec: add link event notifier block
>   switchtec_ntb: introduce initial ntb driver
>   switchtec_ntb: initialize hardware for memory windows
>   switchtec_ntb: initialize hardware for doorbells and messages
>   switchtec_ntb: add skeleton ntb driver
>   switchtec_ntb: add link management
>   switchtec_ntb: implement doorbell registers
>   switchtec_ntb: implement scratchpad registers
>   switchtec_ntb: add memory window support
>   switchtec_ntb: update switchtec documentation with notes for ntb
> 
>  Documentation/switchtec.txt |   12 +
>  MAINTAINERS |2 +
>  drivers/ntb/hw/Kconfig  |1 +
>  drivers/ntb/hw/Makefile |1 +
>  drivers/ntb/hw/mscc/Kconfig |9 +
>  drivers/ntb/hw/mscc/Makefile|1 +
>  drivers/ntb/hw/mscc/switchtec_ntb.c | 1144 +++
>  drivers/pci/switch/switchtec.c  |  319 ++
>  include/linux/ntb.h |3 +
>  include/linux/switchtec.h   |  365 +++
>  10 files changed, 1601 insertions(+), 256 deletions(-)
>  create mode 100644 drivers/ntb/hw/mscc/Kconfig
>  create mode 100644 drivers/ntb/hw/mscc/Makefile
>  create mode 100644 drivers/ntb/hw/mscc/switchtec_ntb.c
>  create mode 100644 include/linux/switchtec.h
> 
> --
> 2.11.0



RE: [BUG] ntb: Sleep in interrupt handling

2017-06-01 Thread Allen Hubbe
From: Jia-Ju Bai 
> According to ntb_transport.c, the driver may sleep in interrupt handling.
> The function call path is:
> ntb_transport_rxc_db (tasklet_init indicates it handles interrupt)
>ntb_process_rxc
>  ntb_async_rx
>ntb_async_rx_submit
>  schedule_timeout --> may sleep
> 
> This bug is found by my static analysis tool and my code review.
> I hope to fix it, but I do not have a good solution.

Thanks! There is a recovery path if ntb_async_tx_submit fails.  It will do the 
transmission with memcpy instead of dma.  So, rather than retry in 
ntb_async_tx_submit, just fail to the recovery path.  Basically, replace the 
whole for(retries) loop with just txd = prep();  Would you like to work on the 
patch?
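
To make the suggested shape concrete, here is a small userspace model of "try the DMA engine once, otherwise copy with the CPU".  It is only a sketch: dma_copy_fn, copy_with_fallback() and the two fake engines are hypothetical names, not ntb_transport or dmaengine functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: returns 0 if the DMA engine accepted the copy. */
typedef int (*dma_copy_fn)(void *dst, const void *src, size_t len);

enum copy_path { COPY_DMA, COPY_CPU };

/*
 * Try the DMA engine exactly once.  If it cannot take the request,
 * do not sleep and retry; fall back to a CPU copy instead.
 */
static enum copy_path copy_with_fallback(dma_copy_fn dma, void *dst,
					 const void *src, size_t len)
{
	assert(dst && src);

	if (dma && dma(dst, src, len) == 0)
		return COPY_DMA;

	memcpy(dst, src, len);
	return COPY_CPU;
}

/* Stand-in for an engine with no free descriptors. */
static int dma_busy(void *dst, const void *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return -1;
}

/* Stand-in for an engine that accepts and completes the copy. */
static int dma_ok(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}
```

Because nothing in this path waits, it would be safe to call from a context that must not sleep, which is the property the report is asking for.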

> 
> Thanks,
> Jia-Ju Bai




RE: [PATCH v3] NTB: Add IDT 89HPESxNTx PCIe-switches support

2017-02-24 Thread Allen Hubbe
From: Serge Semin 
> IDT 89HPESxNTx device series is PCIe-switches, which support
...
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

With minor comments.  Please include my Ack if you send v4.


> +static u32 idt_nt_read(struct idt_ntb_dev *ndev, const unsigned int reg)
> +{
> + /*
> +  * It's obvious bug to request a register exceeding the maximum possible
> +  * value as well as to have it unaligned.
> +  */
> + WARN_ON(reg > IDT_REG_PCI_MAX || !IS_ALIGNED(reg, IDT_REG_ALIGN));

if (WARN) return ~0?


> +/*
> + * idt_sw_write() - Global registers read method

Doc does not match fn:

> +static u32 idt_sw_read(struct idt_ntb_dev *ndev, const unsigned int reg)
> +{
> + unsigned long irqflags;
> + u32 data;
> +
> + /*
> +  * It's obvious bug to request a register exceeding the maximum possible
> +  * value as well as to have it unaligned.
> +  */
> + WARN_ON(reg > IDT_REG_SW_MAX || !IS_ALIGNED(reg, IDT_REG_ALIGN));

if (WARN) return ~0?

> +
> + /* Lock GASA registers operations */
> + spin_lock_irqsave(&ndev->gasa_lock, irqflags);
> + /* Set the global register address */
> + writel((u32)reg, ndev->cfgspc + (ptrdiff_t)IDT_NT_GASAADDR);

I wonder how that HW will behave if we instruct it to do the general-purpose 
read out-of-bounds.  Better not.

> + /* Get the data of the register */
> + data = readl(ndev->cfgspc + (ptrdiff_t)IDT_NT_GASADATA);
> + /* Unlock GASA registers operations */
> + spin_unlock_irqrestore(&ndev->gasa_lock, irqflags);
> +
> + return data;
> +}


> +static inline int idt_reg_clear_bits(struct idt_ntb_dev *ndev,
> +  unsigned int reg, spinlock_t *reg_lock,
> +  u64 valid_mask, u64 clear_bits)
> +{
> + unsigned long irqflags;
> + u32 data;
> +
> + if (clear_bits & ~(u64)valid_mask)
> + return -EINVAL;

This check also exists in the Intel driver (blame me).

I've wondered: what if we just pretend any non-valid bits are always "cleared."  
If they are cleared again, just silently allow it (they stay cleared).  Only 
have this check against attempting to "set" invalid bits.

In Logan's ntb self-test with ntb_tool, it is convenient at the start of the 
test to "clear all the bits" of the doorbell registers.  This would be 
simpler if the script would just say "clear the bits ~0" instead of "tell me the 
valid bits and clear those."  Currently ntb_tool doesn't report the valid bits, 
so the valid bitset is a runtime parameter passed to the script (yuck).

Set the bits: are they valid?  ok.
Clear the bits: ok!
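
Spelled out as a userspace sketch (the struct and function names below are made up for illustration, and the register is modelled as a plain variable), the proposed semantics would be something like:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of one register and the bits it implements. */
struct bit_reg {
	uint64_t valid; /* bits the hardware implements */
	uint64_t value; /* current register value */
};

/* Setting a bit the hardware does not implement is an error. */
static int reg_set_bits(struct bit_reg *r, uint64_t bits)
{
	if (bits & ~r->valid)
		return -EINVAL;
	r->value |= bits;
	return 0;
}

/*
 * Invalid bits are treated as always cleared, so clearing them again
 * is silently allowed.  This lets a caller pass ~0 to clear everything.
 */
static int reg_clear_bits(struct bit_reg *r, uint64_t bits)
{
	r->value &= ~bits;
	return 0;
}
```

A test script could then start with reg_clear_bits(r, ~0) without first needing to know the valid bitset.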


> + /* It's useless to have this driver loaded if there is no any peer */
> + if (ndev->peer_cnt == 0) {
> + dev_err_pci(ndev, "No active peer found\n");
> + return -EINVAL;
> + }

Maybe it would be useful for development or debugging purposes?


> +static bool idt_ntb_local_link_is_up(struct idt_ntb_dev *ndev)
> +{
...
> + if (!(data & IDT_NTMTBLDATA_VALID))
> + return false;
> +
> + /* Local NTB link is enabled if got here */
> + return true;

Unnecessary branching logic.  Just:

return !!(data & IDT_NTMTBLDATA_VALID);

> +static bool idt_ntb_peer_link_is_up(struct idt_ntb_dev *ndev, int pidx)
> +{
...
> + if (!(data & IDT_NTMTBLDATA_VALID))
> + return false;
> +
> + /* Peer NTB link is enabled if got here */
> + return true;

return !!(data & IDT_NTMTBLDATA_VALID);




RE: [PATCH v2] NTB: Add IDT 89HPESxNTx PCIe-switches support

2017-02-21 Thread Allen Hubbe
From: Serge Semin
> +/*
> + * idt_nt_write() - PCI configuration space registers write method
> + * @ndev:IDT NTB hardware driver descriptor
> + * @reg: Register to write data to
> + * @data:Value to write to the register
> + *
> + * WARNING! IDT PCIe-switch registers are all Little endian. So corresponding
> + *   writel operations must have embedded endiannes conversion. If local
> + *   platform doesn't have it, the driver won't properly work.
> + */
> +static void idt_nt_write(struct idt_ntb_dev *ndev,
> +  const unsigned int reg, const u32 data)
> +{
> + /*
> +  * It's obvious bug to request a register exceeding the maximum possible
> +  * value as well as to have it unaligned.
> +  */
> + WARN_ON(reg > IDT_REG_PCI_MAX || !IS_ALIGNED(reg, IDT_REG_ALIGN));

If we perform the write anyway, I guess the effect of the write is unknown?

What about:
if (WARN_ON(stuff))
return;
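
For illustration only, here is a userspace model of that pattern applied to both accessors.  REG_MAX, REG_ALIGN and the regs[] array are invented stand-ins for the device register space, and fprintf() stands in for the kernel warning; none of this is the IDT driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define REG_MAX   0x0cu /* hypothetical last valid register offset */
#define REG_ALIGN 4u    /* registers are 32-bit aligned */

/* Stand-in for the device register space. */
static uint32_t regs[REG_MAX / REG_ALIGN + 1];

/* Warn and report whether the offset is out of range or unaligned. */
static int reg_is_bad(unsigned int reg)
{
	int bad = reg > REG_MAX || reg % REG_ALIGN != 0;

	if (bad)
		fprintf(stderr, "bad register offset 0x%x\n", reg);
	return bad;
}

/* A bad read touches nothing and returns all ones. */
static uint32_t reg_read(unsigned int reg)
{
	if (reg_is_bad(reg))
		return UINT32_MAX;
	return regs[reg / REG_ALIGN];
}

/* A bad write is dropped instead of landing at an unknown place. */
static void reg_write(unsigned int reg, uint32_t data)
{
	if (reg_is_bad(reg))
		return;
	regs[reg / REG_ALIGN] = data;
}
```

The caller still gets a loud warning, but the access itself never reaches the hardware.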


> +/*
> + * idt_reg_set_bits() - set bits of a passed register
> + * @ndev:IDT NTB hardware driver descriptor
> + * @reg: Register to change bits of
> + * @valid_mask:  Mask of valid bits
> + * @set_bits:Bitmask to set
> + *
> + * Helper method to check whether a passed bitfield is valid and set
> + * corresponding bits of a register.
> + *
> + * Return: zero on success, negative error on invalid bitmask.
> + */
> +static inline int idt_reg_set_bits(struct idt_ntb_dev *ndev, unsigned int reg,
> +u64 valid_mask, u64 set_bits)
> +{
> + u32 data;
> +
> + if (set_bits & ~(u64)valid_mask)
> + return -EINVAL;
> +
> + data = idt_nt_read(ndev, reg) | (u32)set_bits;
> + idt_nt_write(ndev, IDT_NT_INDBELLMSK, data);

Following this function call via idt_ntb_db_set_mask(), it does not appear that 
the register update is atomic here.  Two threads could read the same old 
register value and modify it differently, and the second write back to the 
register would clobber the first.

In the ntb_hw_intel driver there is similar code for setting and clearing bits 
of the doorbell mask.  Instead of reading the register, modifying it, and 
writing it back, the current value of the register is stored in memory.  With a 
spin lock held, the value is updated in memory and then written to the 
register.  That makes the update atomic, because the spin lock is held through 
the update and issuing the write, and it saves a trip to read the register.
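
A minimal userspace sketch of that cached-mask approach is below.  It is not the ntb_hw_intel code: struct db_dev and its fields are hypothetical, the mask register is modelled as a plain variable, and a C11 atomic_flag stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct db_dev {
	atomic_flag lock;    /* stand-in for a kernel spinlock_t */
	uint64_t valid_mask; /* doorbell bits the device implements */
	uint64_t mask_cache; /* in-memory copy of the mask register */
	uint64_t mask_reg;   /* stand-in for the MMIO mask register */
};

static void db_lock(struct db_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock,
						 memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void db_unlock(struct db_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Update the cached value and write it out while holding the lock. */
static int db_set_mask(struct db_dev *d, uint64_t bits)
{
	if (bits & ~d->valid_mask)
		return -1;

	db_lock(d);
	d->mask_cache |= bits;
	d->mask_reg = d->mask_cache; /* one write, no register read */
	db_unlock(d);
	return 0;
}

static int db_clear_mask(struct db_dev *d, uint64_t bits)
{
	if (bits & ~d->valid_mask)
		return -1;

	db_lock(d);
	d->mask_cache &= ~bits;
	d->mask_reg = d->mask_cache;
	db_unlock(d);
	return 0;
}
```

Two callers masking different bits both end up reflected in the register, because each one modifies the shared cached value and issues the write under the same lock.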


> +/*
> + * idt_get_mw_type() - get memory window size

Doc doesn't match the function name.

> + * @mw_type: Memory window type
> + *
> + * Return: number of memory windows corresponding to the type

This is more like a "count" than a "size".

> + */
> +static inline unsigned char idt_get_mw_size(enum idt_mw_type mw_type)
> +{


> +/*
> + * idt_ntb_db_set_mask() - set bits in the local doorbell mask
> + *  (NTB API callback)
> + * @ntb: NTB device context.
> + * @db_bits: Doorbell mask bits to set.
> + *
> + * The inbound doorbell register mask value must be read, then OR'ed with
> + * passed field and only then set back.
> + *
> + * Return: zero on success, negative error if invalid argument passed.
> + */
> +static int idt_ntb_db_set_mask(struct ntb_dev *ntb, u64 db_bits)
> +{
> + struct idt_ntb_dev *ndev = to_ndev_ntb(ntb);
> +
> + return idt_reg_set_bits(ndev, IDT_NT_INDBELLMSK, IDT_DBELL_MASK,
> + db_bits);

As noted above, this does not appear to be atomic.



RE: [PATCH] NTB: Add IDT 89HPESxNTx PCIe-switches support

2017-02-02 Thread Allen Hubbe
From: Serge Semin
> +static void idt_nt_write(struct idt_ntb_dev *ndev,
> +  const unsigned int reg, const u32 data)
> +{
> + /*
> +  * It's obvious bug to request a register exceeding the maximum possible
> +  * value as well as to have it unaligned.
> +  */
> + BUG_ON(reg > IDT_REG_PCI_MAX || !IS_ALIGNED(reg, IDT_REG_ALIGN));

Avoid BUG_ON.  Just warn and do nothing (at least, do nothing destructive) 
instead of crashing the system.  Here, and throughout the driver.

> +#define to_dev_ndev(ndev) (&((ndev)->ntb.dev))
> +#define to_pci_ndev(ndev) ((ndev)->ntb.pdev)

See Logan's recent patches in "Style fixes: open code obfuscating macros."



RE: [PATCH v3 9/9] NTB: Add ntb.h comments

2016-12-13 Thread Allen Hubbe
From: Serge Semin
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> ---
>  include/linux/ntb.h | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 6d46179..dab0a1b 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -326,12 +326,17 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + /* Port operations are required for multiport devices */
>   !ops->peer_port_count == !ops->port_number  &&
>   !ops->peer_port_number == !ops->port_number &&
>   !ops->peer_port_idx == !ops->port_number&&
> +
> + /* Link operations are required */
>   ops->link_is_up &&
>   ops->link_enable&&
>   ops->link_disable   &&
> +
> + /* One or both MW interfaces should be developed */
>   ops->mw_count   &&
>   ops->mw_get_align   &&
>   (ops->mw_set_trans  ||
> @@ -341,12 +346,11 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   ops->peer_mw_get_addr   &&
>   /* ops->peer_mw_clear_trans && */
> 
> + /* Doorbell operations are mostly required */
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> -
>   /* both set, or both unset */
> - (!ops->db_vector_count == !ops->db_vector_mask) &&
> -
> + (!ops->db_vector_count == !ops->db_vector_mask) &&
>   ops->db_read&&
>   /* ops->db_set  && */
>   ops->db_clear   &&
> @@ -360,6 +364,8 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* ops->peer_db_read_mask   && */
>   /* ops->peer_db_set_mask&& */
>   /* ops->peer_db_clear_mask  && */
> +
> + /* Scrachpads interface is optional */
>   /* !ops->spad_is_unsafe == !ops->spad_count && */
>   !ops->spad_read == !ops->spad_count &&
>   !ops->spad_write == !ops->spad_count&&
> @@ -367,6 +373,7 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* !ops->peer_spad_read == !ops->spad_count && */
>   !ops->peer_spad_write == !ops->spad_count   &&
> 
> + /* Messaging interface is optional */
>   !ops->msg_inbits == !ops->msg_count &&
>   !ops->msg_outbits == !ops->msg_count&&
>   !ops->msg_read_sts == !ops->msg_count   &&
> @@ -387,13 +394,12 @@ struct ntb_client {
>   struct device_driverdrv;
>   const struct ntb_client_ops ops;
>  };
> -
>  #define drv_ntb_client(__drv) container_of((__drv), struct ntb_client, drv)
> 
>  /**
>   * struct ntb_device - ntb device
>   * @dev: Linux device object.
> - * @pdev:Pci device entry of the ntb.
> + * @pdev:PCI device entry of the ntb.
>   * @topo:Detected topology of the ntb.
>   * @ops: See &ntb_dev_ops.
>   * @ctx: See &ntb_ctx_ops.
> @@ -414,7 +420,6 @@ struct ntb_dev {
>   /* block unregister until device is fully released */
>   struct completion   released;
>  };
> -
>  #define dev_ntb(__dev) container_of((__dev), struct ntb_dev, dev)
> 
>  /**
> @@ -511,7 +516,7 @@ void ntb_link_event(struct ntb_dev *ntb);
>   * multiple interrupt vectors for doorbells, the vector number indicates which
>   * vector received the interrupt.  The vector number is relative to the first
>   * vector used for doorbells, starting at zero, and must be less than
> - ** ntb_db_vector_count().  The driver may call ntb_db_read() to check which
> + * ntb_db_vector_count().  The driver may call ntb_db_read() to check which
>   * doorbell bits need service, and ntb_db_vector_mask() to determine which of
>   * those bits are associated with the vector number.
>   */
> --
> 2.6.6




RE: [PATCH v3 9/9] NTB: Add ntb.h comments

2016-12-13 Thread Allen Hubbe
From: Serge Semin
> Signed-off-by: Serge Semin 
> ---
>  include/linux/ntb.h | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 6d46179..dab0a1b 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -326,12 +326,17 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + /* Port operations are required */

... for multiport devices.

>   !ops->peer_port_count == !ops->port_number  &&
>   !ops->peer_port_number == !ops->port_number &&
>   !ops->peer_port_idx == !ops->port_number&&
> +
> + /* Link operations are requiered */
>   ops->link_is_up &&
>   ops->link_enable&&
>   ops->link_disable   &&
> +
> + /* One or both MW interfaces should be developed */
>   ops->mw_count   &&
>   ops->mw_get_align   &&
>   (ops->mw_set_trans  ||
> @@ -341,12 +346,11 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   ops->peer_mw_get_addr   &&
>   /* ops->peer_mw_clear_trans && */
> 
> + /* Doorbell operations are mostly required */
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> -
>   /* both set, or both unset */
> - (!ops->db_vector_count == !ops->db_vector_mask) &&
> -
> + (!ops->db_vector_count == !ops->db_vector_mask) &&
>   ops->db_read&&
>   /* ops->db_set  && */
>   ops->db_clear   &&
> @@ -360,6 +364,8 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* ops->peer_db_read_mask   && */
>   /* ops->peer_db_set_mask&& */
>   /* ops->peer_db_clear_mask  && */
> +
> + /* Scrachpads interface is optional */
>   /* !ops->spad_is_unsafe == !ops->spad_count && */
>   !ops->spad_read == !ops->spad_count &&
>   !ops->spad_write == !ops->spad_count&&
> @@ -367,6 +373,7 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* !ops->peer_spad_read == !ops->spad_count && */
>   !ops->peer_spad_write == !ops->spad_count   &&
> 
> + /* Messaging interface is optional */
>   !ops->msg_inbits == !ops->msg_count &&
>   !ops->msg_outbits == !ops->msg_count&&
>   !ops->msg_read_sts == !ops->msg_count   &&
> @@ -387,13 +394,12 @@ struct ntb_client {
>   struct device_driverdrv;
>   const struct ntb_client_ops ops;
>  };
> -
>  #define drv_ntb_client(__drv) container_of((__drv), struct ntb_client, drv)
> 
>  /**
>   * struct ntb_device - ntb device
>   * @dev: Linux device object.
> - * @pdev:Pci device entry of the ntb.
> + * @pdev:PCI device entry of the ntb.
>   * @topo:Detected topology of the ntb.
>   * @ops: See &ntb_dev_ops.
>   * @ctx: See &ntb_ctx_ops.
> @@ -414,7 +420,6 @@ struct ntb_dev {
>   /* block unregister until device is fully released */
>   struct completion   released;
>  };
> -
>  #define dev_ntb(__dev) container_of((__dev), struct ntb_dev, dev)
> 
>  /**
> @@ -511,7 +516,7 @@ void ntb_link_event(struct ntb_dev *ntb);
>   * multiple interrupt vectors for doorbells, the vector number indicates which
>   * vector received the interrupt.  The vector number is relative to the first
>   * vector used for doorbells, starting at zero, and must be less than
> - ** ntb_db_vector_count().  The driver may call ntb_db_read() to check which
> + * ntb_db_vector_count().  The driver may call ntb_db_read() to check which
>   * doorbell bits need service, and ntb_db_vector_mask() to determine which of
>   * those bits are associated with the vector number.
>   */
> --
> 2.6.6




RE: [PATCH v3 5/9] NTB: Alter Scratchpads API to support multi-ports devices

2016-12-13 Thread Allen Hubbe
From: Serge Semin
> Even though there is no any real NTB hardware, which would have both more
> than two ports and Scratchpad registers, it is logically correct to have
> Scratchpad API accepting a peer port index as well. Intel/AMD drivers utilize
> Primary and Secondary topology to split Scratchpad between connected root
> devices. Since port-index API introduced, Intel/AMD NTB hardware drivers can
> use device port to determine which Scratchpad registers actually belong to
> local and peer devices. The same approach can be used if some potential
> hardware in future will be multi-port and have some set of Scratchpads.
> Here are the brief of changes in the API:
>  ntb_spad_count() - return number of Scratchpads per each port
>  ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the
> peer device with pidx-index
>  ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the
> peer with pidx-index
>  ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the
> peer with pidx-index
> 
> Since there is hardware which doesn't support Scratchpad registers, the
> corresponding API methods are now made optional.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c | 14 +++
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 14 +++
>  drivers/ntb/ntb_transport.c | 17 -
>  drivers/ntb/test/ntb_perf.c |  6 +--
>  drivers/ntb/test/ntb_pingpong.c |  8 +++-
>  drivers/ntb/test/ntb_tool.c | 21 --
>  include/linux/ntb.h | 76 +++--
>  7 files changed, 98 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 6a41c38..bc537aa 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -433,30 +433,30 @@ static int amd_ntb_spad_write(struct ntb_dev *ntb,
>   return 0;
>  }
> 
> -static u32 amd_ntb_peer_spad_read(struct ntb_dev *ntb, int idx)
> +static u32 amd_ntb_peer_spad_read(struct ntb_dev *ntb, int pidx, int sidx)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   void __iomem *mmio = ndev->self_mmio;
>   u32 offset;
> 
> - if (idx < 0 || idx >= ndev->spad_count)
> + if (sidx < 0 || sidx >= ndev->spad_count)
>   return -EINVAL;
> 
> - offset = ndev->peer_spad + (idx << 2);
> + offset = ndev->peer_spad + (sidx << 2);
>   return readl(mmio + AMD_SPAD_OFFSET + offset);
>  }
> 
> -static int amd_ntb_peer_spad_write(struct ntb_dev *ntb,
> -int idx, u32 val)
> +static int amd_ntb_peer_spad_write(struct ntb_dev *ntb, int pidx,
> +int sidx, u32 val)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   void __iomem *mmio = ndev->self_mmio;
>   u32 offset;
> 
> - if (idx < 0 || idx >= ndev->spad_count)
> + if (sidx < 0 || sidx >= ndev->spad_count)
>   return -EINVAL;
> 
> - offset = ndev->peer_spad + (idx << 2);
> + offset = ndev->peer_spad + (sidx << 2);
>   writel(val, mmio + AMD_SPAD_OFFSET + offset);
> 
>   return 0;
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 4b84012..7bb14cb 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -1409,30 +1409,30 @@ static int intel_ntb_spad_write(struct ntb_dev *ntb,
>  ndev->self_reg->spad);
>  }
> 
> -static int intel_ntb_peer_spad_addr(struct ntb_dev *ntb, int idx,
> +static int intel_ntb_peer_spad_addr(struct ntb_dev *ntb, int pidx, int sidx,
>   phys_addr_t *spad_addr)
>  {
>   struct intel_ntb_dev *ndev = ntb_ndev(ntb);
> 
> - return ndev_spad_addr(ndev, idx, spad_addr, ndev->peer_addr,
> + return ndev_spad_addr(ndev, sidx, spad_addr, ndev->peer_addr,
> ndev->peer_reg->spad);
>  }
> 
> -static u32 intel_ntb_peer_spad_read(struct ntb_dev *ntb, int idx)
> +static u32 intel_ntb_peer_spad_read(struct ntb_dev *ntb, int pidx, int sidx)
>  {
>   struct intel_ntb_dev *ndev = ntb_ndev(ntb);
> 
> - return ndev_spad_read(ndev, idx,
> + return ndev_spad_read(ndev, sidx,
> ndev->peer_mmio +
> ndev->peer_reg->spad);
>  }
> 
> -static int intel_ntb_peer_spad_write(struct ntb_dev *ntb,
> -  int idx, u32 val)
> +static in

RE: [PATCH v3 4/9] NTB: Alter MW API to support multi-ports devices

2016-12-13 Thread Allen Hubbe
From: Serge Semin
> Multi-port NTB devices allow memory to be shared between all accessible peers.
> The Memory Windows API is altered to initialize and map memory windows for
> such devices accordingly:
>  ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated
> for shared buffer with specified peer device.
>  ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters
> to properly allocate inbound memory region.
>  ntb_peer_mw_count(); - get number of outbound memory windows.
>  ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window
> 
> If hardware supports inbound translation configured on the local ntb port:
>  ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound
> memory window so a peer device could access it.
>  ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound
> memory window.
> 
> If hardware supports outbound translation configured on the peer ntb port:
>  ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory
> window retrieved from a peer device
>  ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an
> outbound memory window
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 
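
As an illustration of how a client might consume the three values that
ntb_mw_get_align() reports, here is a small userspace sketch. It is not taken
from the patch: mock_mw_align, mock_round_up, mock_mw_pick_size and
mock_mw_addr_ok are hypothetical names, and the policy (round the request up
to size_align, reject it if it exceeds size_max) is an assumption about
reasonable client behaviour.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the three values the new call reports. */
struct mock_mw_align {
	uint64_t addr_align;	/* required alignment of the buffer address */
	uint64_t size_align;	/* required granularity of the window size */
	uint64_t size_max;	/* largest window the hardware can map */
};

/* Round val up to a multiple of align; align must be non-zero. */
static uint64_t mock_round_up(uint64_t val, uint64_t align)
{
	return ((val + align - 1) / align) * align;
}

/*
 * Pick a window size for a request of 'want' bytes.
 * Returns 0 and stores the size in *out, or -EINVAL if the request is zero,
 * size_align is zero, or the rounded size exceeds size_max.
 */
static int mock_mw_pick_size(const struct mock_mw_align *al, uint64_t want,
			     uint64_t *out)
{
	uint64_t size;

	if (!want || !al->size_align)
		return -EINVAL;

	size = mock_round_up(want, al->size_align);
	if (size > al->size_max)
		return -EINVAL;

	*out = size;
	return 0;
}

/* Returns 1 if addr satisfies the address alignment constraint, else 0. */
static int mock_mw_addr_ok(const struct mock_mw_align *al, uint64_t addr)
{
	return al->addr_align && (addr % al->addr_align) == 0;
}
```

With the values the AMD driver in this patch reports (addr_align of SZ_4K and
size_align of 1), any non-zero size up to size_max passes unchanged and only
the buffer address needs 4K alignment.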

> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c |  68 +---
>  drivers/ntb/hw/intel/ntb_hw_intel.c |  90 
>  drivers/ntb/ntb.c   |   2 +
>  drivers/ntb/ntb_transport.c |  21 +++-
>  drivers/ntb/test/ntb_perf.c |  17 ++-
>  drivers/ntb/test/ntb_tool.c |  43 +---
>  include/linux/ntb.h | 208
>  7 files changed, 342 insertions(+), 107 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 4d8d0bd..6a41c38 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -5,6 +5,7 @@
>   *   GPL LICENSE SUMMARY
>   *
>   *   Copyright (C) 2016 Advanced Micro Devices, Inc. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   This program is free software; you can redistribute it and/or modify
>   *   it under the terms of version 2 of the GNU General Public License as
> @@ -13,6 +14,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright (C) 2016 Advanced Micro Devices, Inc. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
>   *   modification, are permitted provided that the following conditions
> @@ -79,40 +81,42 @@ static int ndev_mw_to_bar(struct amd_ntb_dev *ndev, int idx)
>   return 1 << idx;
>  }
> 
> -static int amd_ntb_mw_count(struct ntb_dev *ntb)
> +static int amd_ntb_mw_count(struct ntb_dev *ntb, int pidx)
>  {
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
>   return ntb_ndev(ntb)->mw_count;
>  }
> 
> -static int amd_ntb_mw_get_range(struct ntb_dev *ntb, int idx,
> - phys_addr_t *base,
> - resource_size_t *size,
> - resource_size_t *align,
> - resource_size_t *align_size)
> +static int amd_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
> + resource_size_t *addr_align,
> + resource_size_t *size_align,
> + resource_size_t *size_max)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   int bar;
> 
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
>   bar = ndev_mw_to_bar(ndev, idx);
>   if (bar < 0)
>   return bar;
> 
> - if (base)
> - *base = pci_resource_start(ndev->ntb.pdev, bar);
> -
> - if (size)
> - *size = pci_resource_len(ndev->ntb.pdev, bar);
> + if (addr_align)
> + *addr_align = SZ_4K;
> 
> - if (align)
> - *align = SZ_4K;
> + if (size_align)
> + *size_align = 1;
> 
> - if (align_size)
> - *align_size = 1;
> + if (size_max)
> + *size_max = pci_resource_len(ndev->ntb.pdev, bar);
> 
>   return 0;
>  }
> 
> -static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int idx,
> +static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
>   dma_addr_t addr, resource_size_t size)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
> @@ -122,6 +126,9 @@ static int amd_ntb_mw_set_trans

RE: [PATCH v3 1/9] NTB: Make link-state API being declared first

2016-12-13 Thread Allen Hubbe
From: Serge Semin 
> Since link operations are usually performed before memory window access
> operations, it's logically better to declare link-related API before any
> of MW/Doorbell/Scratchpad methods.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> ---
>  include/linux/ntb.h | 137 ++--
>  1 file changed, 69 insertions(+), 68 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 6f47562..5d1f260 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -179,13 +179,13 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
> 
>  /**
>   * struct ntb_ctx_ops - ntb device operations
> + * @link_is_up:        See ntb_link_is_up().
> + * @link_enable:       See ntb_link_enable().
> + * @link_disable:      See ntb_link_disable().
>   * @mw_count:          See ntb_mw_count().
>   * @mw_get_range:      See ntb_mw_get_range().
>   * @mw_set_trans:      See ntb_mw_set_trans().
>   * @mw_clear_trans:    See ntb_mw_clear_trans().
> - * @link_is_up:        See ntb_link_is_up().
> - * @link_enable:       See ntb_link_enable().
> - * @link_disable:      See ntb_link_disable().
>   * @db_is_unsafe:      See ntb_db_is_unsafe().
>   * @db_valid_mask:     See ntb_db_valid_mask().
>   * @db_vector_count:   See ntb_db_vector_count().
> @@ -212,6 +212,12 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   * @peer_spad_write: See ntb_peer_spad_write().
>   */
>  struct ntb_dev_ops {
> + int (*link_is_up)(struct ntb_dev *ntb,
> +   enum ntb_speed *speed, enum ntb_width *width);
> + int (*link_enable)(struct ntb_dev *ntb,
> +enum ntb_speed max_speed, enum ntb_width max_width);
> + int (*link_disable)(struct ntb_dev *ntb);
> +
>   int (*mw_count)(struct ntb_dev *ntb);
>   int (*mw_get_range)(struct ntb_dev *ntb, int idx,
>   phys_addr_t *base, resource_size_t *size,
> @@ -220,12 +226,6 @@ struct ntb_dev_ops {
>   dma_addr_t addr, resource_size_t size);
>   int (*mw_clear_trans)(struct ntb_dev *ntb, int idx);
> 
> - int (*link_is_up)(struct ntb_dev *ntb,
> -   enum ntb_speed *speed, enum ntb_width *width);
> - int (*link_enable)(struct ntb_dev *ntb,
> -enum ntb_speed max_speed, enum ntb_width max_width);
> - int (*link_disable)(struct ntb_dev *ntb);
> -
>   int (*db_is_unsafe)(struct ntb_dev *ntb);
>   u64 (*db_valid_mask)(struct ntb_dev *ntb);
>   int (*db_vector_count)(struct ntb_dev *ntb);
> @@ -265,13 +265,14 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + ops->link_is_up         &&
> + ops->link_enable        &&
> + ops->link_disable       &&
>   ops->mw_count           &&
>   ops->mw_get_range       &&
>   ops->mw_set_trans       &&
>   /* ops->mw_clear_trans  && */
> - ops->link_is_up         &&
> - ops->link_enable        &&
> - ops->link_disable       &&
> +
>   /* ops->db_is_unsafe    && */
>   ops->db_valid_mask      &&
> 
> @@ -441,6 +442,62 @@ void ntb_link_event(struct ntb_dev *ntb);
>  void ntb_db_event(struct ntb_dev *ntb, int vector);
> 
>  /**
> + * ntb_link_is_up() - get the current ntb link state
> + * @ntb: NTB device context.
> + * @speed:   OUT - The link speed expressed as PCIe generation number.
> + * @width:   OUT - The link width expressed as the number of PCIe lanes.
> + *
> + * Get the current state of the ntb link.  It is recommended to query the link
> + * state once after every link event.  It is safe to query the link state in
> + * the context of the link event callback.
> + *
> + * Return: One if the link is up, zero if the link is down, otherwise a
> + *   negative value indicating the error number.
> + */
> +static inline int ntb_link_is_up(struct ntb_dev *ntb,
> +  enum ntb_speed *speed, enum ntb_width *width)
> +{
> + return ntb->ops->link_is_up(ntb, speed, width);
> +}
> +
> +/**
> + * ntb_link_enable() - enable the link on the sec

RE: [PATCH v3 2/9] NTB: Add indexed ports NTB API

2016-12-13 Thread Allen Hubbe
From: Serge Semin
> Some NTB hardware can combine more than two domains over NTB. For instance,
> some IDT PCIe switches can have NTB functions activated on more than two
> ports. The different domains are distinguished by the ports they are
> connected to. So the new port-related methods are added to the NTB API:
>  ntb_port_number() - return local port
>  ntb_peer_port_count() - return number of peers local port can connect to
>  ntb_peer_port_number(pidx) - return port number by its index
>  ntb_peer_port_idx(port) - return port index by its number
> 
> Current test drivers aren't changed much. They still support only two-port
> devices for the time being, as long as multi-port hardware drivers aren't added.
> 
> By default the port-related API is declared for two-port hardware,
> so the corresponding hardware drivers won't need to implement it.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 
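
To make the index versus number distinction concrete, here is a hedged
userspace sketch of a hypothetical four-port device. None of it comes from the
patch: mock_ntb, MOCK_NPORTS and the mock_* functions are invented for
illustration, and the rule that peer indices enumerate every port except the
local one is an assumption of this sketch.

```c
#include <errno.h>

#define MOCK_NPORTS 4

/* Hypothetical device: four NTB ports with non-contiguous port numbers. */
struct mock_ntb {
	int ports[MOCK_NPORTS];	/* hardware port numbers */
	int local;		/* position of the local port in ports[] */
};

static int mock_port_number(const struct mock_ntb *ntb)
{
	return ntb->ports[ntb->local];
}

static int mock_peer_port_count(const struct mock_ntb *ntb)
{
	(void)ntb;
	return MOCK_NPORTS - 1;
}

/* Map a peer index (0 .. count-1) to a port number, skipping the local port. */
static int mock_peer_port_number(const struct mock_ntb *ntb, int pidx)
{
	if (pidx < 0 || pidx >= mock_peer_port_count(ntb))
		return -EINVAL;

	return ntb->ports[pidx < ntb->local ? pidx : pidx + 1];
}

/* Inverse lookup: port number to peer index. */
static int mock_peer_port_idx(const struct mock_ntb *ntb, int port)
{
	int pidx;

	for (pidx = 0; pidx < mock_peer_port_count(ntb); pidx++)
		if (mock_peer_port_number(ntb, pidx) == port)
			return pidx;

	return -EINVAL;
}
```

The point of the split is that clients loop over the dense index range while
the port number stays a hardware property. The local port has no peer index,
so looking it up fails.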

> ---
>  drivers/ntb/ntb.c   |  54 ++
>  drivers/ntb/ntb_transport.c |   6 ++
>  drivers/ntb/test/ntb_perf.c |   4 ++
>  drivers/ntb/test/ntb_pingpong.c |   6 ++
>  drivers/ntb/test/ntb_tool.c |   5 ++
>  include/linux/ntb.h | 156
>  6 files changed, 231 insertions(+)
> 
> diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
> index 2e25307..1e92e52 100644
> --- a/drivers/ntb/ntb.c
> +++ b/drivers/ntb/ntb.c
> @@ -191,6 +191,60 @@ void ntb_db_event(struct ntb_dev *ntb, int vector)
>  }
>  EXPORT_SYMBOL(ntb_db_event);
> 
> +int ntb_default_port_number(struct ntb_dev *ntb)
> +{
> + switch (ntb->topo) {
> + case NTB_TOPO_PRI:
> + case NTB_TOPO_B2B_USD:
> + return NTB_PORT_PRI_USD;
> + case NTB_TOPO_SEC:
> + case NTB_TOPO_B2B_DSD:
> + return NTB_PORT_SEC_DSD;
> + default:
> + break;
> + }
> +
> + return -EINVAL;
> +}
> +EXPORT_SYMBOL(ntb_default_port_number);
> +
> +int ntb_default_peer_port_count(struct ntb_dev *ntb)
> +{
> + return NTB_DEF_PEER_CNT;
> +}
> +EXPORT_SYMBOL(ntb_default_peer_port_count);
> +
> +int ntb_default_peer_port_number(struct ntb_dev *ntb, int pidx)
> +{
> + if (pidx != NTB_DEF_PEER_IDX)
> + return -EINVAL;
> +
> + switch (ntb->topo) {
> + case NTB_TOPO_PRI:
> + case NTB_TOPO_B2B_USD:
> + return NTB_PORT_SEC_DSD;
> + case NTB_TOPO_SEC:
> + case NTB_TOPO_B2B_DSD:
> + return NTB_PORT_PRI_USD;
> + default:
> + break;
> + }
> +
> + return -EINVAL;
> +}
> +EXPORT_SYMBOL(ntb_default_peer_port_number);
> +
> +int ntb_default_peer_port_idx(struct ntb_dev *ntb, int port)
> +{
> + int peer_port = ntb_default_peer_port_number(ntb, NTB_DEF_PEER_IDX);
> +
> + if (peer_port == -EINVAL || port != peer_port)
> + return -EINVAL;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(ntb_default_peer_port_idx);
> +
>  static int ntb_probe(struct device *dev)
>  {
>   struct ntb_dev *ntb;
> diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> index 4eb8adb..10518b7 100644
> --- a/drivers/ntb/ntb_transport.c
> +++ b/drivers/ntb/ntb_transport.c
> @@ -94,6 +94,9 @@ MODULE_PARM_DESC(use_dma, "Use DMA engine to perform large data copy");
> 
>  static struct dentry *nt_debugfs_dir;
> 
> +/* Only two-ports NTB devices are supported */
> +#define PIDX NTB_DEF_PEER_IDX
> +
>  struct ntb_queue_entry {
>   /* ntb_queue list reference */
>   struct list_head entry;
> @@ -1083,6 +1086,9 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
>   dev_dbg(&ndev->dev,
>   "scratchpad is unsafe, proceed anyway...\n");
> 
> + if (ntb_peer_port_count(ndev) != NTB_DEF_PEER_CNT)
> + dev_warn(&ndev->dev, "Multi-port NTB devices unsupported\n");
> +
>   node = dev_to_node(&ndev->dev);
> 
>   nt = kzalloc_node(sizeof(*nt), GFP_KERNEL, node);
> diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
> index e75d4fd..c908b3a 100644
> --- a/drivers/ntb/test/ntb_perf.c
> +++ b/drivers/ntb/test/ntb_perf.c
> @@ -76,6 +76,7 @@
>  #define DMA_RETRIES     20
>  #define SZ_4G           (1ULL << 32)
>  #define MAX_SEG_ORDER   20 /* no larger than 1M for kmalloc buffer */
> +#define PIDX NTB_DEF_PEER_IDX
> 
>  MODULE_LICENSE(DRIVER_LICENSE);
>  MODULE_VERSION(DRIVER_VERSION);
> @@ -764,6 +765,9 @@ static int perf_probe(struct ntb_c

RE: [PATCH v2 9/9] NTB: Add ntb.h comments

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> 
> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index fe0437c..c5a369c 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -312,13 +312,18 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + /* Port operations are required */

Maybe: are required for multiport devices.

>   ops->port_number        &&
>   ops->peer_port_count    &&
>   ops->peer_port_number   &&
>   ops->peer_port_idx      &&
> +
> + /* Link operations are required */
>   ops->link_is_up         &&
>   ops->link_enable        &&
>   ops->link_disable       &&

Wasn't the first patch in this series all about making link ops first?

> +
> + /* One or both MW interfaces should be developed */
>   ops->mw_count   &&
>   ops->mw_get_align   &&
>   (ops->mw_set_trans  ||
> @@ -328,12 +333,11 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   ops->peer_mw_get_addr   &&
>   /* ops->peer_mw_clear_trans && */
> 
> + /* Doorbell operations are mostly required */
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> -
>   /* both set, or both unset */
>   (!ops->db_vector_count == !ops->db_vector_mask) &&
> -
>   ops->db_read&&
>   /* ops->db_set  && */
>   ops->db_clear   &&
> @@ -347,6 +351,8 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* ops->peer_db_read_mask   && */
>   /* ops->peer_db_set_mask&& */
>   /* ops->peer_db_clear_mask  && */
> +
> + /* Scratchpad interface is optional */
>   /* !ops->spad_is_unsafe == !ops->spad_count && */
>   !ops->spad_read == !ops->spad_count &&
>   !ops->spad_write == !ops->spad_count&&
> @@ -354,6 +360,7 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* !ops->peer_spad_read == !ops->spad_count && */
>   !ops->peer_spad_write == !ops->spad_count &&
> 
> + /* Message registers interface is optional */
>   !ops->msg_inbits == !ops->msg_count &&
>   !ops->msg_outbits == !ops->msg_count&&
>   !ops->msg_read_sts == !ops->msg_count   &&
> @@ -374,13 +381,12 @@ struct ntb_client {
>   struct device_driver drv;
>   const struct ntb_client_ops ops;
>  };
> -
>  #define drv_ntb_client(__drv) container_of((__drv), struct ntb_client, drv)
> 
>  /**
>   * struct ntb_device - ntb device
>   * @dev: Linux device object.
> - * @pdev: Pci device entry of the ntb.
> + * @pdev: PCI device entry of the ntb.
>   * @topo:Detected topology of the ntb.
>   * @ops: See &ntb_dev_ops.
>   * @ctx: See &ntb_ctx_ops.
> @@ -401,7 +407,6 @@ struct ntb_dev {
>   /* block unregister until device is fully released */
>   struct completion   released;
>  };
> -
>  #define dev_ntb(__dev) container_of((__dev), struct ntb_dev, dev)
> 
>  /**
> @@ -498,7 +503,7 @@ void ntb_link_event(struct ntb_dev *ntb);
>   * multiple interrupt vectors for doorbells, the vector number indicates which
>   * vector received the interrupt.  The vector number is relative to the first
>   * vector used for doorbells, starting at zero, and must be less than
> - ** ntb_db_vector_count().  The driver may call ntb_db_read() to check which
> + * ntb_db_vector_count().  The driver may call ntb_db_read() to check which
>   * doorbell bits need service, and ntb_db_vector_mask() to determine which of
>   * those bits are associated with the vector number.
>   */
> --
> 2.6.6
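
The "!a == !b" checks in the validity function quoted above can look odd at
first read. Here is a minimal sketch of the idiom on a cut-down ops table.
mock_ops and the mock_* functions are hypothetical, and only one feature group
is modelled.

```c
#include <stddef.h>

/* Hypothetical, cut-down ops table: one optional feature group. */
struct mock_ops {
	int (*spad_count)(void);
	int (*spad_read)(int sidx);
	int (*spad_write)(int sidx, int val);
};

/*
 * !p is 1 for a NULL pointer and 0 otherwise, so "!a == !b" holds exactly
 * when both callbacks are set or both are unset. Chaining the comparisons
 * against spad_count makes the whole group all-or-nothing.
 */
static int mock_ops_is_valid(const struct mock_ops *ops)
{
	return
		!ops->spad_read  == !ops->spad_count &&
		!ops->spad_write == !ops->spad_count;
}

/* Trivial callbacks used to populate the table. */
static int mock_count(void) { return 4; }
static int mock_read(int sidx) { return sidx; }
static int mock_write(int sidx, int val) { (void)sidx; (void)val; return 0; }
```

A table with no scratchpad callbacks and a table with all of them both pass. A
table that sets spad_count and spad_read but leaves spad_write NULL is
rejected.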




RE: [PATCH v2 7/9] NTB: Add new Memory Windows API documentation

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> Since the new API slightly changes the way a typical NTB client driver
> works, the documentation file needs to be appropriately updated.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> 
> ---
>  Documentation/ntb.txt | 99 ++-
>  1 file changed, 91 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/ntb.txt b/Documentation/ntb.txt
> index 1d9bbab..d01bb69 100644
> --- a/Documentation/ntb.txt
> +++ b/Documentation/ntb.txt
> @@ -1,14 +1,16 @@
>  # NTB Drivers
> 
>  NTB (Non-Transparent Bridge) is a type of PCI-Express bridge chip that connects
> -the separate memory systems of two computers to the same PCI-Express fabric.
> -Existing NTB hardware supports a common feature set, including scratchpad
> -registers, doorbell registers, and memory translation windows.  Scratchpad
> -registers are read-and-writable registers that are accessible from either side
> -of the device, so that peers can exchange a small amount of information at a
> -fixed address.  Doorbell registers provide a way for peers to send interrupt
> -events.  Memory windows allow translated read and write access to the peer
> -memory.
> +the separate memory systems of two or more computers to the same PCI-Express
> +fabric. Existing NTB hardware supports a common feature set: doorbell
> +registers and memory translation windows, as well as non-common features like
> +scratchpad and message registers. Scratchpad registers are read-and-writable
> +registers that are accessible from either side of the device, so that peers can
> +exchange a small amount of information at a fixed address. Message registers can
> +be utilized for the same purpose. Additionally they are provided with
> +special status bits to make sure the information isn't rewritten by another
> +peer. Doorbell registers provide a way for peers to send interrupt events.
> +Memory windows allow translated read and write access to the peer memory.
> 
>  ## NTB Core Driver (ntb)
> 
> @@ -26,6 +28,87 @@ as ntb hardware, or hardware drivers, are inserted and removed.  The
>  registration uses the Linux Device framework, so it should feel familiar to
>  anyone who has written a pci driver.
> 
> +### NTB Typical client driver implementation
> +
> +The primary purpose of NTB is to share some piece of memory between at least two
> +systems. So the NTB device features like Scratchpad/Message registers are
> +mainly used to perform the proper memory window initialization. Typically
> +there are two types of memory window interfaces supported by the NTB API:
> +inbound translation configured on the local ntb port and outbound translation
> +configured by the peer, on the peer ntb port. The first type is
> +depicted on the next figure
> +
> +Inbound translation:
> + Memory:              Local NTB Port:      Peer NTB Port:      Peer MMIO:
> +  ____________
> + | dma-mapped |-ntb_mw_set_trans(addr)  |
> + | memory     |        _v____________   |   ______________
> + | (addr)     |<======| MW xlat addr |<====| MW base addr |<== memory-mapped IO
> + |------------|       |--------------|  |  |--------------|
> +So typical scenario of the first type memory window initialization looks:
> +1) allocate a memory region, 2) put translated address to NTB config,
> +3) somehow notify a peer device of performed initialization, 4) peer device
> +maps corresponding outbound memory window so to have access to the shared
> +memory region.
> +
> +The second type of interface, that implies the shared windows being
> +initialized by a peer device, is depicted on the figure:
> +
> +Outbound translation:
> + Memory:        Local NTB Port:    Peer NTB Port:      Peer MMIO:
> +  ____________                      ______________
> + | dma-mapped |                |   | MW base addr |<== memory-mapped IO
> + | memory     |                |   |--------------|
> + | (addr)     |<===================| MW xlat addr |<-ntb_peer_mw_set_trans(addr)
> + |------------|                |   |--------------|
> +
> +Typical scenario of the second type interface initialization would be:
> +1) allocate a memory region, 2) somehow deliver a translated address to a peer
> +device, 3) peer puts the translated address to NTB config, 4) peer device maps
> +outbound memory window so to have access to the shared memory region.
> +
> +As one can see the described scenarios can be combined in one portable
> +algorithm.
> + Local device:
> +  1) Allocate memory for a shared window
> +  2) Initialize memory window by translated address of the allocated region
> + (it may fail if local memory window initi

RE: [PATCH v2 8/9] NTB: Add PCIe Gen4 link speed

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> 
> ---
>  include/linux/ntb.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 90746df..fe0437c 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -108,6 +108,7 @@ static inline char *ntb_topo_string(enum ntb_topo topo)
>   * @NTB_SPEED_GEN1:  Link is trained to gen1 speed.
>   * @NTB_SPEED_GEN2:  Link is trained to gen2 speed.
>   * @NTB_SPEED_GEN3:  Link is trained to gen3 speed.
> + * @NTB_SPEED_GEN4:  Link is trained to gen4 speed.
>   */
>  enum ntb_speed {
>   NTB_SPEED_AUTO = -1,
> @@ -115,6 +116,7 @@ enum ntb_speed {
>   NTB_SPEED_GEN1 = 1,
>   NTB_SPEED_GEN2 = 2,
>   NTB_SPEED_GEN3 = 3,
> + NTB_SPEED_GEN4 = 4
>  };
> 
>  /**
> --
> 2.6.6




RE: [PATCH v2 6/9] NTB: Add Messaging NTB API

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> Some IDT NTB-capable PCIe switches have message registers to communicate with
> peer devices. This patch adds new NTB API callback methods, which can be used
> to access these registers:
>  ntb_msg_count(); - get number of message registers
>  ntb_msg_inbits(); - get bitfield of inbound message registers status
>  ntb_msg_outbits(); - get bitfield of outbound message registers status
>  ntb_msg_read_sts(); - read the inbound and outbound message registers status
>  ntb_msg_clear_sts(); - clear status bits of message registers
>  ntb_msg_set_mask(); - mask interrupts raised by status bits of message
> registers.
>  ntb_msg_clear_mask(); - clear interrupts mask bits of message registers
>  ntb_msg_read(midx, *pidx); - read message register with specified index,
> additionally getting the index of the peer port the data was received from
>  ntb_msg_write(midx, pidx); - write data to the specified message register,
> sending it to the peer device connected over the pidx port
>  ntb_msg_event(); - notify driver context of a new message event
> 
> Of course there is hadrware which doesn't support Message registers, so

s/hadrware/hardware/

> this API is made optional.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 
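
Since msg_event is an optional context callback, the dispatch only fires when a
client registered one. Here is a hedged userspace sketch of that pattern, with
the spinlock from the patch deliberately left out. mock_ctx_ops, mock_ntb,
mock_msg_event and mock_count_msg are hypothetical stand-ins, and the return
value is added only so the behaviour can be observed.

```c
#include <stddef.h>

/* Hypothetical context ops: every callback is optional. */
struct mock_ctx_ops {
	void (*link_event)(void *ctx);
	void (*msg_event)(void *ctx);
};

struct mock_ntb {
	void *ctx;				/* client private data */
	const struct mock_ctx_ops *ctx_ops;	/* NULL when no client is bound */
};

/*
 * Deliver a message event to the client, if one is bound and it provided a
 * handler. Returns 1 if a handler ran, 0 otherwise. Locking is omitted here.
 */
static int mock_msg_event(struct mock_ntb *ntb)
{
	if (ntb->ctx_ops && ntb->ctx_ops->msg_event) {
		ntb->ctx_ops->msg_event(ntb->ctx);
		return 1;
	}

	return 0;
}

/* Example handler: counts delivered events in an int owned by the client. */
static void mock_count_msg(void *ctx)
{
	++*(int *)ctx;
}
```

Both NULL checks matter in this sketch: the first covers a device with no
client context set, the second a client that does not care about message
events.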

> 
> ---
>  drivers/ntb/ntb.c   |  13 
>  include/linux/ntb.h | 205
>  2 files changed, 218 insertions(+)
> 
> diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
> index f6153af..06574f8 100644
> --- a/drivers/ntb/ntb.c
> +++ b/drivers/ntb/ntb.c
> @@ -193,6 +193,19 @@ void ntb_db_event(struct ntb_dev *ntb, int vector)
>  }
>  EXPORT_SYMBOL(ntb_db_event);
> 
> +void ntb_msg_event(struct ntb_dev *ntb)
> +{
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&ntb->ctx_lock, irqflags);
> + {
> + if (ntb->ctx_ops && ntb->ctx_ops->msg_event)
> + ntb->ctx_ops->msg_event(ntb->ctx);
> + }
> + spin_unlock_irqrestore(&ntb->ctx_lock, irqflags);
> +}
> +EXPORT_SYMBOL(ntb_msg_event);
> +
>  static int ntb_probe(struct device *dev)
>  {
>   struct ntb_dev *ntb;
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index a6bf15d..90746df 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -164,10 +164,12 @@ static inline int ntb_client_ops_is_valid(const struct ntb_client_ops *ops)
>   * struct ntb_ctx_ops - ntb driver context operations
>   * @link_event:  See ntb_link_event().
>   * @db_event:    See ntb_db_event().
> + * @msg_event:   See ntb_msg_event().
>   */
>  struct ntb_ctx_ops {
>   void (*link_event)(void *ctx);
>   void (*db_event)(void *ctx, int db_vector);
> + void (*msg_event)(void *ctx);
>  };
> 
>  static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
> @@ -176,6 +178,7 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   return
>   /* ops->link_event  && */
>   /* ops->db_event    && */
> + /* ops->msg_event   && */
>   1;
>  }
> 
> @@ -220,6 +223,15 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   * @peer_spad_addr:  See ntb_peer_spad_addr().
>   * @peer_spad_read:  See ntb_peer_spad_read().
>   * @peer_spad_write: See ntb_peer_spad_write().
> + * @msg_count:       See ntb_msg_count().
> + * @msg_inbits:      See ntb_msg_inbits().
> + * @msg_outbits:     See ntb_msg_outbits().
> + * @msg_read_sts:    See ntb_msg_read_sts().
> + * @msg_clear_sts:   See ntb_msg_clear_sts().
> + * @msg_set_mask:    See ntb_msg_set_mask().
> + * @msg_clear_mask:  See ntb_msg_clear_mask().
> + * @msg_read:        See ntb_msg_read().
> + * @msg_write:       See ntb_msg_write().
>   */
>  struct ntb_dev_ops {
>   int (*port_number)(struct ntb_dev *ntb);
> @@ -282,6 +294,16 @@ struct ntb_dev_ops {
>   u32 (*peer_spad_read)(struct ntb_dev *ntb, int pidx, int sidx);
>   int (*peer_spad_write)(struct ntb_dev *ntb, int pidx, int sidx,
>  u32 val);
> +
> + int (*msg_count)(struct ntb_dev *ntb);
> + u64 (*msg_inbits)(struct ntb_dev *ntb);
> + u64 (*msg_outbits)(struct ntb_dev *ntb);
> + u64 (*msg_read_sts)(struct ntb_dev *ntb);
> + int (*msg_clear_sts)(struct ntb_dev *ntb, u64 sts_bits);
> + int (*msg_set_mask)(struct ntb_dev *ntb, u64 mask_bits);
> + int (*msg_clear_mask)(struct ntb_dev *ntb, u64 mask_bits);
> + int (*msg_read)(struct ntb_dev 

RE: [PATCH v2 5/9] NTB: Alter Scratchpads API to support multi-ports devices

2016-12-12 Thread Allen Hubbe
From: Serge Semin 
> Even though there is no real NTB hardware that has both more
> than two ports and Scratchpad registers, it is logically correct to have
> the Scratchpad API accept a peer port index as well. Intel/AMD drivers utilize
> Primary and Secondary topology to split Scratchpads between connected root
> devices. Since port-index API introduced, Intel/AMD NTB hadrware drivers can

s/hadrware/hardware/

> use the device port to determine which Scratchpad registers actually belong to
> local and peer devices. The same approach can be used if some future
> hardware is multi-port and has a set of Scratchpads.
> Here is a brief summary of the API changes:
>  ntb_spad_count() - return number of Scratchpads per port
>  ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the
> peer device with pidx-index
>  ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the
> peer with pidx-index
>  ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the
> peer with pidx-index
> 
> Since there is hardware which doesn't support Scratchpad registers, the
> corresponding API methods are now made optional.

The api change looks good.  See the comment to simplify ntb_tool.

> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c | 14 +++
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 14 +++
>  drivers/ntb/ntb_transport.c | 17 -
>  drivers/ntb/test/ntb_perf.c |  6 +--
>  drivers/ntb/test/ntb_pingpong.c |  8 +++-
>  drivers/ntb/test/ntb_tool.c | 45 +-
>  include/linux/ntb.h | 76 +++--
>  7 files changed, 115 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 74fe9b8..a2596ad 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -476,30 +476,30 @@ static int amd_ntb_spad_write(struct ntb_dev *ntb,
>   return 0;
>  }
> 
> -static u32 amd_ntb_peer_spad_read(struct ntb_dev *ntb, int idx)
> +static u32 amd_ntb_peer_spad_read(struct ntb_dev *ntb, int pidx, int sidx)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   void __iomem *mmio = ndev->self_mmio;
>   u32 offset;
> 
> - if (idx < 0 || idx >= ndev->spad_count)
> + if (sidx < 0 || sidx >= ndev->spad_count)
>   return -EINVAL;
> 
> - offset = ndev->peer_spad + (idx << 2);
> + offset = ndev->peer_spad + (sidx << 2);
>   return readl(mmio + AMD_SPAD_OFFSET + offset);
>  }
> 
> -static int amd_ntb_peer_spad_write(struct ntb_dev *ntb,
> -int idx, u32 val)
> +static int amd_ntb_peer_spad_write(struct ntb_dev *ntb, int pidx,
> +int sidx, u32 val)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   void __iomem *mmio = ndev->self_mmio;
>   u32 offset;
> 
> - if (idx < 0 || idx >= ndev->spad_count)
> + if (sidx < 0 || sidx >= ndev->spad_count)
>   return -EINVAL;
> 
> - offset = ndev->peer_spad + (idx << 2);
> + offset = ndev->peer_spad + (sidx << 2);
>   writel(val, mmio + AMD_SPAD_OFFSET + offset);
> 
>   return 0;
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 5a57d9e..471b0ba 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -1452,30 +1452,30 @@ static int intel_ntb_spad_write(struct ntb_dev *ntb,
>  ndev->self_reg->spad);
>  }
> 
> -static int intel_ntb_peer_spad_addr(struct ntb_dev *ntb, int idx,
> +static int intel_ntb_peer_spad_addr(struct ntb_dev *ntb, int pidx, int sidx,
>   phys_addr_t *spad_addr)
>  {
>   struct intel_ntb_dev *ndev = ntb_ndev(ntb);
> 
> - return ndev_spad_addr(ndev, idx, spad_addr, ndev->peer_addr,
> + return ndev_spad_addr(ndev, sidx, spad_addr, ndev->peer_addr,
> ndev->peer_reg->spad);
>  }
> 
> -static u32 intel_ntb_peer_spad_read(struct ntb_dev *ntb, int idx)
> +static u32 intel_ntb_peer_spad_read(struct ntb_dev *ntb, int pidx, int sidx)
>  {
>   struct intel_ntb_dev *ndev = ntb_ndev(ntb);
> 
> - return ndev_spad_read(ndev, idx,
> + return ndev_spad_read(ndev, sidx,
> ndev->peer_mmio +
> ndev->peer_reg->spad);
>  }
> 
> -static int intel_ntb_peer_spad_write(struct ntb_dev *ntb,
> -  int idx, u32 val)
> +static int intel_ntb_peer_spad_write(struct ntb_dev *ntb, int pidx,
> +  int sidx, u32 val)
>  {
>   struct intel_ntb_dev *ndev = ntb_ndev(ntb);
> 
> - return ndev_spad_write(ndev, idx, val,
> + return ndev_spad_write(ndev, sidx, val,
>  ndev->peer_mmio +
>  ndev->peer_reg-

RE: [PATCH v2 4/9] NTB: Alter MW API to support multi-ports devices

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> Multi-port NTB devices permit to share a memory between all accessible peers.
> Memory Windows API is altered to correspondingly initialize and map memory
> windows for such devices:
>  ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated
> for shared buffer with specified peer device.
>  ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters
> to properly allocate inbound memory region.
>  ntb_peer_mw_count(); - get number of outbound memory windows.
>  ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window
> 
> If hardware supports inbound translation configured on the local ntb port:
>  ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound
> memory window so a peer device could access it.
>  ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound
> memory window.
> 
> If hadrware supports outbound translation configured on the peer ntb port:

s/hadrware/hardware/

>  ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory
> window retrieved from a peer device
>  ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an
> outbound memory window
> 
> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c |  68 +---
>  drivers/ntb/hw/amd/ntb_hw_amd.h |   2 +
>  drivers/ntb/hw/intel/ntb_hw_intel.c |  90 
>  drivers/ntb/hw/intel/ntb_hw_intel.h |   2 +
>  drivers/ntb/ntb.c   |   2 +
>  drivers/ntb/ntb_transport.c |  21 +++-
>  drivers/ntb/test/ntb_perf.c |  17 ++-
>  drivers/ntb/test/ntb_tool.c |  43 +---
>  include/linux/ntb.h | 208 
> 
>  9 files changed, 346 insertions(+), 107 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index b6a4291..74fe9b8 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -5,6 +5,7 @@
>   *   GPL LICENSE SUMMARY
>   *
>   *   Copyright (C) 2016 Advanced Micro Devices, Inc. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   This program is free software; you can redistribute it and/or modify
>   *   it under the terms of version 2 of the GNU General Public License as
> @@ -13,6 +14,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright (C) 2016 Advanced Micro Devices, Inc. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
>   *   modification, are permitted provided that the following conditions
> @@ -213,40 +215,42 @@ static int ndev_mw_to_bar(struct amd_ntb_dev *ndev, int 
> idx)
>   return 1 << idx;
>  }
> 
> -static int amd_ntb_mw_count(struct ntb_dev *ntb)
> +static int amd_ntb_mw_count(struct ntb_dev *ntb, int pidx)
>  {
> + if (pidx > NTB_PIDX_MAX)
> + return -EINVAL;

pidx may be negative.  This should be if (pidx != 0) with some named constant 
for zero, or just if (pidx).

Similarly apply this comment below.
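
To illustrate the difference, here is a minimal userspace sketch, not driver code.  The demo_ names are hypothetical, and it assumes a two-port device where the only valid peer index is zero.  The first function mirrors the check in the patch; the second is the suggested form.

```c
#include <errno.h>

/* Hypothetical named constant for the only valid peer index. */
#define DEMO_PIDX_DEF 0

/* Check as written in the patch: a negative pidx slips through. */
static int demo_pidx_check_orig(int pidx)
{
	if (pidx > DEMO_PIDX_DEF)
		return -EINVAL;

	return 0;
}

/* Suggested check: anything other than the one valid index is rejected. */
static int demo_pidx_check(int pidx)
{
	if (pidx != DEMO_PIDX_DEF)
		return -EINVAL;

	return 0;
}
```

With pidx == -1 the first variant returns zero and the second returns -EINVAL.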

> +
>   return ntb_ndev(ntb)->mw_count;
>  }
> 
> -static int amd_ntb_mw_get_range(struct ntb_dev *ntb, int idx,
> - phys_addr_t *base,
> - resource_size_t *size,
> - resource_size_t *align,
> - resource_size_t *align_size)
> +static int amd_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
> + resource_size_t *addr_align,
> + resource_size_t *size_align,
> + resource_size_t *size_max)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>   int bar;
> 
> + if (pidx > NTB_PIDX_MAX)
> + return -EINVAL;
> +
>   bar = ndev_mw_to_bar(ndev, idx);
>   if (bar < 0)
>   return bar;
> 
> - if (base)
> - *base = pci_resource_start(ndev->ntb.pdev, bar);
> -
> - if (size)
> - *size = pci_resource_len(ndev->ntb.pdev, bar);
> + if (addr_align)
> + *addr_align = SZ_4K;
> 
> - if (align)
> - *align = SZ_4K;
> + if (size_align)
> + *size_align = 1;
> 
> - if (align_size)
> - *align_size = 1;
> + if (size_max)
> + *size_max = pci_resource_len(ndev->ntb.pdev, bar);
> 
>   return 0;
>  }
> 
> -static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int idx,
> +static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
>   dma_addr_t addr, resource_size_t size)
>  {
>   struct amd_ntb_dev *ndev = ntb_ndev(ntb);
> @@ -256,6 +260,9 @@ static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int 
> idx,
>   u64 base_addr, limit, reg_val;
>   int bar;
> 
> + if (pidx > NTB_PIDX_MAX)
> + return -EINV

RE: [PATCH v2 1/9] NTB: Make link-state API being declared first

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> Since link operations are usually performed before memory window access
> operations, it's logically better to declared link-related API before any
> other methods. Additionally it's good practice for readability to declare
> NTB device callback methods of hadrware drivers with the same order as it's
> done within ntb.h.

s/hadrware/hardware/

Please limit this change to ntb.h.  Leave it to the hardware driver maintainers 
to rearrange the methods in each driver if they care to.  Code movement may 
conflict with other changes in development.

> 
> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c | 188 
> ++--
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 168 
>  include/linux/ntb.h | 137 +-
>  3 files changed, 247 insertions(+), 246 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 6ccba0d..6704327 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -71,6 +71,97 @@ MODULE_AUTHOR("AMD Inc.");
>  static const struct file_operations amd_ntb_debugfs_info;
>  static struct dentry *debugfs_dir;
> 
> +static int amd_link_is_up(struct amd_ntb_dev *ndev)
> +{
> + if (!ndev->peer_sta)
> + return NTB_LNK_STA_ACTIVE(ndev->cntl_sta);
> +
> + /* If peer_sta is reset or D0 event, the ISR has
> +  * started a timer to check link status of hardware.
> +  * So here just clear status bit. And if peer_sta is
> +  * D3 or PME_TO, D0/reset event will be happened when
> +  * system wakeup/poweron, so do nothing here.
> +  */
> + if (ndev->peer_sta & AMD_PEER_RESET_EVENT)
> + ndev->peer_sta &= ~AMD_PEER_RESET_EVENT;
> + else if (ndev->peer_sta & AMD_PEER_D0_EVENT)
> + ndev->peer_sta = 0;
> +
> + return 0;
> +}
> +
> +static int amd_ntb_link_is_up(struct ntb_dev *ntb,
> +   enum ntb_speed *speed,
> +   enum ntb_width *width)
> +{
> + struct amd_ntb_dev *ndev = ntb_ndev(ntb);
> + int ret = 0;
> +
> + if (amd_link_is_up(ndev)) {
> + if (speed)
> + *speed = NTB_LNK_STA_SPEED(ndev->lnk_sta);
> + if (width)
> + *width = NTB_LNK_STA_WIDTH(ndev->lnk_sta);
> +
> + dev_dbg(ndev_dev(ndev), "link is up.\n");
> +
> + ret = 1;
> + } else {
> + if (speed)
> + *speed = NTB_SPEED_NONE;
> + if (width)
> + *width = NTB_WIDTH_NONE;
> +
> + dev_dbg(ndev_dev(ndev), "link is down.\n");
> + }
> +
> + return ret;
> +}
> +
> +static int amd_ntb_link_enable(struct ntb_dev *ntb,
> +enum ntb_speed max_speed,
> +enum ntb_width max_width)
> +{
> + struct amd_ntb_dev *ndev = ntb_ndev(ntb);
> + void __iomem *mmio = ndev->self_mmio;
> + u32 ntb_ctl;
> +
> + /* Enable event interrupt */
> + ndev->int_mask &= ~AMD_EVENT_INTMASK;
> + writel(ndev->int_mask, mmio + AMD_INTMASK_OFFSET);
> +
> + if (ndev->ntb.topo == NTB_TOPO_SEC)
> + return -EINVAL;
> + dev_dbg(ndev_dev(ndev), "Enabling Link.\n");
> +
> + ntb_ctl = readl(mmio + AMD_CNTL_OFFSET);
> + ntb_ctl |= (PMM_REG_CTL | SMM_REG_CTL);
> + writel(ntb_ctl, mmio + AMD_CNTL_OFFSET);
> +
> + return 0;
> +}
> +
> +static int amd_ntb_link_disable(struct ntb_dev *ntb)
> +{
> + struct amd_ntb_dev *ndev = ntb_ndev(ntb);
> + void __iomem *mmio = ndev->self_mmio;
> + u32 ntb_ctl;
> +
> + /* Disable event interrupt */
> + ndev->int_mask |= AMD_EVENT_INTMASK;
> + writel(ndev->int_mask, mmio + AMD_INTMASK_OFFSET);
> +
> + if (ndev->ntb.topo == NTB_TOPO_SEC)
> + return -EINVAL;
> + dev_dbg(ndev_dev(ndev), "Enabling Link.\n");
> +
> + ntb_ctl = readl(mmio + AMD_CNTL_OFFSET);
> + ntb_ctl &= ~(PMM_REG_CTL | SMM_REG_CTL);
> + writel(ntb_ctl, mmio + AMD_CNTL_OFFSET);
> +
> + return 0;
> +}
> +
>  static int ndev_mw_to_bar(struct amd_ntb_dev *ndev, int idx)
>  {
>   if (idx < 0 || idx > ndev->mw_count)
> @@ -194,97 +285,6 @@ static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int 
> idx,
>   return 0;
>  }
> 
> -static int amd_link_is_up(struct amd_ntb_dev *ndev)
> -{
> - if (!ndev->peer_sta)
> - return NTB_LNK_STA_ACTIVE(ndev->cntl_sta);
> -
> - /* If peer_sta is reset or D0 event, the ISR has
> -  * started a timer to check link status of hardware.
> -  * So here just clear status bit. And if peer_sta is
> -  * D3 or PME_TO, D0/reset event will be happened when
> -  * system wakeup/poweron, so do nothing here.
> -  */
> - if (ndev->peer_sta & AMD_PEER_RESET_EVENT)
> - ndev->peer_sta &= ~AMD_PEER_RESET_EVENT;
> - else if (ndev->peer_sta & AMD_PEER_D0_EV

RE: [PATCH v2 3/9] NTB: Alter link-state API to support multi-port devices

2016-12-12 Thread Allen Hubbe
From: Serge Semin
> Multi-port devices permit the NTB connections between multiple domains,
> so a local device can have NTB link being up with one peer and being
> down with another. NTB link-state API is appropriately altered to return
> a bitfield of the link-states between the local device and possible peers.
> 
> Signed-off-by: Serge Semin 

Acked-by: Allen Hubbe 

> 
> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c |  2 +-
>  drivers/ntb/hw/intel/ntb_hw_intel.c |  2 +-
>  include/linux/ntb.h | 31 ---
>  3 files changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 0b767ef..b6a4291 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -133,7 +133,7 @@ static int amd_link_is_up(struct amd_ntb_dev *ndev)
>   return 0;
>  }
> 
> -static int amd_ntb_link_is_up(struct ntb_dev *ntb,
> +static u64 amd_ntb_link_is_up(struct ntb_dev *ntb,
> enum ntb_speed *speed,
> enum ntb_width *width)
>  {
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c 
> b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 7e44dc3..f37b6fb 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -1078,7 +1078,7 @@ static int intel_ntb_peer_port_idx(struct ntb_dev *ntb, 
> int port)
>   return 0;
>  }
> 
> -static int intel_ntb_link_is_up(struct ntb_dev *ntb,
> +static u64 intel_ntb_link_is_up(struct ntb_dev *ntb,
>   enum ntb_speed *speed,
>   enum ntb_width *width)
>  {
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 3216689..47ec611 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -221,7 +221,7 @@ struct ntb_dev_ops {
>   int (*peer_port_number)(struct ntb_dev *ntb, int pidx);
>   int (*peer_port_idx)(struct ntb_dev *ntb, int port);
> 
> - int (*link_is_up)(struct ntb_dev *ntb,
> + u64 (*link_is_up)(struct ntb_dev *ntb,
> enum ntb_speed *speed, enum ntb_width *width);
>   int (*link_enable)(struct ntb_dev *ntb,
>  enum ntb_speed max_speed, enum ntb_width max_width);
> @@ -522,25 +522,26 @@ static inline int ntb_peer_port_idx(struct ntb_dev 
> *ntb, int port)
>   * state once after every link event.  It is safe to query the link state in
>   * the context of the link event callback.
>   *
> - * Return: One if the link is up, zero if the link is down, otherwise a
> - *   negative value indicating the error number.
> + * Return: bitfield of indexed ports link state: bit is set/cleared if the
> + * link is up/down respectively.
>   */
> -static inline int ntb_link_is_up(struct ntb_dev *ntb,
> +static inline u64 ntb_link_is_up(struct ntb_dev *ntb,
>enum ntb_speed *speed, enum ntb_width *width)
>  {
>   return ntb->ops->link_is_up(ntb, speed, width);
>  }
> 
>  /**
> - * ntb_link_enable() - enable the link on the secondary side of the ntb
> + * ntb_link_enable() - enable the local port ntb connection
>   * @ntb: NTB device context.
>   * @max_speed:   The maximum link speed expressed as PCIe generation 
> number.
>   * @max_width:   The maximum link width expressed as the number of PCIe 
> lanes.
>   *
> - * Enable the link on the secondary side of the ntb.  This can only be done
> - * from the primary side of the ntb in primary or b2b topology.  The ntb 
> device
> - * should train the link to its maximum speed and width, or the requested 
> speed
> - * and width, whichever is smaller, if supported.
> + * Enable the NTB/PCIe link on the local or remote (for bridge-to-bridge
> + * topology) side of the bridge. If it's supported the ntb device should 
> train
> + * the link to its maximum speed and width, or the requested speed and width,
> + * whichever is smaller. Some hardware doesn't support PCIe link training, so
> + * the last two arguments will be ignored then.
>   *
>   * Return: Zero on success, otherwise an error number.
>   */
> @@ -552,14 +553,14 @@ static inline int ntb_link_enable(struct ntb_dev *ntb,
>  }
> 
>  /**
> - * ntb_link_disable() - disable the link on the secondary side of the ntb
> + * ntb_link_disable() - disable the local port ntb connection
>   * @ntb: NTB device context.
>   *
> - * Disable the link on the secondary side of the ntb.  This can only be
> - * done from the primary side of the ntb in primary or b2b topology.  The ntb
> - * device should disable the link.  Returning from this call must ind

RE: [PATCH v2 2/9] NTB: Add indexed ports NTB API

2016-12-12 Thread Allen Hubbe
From: Serge Semin 
> There are some NTB hardware, which can combine more than just two domains
> over NTB. For instance, some IDT PCIe-switches can have NTB-functions
> activated on more than two-ports. The different domains are distinguished
> by ports they are connected to. So the new port-related methods are added to
> the NTB API:
>  ntb_port_number() - return local port
>  ntb_peer_port_count() - return number of peers local port can connect to
>  ntb_peer_port_number(pidx) - return port number by its index
>  ntb_peer_port_idx(port) - return port index by its number
> 
> Current test-drivers aren't changed much. They still support two-ports devices
> for the time being while multi-ports hardware drivers aren't added.
> 

The port methods are the same for PRI/SEC drivers.  Rather than duplicating the 
code, multiport could be made optional in the api, and default implementations 
provided by ntb common code.
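
A rough sketch of what I mean, as standalone C rather than a patch against ntb.h.  Every demo_ name is a hypothetical stand-in for the kernel definitions, and only port_number is shown; the other port methods would follow the same pattern.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel definitions; all demo_ names are hypothetical. */
enum demo_topo {
	DEMO_TOPO_NONE = -1,
	DEMO_TOPO_PRI,
	DEMO_TOPO_SEC,
	DEMO_TOPO_B2B_USD,
	DEMO_TOPO_B2B_DSD,
};

enum demo_port {
	DEMO_PORT_PRI_USD,
	DEMO_PORT_SEC_DSD,
};

struct demo_ntb_dev;

struct demo_ntb_ops {
	/* Optional: left NULL by ordinary two-port (PRI/SEC, B2B) drivers. */
	int (*port_number)(struct demo_ntb_dev *ntb);
};

struct demo_ntb_dev {
	enum demo_topo topo;
	const struct demo_ntb_ops *ops;
};

/* Default provided once by common code, derived from the topology. */
static int demo_default_port_number(struct demo_ntb_dev *ntb)
{
	switch (ntb->topo) {
	case DEMO_TOPO_PRI:
	case DEMO_TOPO_B2B_USD:
		return DEMO_PORT_PRI_USD;
	case DEMO_TOPO_SEC:
	case DEMO_TOPO_B2B_DSD:
		return DEMO_PORT_SEC_DSD;
	default:
		return -EINVAL;
	}
}

/* API wrapper: use the driver callback if present, else the default. */
static int demo_port_number(struct demo_ntb_dev *ntb)
{
	if (!ntb->ops->port_number)
		return demo_default_port_number(ntb);

	return ntb->ops->port_number(ntb);
}
```

With something like this, the AMD and Intel drivers would not need their own copies of the port callbacks, and only a true multi-port driver would fill them in.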

Some comments below.

> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/hw/amd/ntb_hw_amd.c | 47 
>  drivers/ntb/hw/amd/ntb_hw_amd.h |  9 +
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 52 ++-
>  drivers/ntb/hw/intel/ntb_hw_intel.h |  9 +
>  drivers/ntb/ntb_transport.c |  6 
>  drivers/ntb/test/ntb_perf.c |  4 +++
>  drivers/ntb/test/ntb_pingpong.c |  6 
>  drivers/ntb/test/ntb_tool.c |  5 +++
>  include/linux/ntb.h | 71 
> +
>  9 files changed, 208 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
> index 6704327..0b767ef 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.c
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
> @@ -71,6 +71,49 @@ MODULE_AUTHOR("AMD Inc.");
>  static const struct file_operations amd_ntb_debugfs_info;
>  static struct dentry *debugfs_dir;
> 
> +static int amd_ntb_port_number(struct ntb_dev *ntb)
> +{
> + switch (ntb->topo) {
> + case NTB_TOPO_PRI:
> + case NTB_TOPO_B2B_USD:
> + return NTB_PORT_PRI_USD;
> + case NTB_TOPO_SEC:
> + case NTB_TOPO_B2B_DSD:
> + return NTB_PORT_SEC_DSD;
> + default:
> + break;
> + }
> +
> + return -EINVAL;
> +}
> +
> +static int amd_ntb_peer_port_count(struct ntb_dev *ntb)
> +{
> + return NTB_PEER_CNT;
> +}
> +
> +static int amd_ntb_peer_port_number(struct ntb_dev *ntb, int pidx)
> +{
> + int local_port = amd_ntb_port_number(ntb);
> +
> + if (pidx > NTB_PIDX_MAX)
> + return -EINVAL;

pidx may be negative.

> +
> + return (local_port == NTB_PORT_PRI_USD ?
> + NTB_PORT_SEC_DSD : NTB_PORT_PRI_USD);

local_port may be -EINVAL.

It may be simpler to have the same switch statement here as in 
amd_ntb_port_number(), with the return statements swapped.
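
Something along these lines, written as a standalone sketch so it compiles outside the kernel.  The demo_ enums are hypothetical stand-ins for enum ntb_topo and the port constants in this patch, and the topology is passed directly instead of through struct ntb_dev.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel topology and port constants. */
enum demo_topo {
	DEMO_TOPO_NONE = -1,
	DEMO_TOPO_PRI,
	DEMO_TOPO_SEC,
	DEMO_TOPO_B2B_USD,
	DEMO_TOPO_B2B_DSD,
};

enum demo_port {
	DEMO_PORT_PRI_USD,
	DEMO_PORT_SEC_DSD,
};

/* Same switch as the local port lookup, with the return values swapped. */
static int demo_peer_port_number(enum demo_topo topo, int pidx)
{
	/* Only one peer exists, so only index zero is valid. */
	if (pidx != 0)
		return -EINVAL;

	switch (topo) {
	case DEMO_TOPO_PRI:
	case DEMO_TOPO_B2B_USD:
		return DEMO_PORT_SEC_DSD;
	case DEMO_TOPO_SEC:
	case DEMO_TOPO_B2B_DSD:
		return DEMO_PORT_PRI_USD;
	default:
		break;
	}

	/* Unknown topology: report the error instead of guessing a port. */
	return -EINVAL;
}
```

This way an unknown topology yields -EINVAL directly, and there is no intermediate local_port value to check.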

> +}
> +
> +static int amd_ntb_peer_port_idx(struct ntb_dev *ntb, int port)
> +{
> + int local_port = amd_ntb_port_number(ntb);
> +
> + if ((local_port == NTB_PORT_PRI_USD && port != NTB_PORT_SEC_DSD) ||
> + (local_port == NTB_PORT_SEC_DSD && port != NTB_PORT_PRI_USD))
> + return -EINVAL;
> +

How about:

peer_port = amd_ntb_peer_port_number(ntb, 0);

if (peer_port == -EINVAL || port != peer_port)
return -EINVAL;

return 0;

> + return 0;
> +}
> +
>  static int amd_link_is_up(struct amd_ntb_dev *ndev)
>  {
>   if (!ndev->peer_sta)
> @@ -431,6 +474,10 @@ static int amd_ntb_peer_spad_write(struct ntb_dev *ntb,
>  }
> 
>  static const struct ntb_dev_ops amd_ntb_ops = {
> + .port_number= amd_ntb_port_number,
> + .peer_port_count= amd_ntb_peer_port_count,
> + .peer_port_number   = amd_ntb_peer_port_number,
> + .peer_port_idx  = amd_ntb_peer_port_idx,
>   .link_is_up = amd_ntb_link_is_up,
>   .link_enable= amd_ntb_link_enable,
>   .link_disable   = amd_ntb_link_disable,
> diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.h b/drivers/ntb/hw/amd/ntb_hw_amd.h
> index 2eac3cd..1aeb08f 100644
> --- a/drivers/ntb/hw/amd/ntb_hw_amd.h
> +++ b/drivers/ntb/hw/amd/ntb_hw_amd.h
> @@ -62,6 +62,10 @@
>  #define NTB_LNK_STA_SPEED(x) (((x) & NTB_LNK_STA_SPEED_MASK) >> 16)
>  #define NTB_LNK_STA_WIDTH(x) (((x) & NTB_LNK_STA_WIDTH_MASK) >> 20)
> 
> +/* port related constants */
> +#define NTB_PEER_CNT (1)
> +#define NTB_PIDX_MAX (0)

Just NTB_PEER_CNT is sufficient.  Anything that checks if (pidx > NTB_PIDX_MAX) 
could check if (pidx >= NTB_PEER_CNT).

> +
>  #ifndef read64
>  #ifdef readq
>  #define read64 readq
> @@ -91,6 +95,11 @@ static inline void _write64(u64 val, void __iomem *mmio)
>  #endif
>  #endif
> 
> +enum amd_ntb_port {
> + NTB_PORT_PRI_USD,
> + NTB_PORT_SEC_DSD
> +};

This could be part of ntb.h, since it will likely be the same for any of the 
PRI/SEC variety of NTB devices.  Making it a part of the api will encourage 
other PRI/SEC drivers t

RE: [PATCH 10/22] NTB Intel: Add port-related NTB API callback methods

2016-12-07 Thread Allen Hubbe
From: Serge Semin
>

This needs an actual commit message.

> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 195 
> +---
>  drivers/ntb/hw/intel/ntb_hw_intel.h |  10 ++
>  2 files changed, 124 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c 
> b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index d3da0ce..724ccfe 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c

I am leaning more toward recommending that the topo api be left alone.

See RE: [PATCH 02/22] NTB: Add peer indexed ports NTB API

This is context for that response, highlighting 

> @@ -1464,31 +1490,37 @@ static int xeon_poll_link(struct intel_ntb_dev *ndev)
> 
>  static int xeon_link_is_up(struct intel_ntb_dev *ndev)
>  {
> - if (ndev->ntb.topo == NTB_TOPO_SEC)
> + if (ndev->ntb.port == NTB_PORT_SEC)

Should this also check ntb.topo == P2P?  I think just checking ntb.port is 
incorrect, because ntb.port is also used with ntb.topo == B2B to distinguish 
"upstream" and "downstream" topologies.

ndev->ntb.topo == NTB_TOPO_P2P && ndev->ntb.port == NTB_PORT_SEC
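
As a standalone sketch of the combined check (the demo_ names are hypothetical stand-ins for the topo and port values proposed in this series, and lnk_active stands in for NTB_LNK_STA_ACTIVE(ndev->lnk_sta)):

```c
/* Hypothetical stand-ins for the topo/port values proposed in the series. */
enum demo_topo { DEMO_TOPO_P2P, DEMO_TOPO_B2B };
enum demo_port { DEMO_PORT_PRI, DEMO_PORT_SEC };

/*
 * Only the secondary side of a port-to-port topology is treated as always
 * up.  In B2B, the port value only tells upstream from downstream, so the
 * real link status must still be consulted there.
 */
static int demo_link_is_up(enum demo_topo topo, enum demo_port port,
			   int lnk_active)
{
	if (topo == DEMO_TOPO_P2P && port == DEMO_PORT_SEC)
		return 1;

	return lnk_active;
}
```

Checking the port alone would report a downstream B2B device as up even when its link status says otherwise.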

>   return 1;
> 
>   return NTB_LNK_STA_ACTIVE(ndev->lnk_sta);
>  }
> 

Also, the changes to xeon_link_is_up cause a merge conflict with 
https://github.com/davejiang/linux ntb.

> 
>  static inline int xeon_ppd_bar4_split(struct intel_ntb_dev *ndev, u8 ppd)
> @@ -1776,39 +1808,39 @@ static int xeon_init_ntb(struct intel_ntb_dev *ndev)
>   ndev->db_count = XEON_DB_COUNT;
>   ndev->db_link_mask = XEON_DB_LINK_BIT;
> 
> - switch (ndev->ntb.topo) {
> - case NTB_TOPO_PRI:
> - if (ndev->hwerr_flags & NTB_HWERR_SDOORBELL_LOCKUP) {
> - dev_err(ndev_dev(ndev), "NTB Primary config 
> disabled\n");
> - return -EINVAL;
> - }
> -
> - /* enable link to allow secondary side device to appear */
> - ntb_ctl = ioread32(ndev->self_mmio + ndev->reg->ntb_ctl);
> - ntb_ctl &= ~NTB_CTL_DISABLE;
> - iowrite32(ntb_ctl, ndev->self_mmio + ndev->reg->ntb_ctl);
> -
> - /* use half the spads for the peer */
> - ndev->spad_count >>= 1;
> - ndev->self_reg = &xeon_pri_reg;
> - ndev->peer_reg = &xeon_sec_reg;
> - ndev->xlat_reg = &xeon_sec_xlat;
> - break;
> + if (ndev->ntb.topo == NTB_TOPO_P2P) {
> + if (ndev->ntb.port == NTB_PORT_PRI) {

This was one case in a switch statement, now two branches.

> + if (ndev->hwerr_flags & NTB_HWERR_SDOORBELL_LOCKUP) {
> + dev_err(ndev_dev(ndev),
> + "NTB Primary config disabled\n");
> + return -EINVAL;
> + }
> 
> - case NTB_TOPO_SEC:
> - if (ndev->hwerr_flags & NTB_HWERR_SDOORBELL_LOCKUP) {
> - dev_err(ndev_dev(ndev), "NTB Secondary config 
> disabled\n");
> - return -EINVAL;
> + /* enable link to allow secondary side dev to appear */
> + ntb_ctl = ioread32(ndev->self_mmio +
> +ndev->reg->ntb_ctl);
> + ntb_ctl &= ~NTB_CTL_DISABLE;
> + iowrite32(ntb_ctl, ndev->self_mmio +
> +   ndev->reg->ntb_ctl);

The nested if has increased the indentation, and lines have had to be further 
split to conform to coding standards.

The indentation changes here affect many lines of code, and create a 
complicated merge with https://github.com/davejiang/linux ntb.

> +
> + /* use half the spads for the peer */
> + ndev->spad_count >>= 1;
> + ndev->self_reg = &xeon_pri_reg;
> + ndev->peer_reg = &xeon_sec_reg;
> + ndev->xlat_reg = &xeon_sec_xlat;
> + } else {
> + if (ndev->hwerr_flags & NTB_HWERR_SDOORBELL_LOCKUP) {
> + dev_err(ndev_dev(ndev),
> + "NTB Secondary config disabled\n");
> + return -EINVAL;
> + }
> + /* use half the spads for the peer */
> + ndev->spad_count >>= 1;
> + ndev->self_reg = &xeon_sec_reg;
> + ndev->peer_reg = &xeon_pri_reg;
> + ndev->xlat_reg = &xeon_pri_xlat;
>   }
> - /* use half the spads for the peer */
> - ndev->spad_count >>= 1;
> - ndev->self_reg = &xeon_sec_reg;
> - ndev->peer_reg = &xeon_pri_reg;
> - ndev->xlat_reg = &xeon_pri_xlat;
> - break;
> -
> - case NTB_TOPO_B2B_USD:
> - case NTB_TOPO_B2B_DSD:
> + } else {

This else combines two cases in 

RE: [PATCH 02/22] NTB: Add peer indexed ports NTB API

2016-12-07 Thread Allen Hubbe
From: Allen Hubbe
> From: Serge Semin
> > Add new port-index NTB API. Additionally lets get rid of Primary and
> > Secondary topologies, since port-number can be effectively used instead.
> 
> Split into two patches please.
> 
> I see no harm to the TOPO changes, though I wonder if they are necessary.
> 

I am leaning more toward recommending that the topo api be left alone.

The topo changes to the ntb api complicate the Intel driver:
 - The changes add second branches where there had been just one before (need 
to check topo AND port now).
 - The changes also cause some complicated merge conflicts with 
https://github.com/davejiang/linux.git ntb.

See RE: [PATCH 10/22] NTB Intel: Add port-related NTB API callback methods

If we leave the ntb topo api as it was, then on multiport devices, if the local 
port is not the primary port, let it be one of potentially many secondary 
ports.  Or, if there is no distinction between primary/secondary on some 
hardware, let them all be primary.

This topo api doesn't have much value for existing drivers (that I know of), 
except for informative purposes.  So my preference is to change it only if 
necessary, and to minimize changes otherwise.

Allen



RE: [PATCH 02/22] NTB: Add peer indexed ports NTB API

2016-12-03 Thread Allen Hubbe
From: Serge Semin
> Add new port-index NTB API. Additionally lets get rid of Primary and
> Secondary topologies, since port-number can be effectively used instead.

Split into two patches please.

I see no harm to the TOPO changes, though I wonder if they are necessary.

> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 101 
> 
>  1 file changed, 79 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 5d1f260..0941a43 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -64,37 +64,21 @@ struct pci_dev;
>  /**
>   * enum ntb_topo - NTB connection topology
>   * @NTB_TOPO_NONE:   Topology is unknown or invalid.
> - * @NTB_TOPO_PRI:On primary side of local ntb.
> - * @NTB_TOPO_SEC:On secondary side of remote ntb.
> - * @NTB_TOPO_B2B_USD:On primary side of local ntb upstream of remote 
> ntb.
> - * @NTB_TOPO_B2B_DSD:On primary side of local ntb downstream of 
> remote ntb.
> + * @NTB_TOPO_P2P:Simple port-to-port NTB topology
> + * @NTB_TOPO_B2B:Bridge-to-bridge NTB topology
>   */
>  enum ntb_topo {
>   NTB_TOPO_NONE = -1,
> - NTB_TOPO_PRI,
> - NTB_TOPO_SEC,
> - NTB_TOPO_B2B_USD,
> - NTB_TOPO_B2B_DSD,
> + NTB_TOPO_P2P,
> + NTB_TOPO_B2B
>  };
> 
> -static inline int ntb_topo_is_b2b(enum ntb_topo topo)
> -{
> - switch ((int)topo) {
> - case NTB_TOPO_B2B_USD:
> - case NTB_TOPO_B2B_DSD:
> - return 1;
> - }
> - return 0;
> -}
> -
>  static inline char *ntb_topo_string(enum ntb_topo topo)
>  {
>   switch (topo) {
>   case NTB_TOPO_NONE: return "NTB_TOPO_NONE";
> - case NTB_TOPO_PRI:  return "NTB_TOPO_PRI";
> - case NTB_TOPO_SEC:  return "NTB_TOPO_SEC";
> - case NTB_TOPO_B2B_USD:  return "NTB_TOPO_B2B_USD";
> - case NTB_TOPO_B2B_DSD:  return "NTB_TOPO_B2B_DSD";
> + case NTB_TOPO_P2P:  return "NTB_TOPO_P2P";
> + case NTB_TOPO_B2B:  return "NTB_TOPO_B2B";
>   }
>   return "NTB_TOPO_INVALID";
>  }
> @@ -179,6 +163,10 @@ static inline int ntb_ctx_ops_is_valid(const struct 
> ntb_ctx_ops *ops)
> 
>  /**
>   * struct ntb_ctx_ops - ntb device operations
> + * @port_number: See ntb_port_number().
> + * @peer_port_count: See ntb_peer_port_count().
> + * @peer_port_number:See ntb_peer_port_number().
> + * @peer_port_idx:   See ntb_peer_port_idx().
>   * @link_is_up:  See ntb_link_is_up().
>   * @link_enable: See ntb_link_enable().
>   * @link_disable:See ntb_link_disable().
> @@ -212,6 +200,11 @@ static inline int ntb_ctx_ops_is_valid(const struct 
> ntb_ctx_ops *ops)
>   * @peer_spad_write: See ntb_peer_spad_write().
>   */
>  struct ntb_dev_ops {
> + int (*port_number)(struct ntb_dev *ntb);
> + int (*peer_port_count)(struct ntb_dev *ntb);
> + int (*peer_port_number)(struct ntb_dev *ntb, int pidx);
> + int (*peer_port_idx)(struct ntb_dev *ntb, int port);
> +
>   int (*link_is_up)(struct ntb_dev *ntb,
> enum ntb_speed *speed, enum ntb_width *width);
>   int (*link_enable)(struct ntb_dev *ntb,
> @@ -265,6 +258,10 @@ static inline int ntb_dev_ops_is_valid(const struct 
> ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + ops->port_number&&
> + ops->peer_port_count&&
> + ops->peer_port_number   &&
> + ops->peer_port_idx  &&
>   ops->link_is_up &&
>   ops->link_enable&&
>   ops->link_disable   &&
> @@ -319,6 +316,7 @@ struct ntb_client {
>   * @dev: Linux device object.
>   * @pdev:Pci device entry of the ntb.
>   * @topo:Detected topology of the ntb.
> + * @port:Local port of the ntb.
>   * @ops: See &ntb_dev_ops.
>   * @ctx: See &ntb_ctx_ops.
>   * @ctx_ops: See &ntb_ctx_ops.
> @@ -327,6 +325,7 @@ struct ntb_dev {
>   struct device   dev;
>   struct pci_dev  *pdev;
>   enum ntb_topo   topo;
> + int port;
>   const struct ntb_dev_ops*ops;
>   void*ctx;
>   const struct ntb_ctx_ops*ctx_ops;
> @@ -442,6 +441,64 @@ void ntb_link_event(struct ntb_dev *ntb);
>  void ntb_db_event(struct ntb_dev *ntb, int vector);
> 
>  /**
> + * ntb_port_number() - get the local port number
> + * @ntb: NTB device context.
> + *
> + * Hardware must support at least simple two-ports topology
> + *
> + * Return: the local port number
> + */
> +static inline int ntb_port_number(struct ntb_dev *ntb)
> +{
> + return ntb->ops->port_number(ntb);
> +}
> +
> +/**
> + * ntb_peer_port_count() - get the number o

RE: [PATCH 06/22] NTB: Slightly alter link state NTB API

2016-12-03 Thread Allen Hubbe
From: Serge Semin 
> Some minor changes of link state NTB API. Particularly link_is_up()
> method from now shall return a bitfield of link states for all accessible
> port indexes.

Looks good.  I plan to ack.

See comment on ntb_link_enable.
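
For reference, a minimal sketch of how I would expect a client to consume the bitfield.  This is an assumption about usage, not code from the series: demo_peer_link_is_up is a hypothetical helper, and bit N is assumed to describe the peer at port index N.  Note that a u64 return value cannot carry a negative error number, so the sketch treats the value purely as a bitfield.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the bitfield returned by the link state query,
 * report whether the link to the peer at index pidx is up.
 */
static int demo_peer_link_is_up(uint64_t link_bits, int pidx)
{
	/* Indexes outside the 64-bit field cannot be up. */
	if (pidx < 0 || pidx >= 64)
		return 0;

	return (int)((link_bits >> pidx) & 1);
}
```

On a two-port device only bit zero would ever be set, so existing clients that test the result for non-zero keep working.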

> 
> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 31 ---
>  1 file changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index fc9d034..a59a155 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -221,7 +221,7 @@ struct ntb_dev_ops {
>   int (*peer_port_number)(struct ntb_dev *ntb, int pidx);
>   int (*peer_port_idx)(struct ntb_dev *ntb, int port);
> 
> - int (*link_is_up)(struct ntb_dev *ntb,
> + u64 (*link_is_up)(struct ntb_dev *ntb,
> enum ntb_speed *speed, enum ntb_width *width);
>   int (*link_enable)(struct ntb_dev *ntb,
>  enum ntb_speed max_speed, enum ntb_width max_width);
> @@ -567,25 +567,26 @@ static inline int ntb_peer_port_idx(struct ntb_dev 
> *ntb, int port)
>   * state once after every link event.  It is safe to query the link state in
>   * the context of the link event callback.
>   *
> - * Return: One if the link is up, zero if the link is down, otherwise a
> - *   negative value indicating the error number.
> + * Return: bitfield of indexed ports link state: bit is set/cleared if the
> + * link is up/down respectively, otherwise a negative value 
> indicating
> + * an error number.
>   */
> -static inline int ntb_link_is_up(struct ntb_dev *ntb,
> +static inline u64 ntb_link_is_up(struct ntb_dev *ntb,
>enum ntb_speed *speed, enum ntb_width *width)
>  {
>   return ntb->ops->link_is_up(ntb, speed, width);
>  }
> 
>  /**
> - * ntb_link_enable() - enable the link on the secondary side of the ntb
> + * ntb_link_enable() - enable the link of the ntb

Maybe: enable the local port ntb connection.  The description that follows is 
good for B2B topology, but the description including "link training" might not 
be accurate for multi-port devices.

>   * @ntb: NTB device context.
>   * @max_speed:   The maximum link speed expressed as PCIe generation 
> number.
>   * @max_width:   The maximum link width expressed as the number of PCIe 
> lanes.
>   *
> - * Enable the link on the secondary side of the ntb.  This can only be done
> - * from the primary side of the ntb in primary or b2b topology.  The ntb 
> device
> - * should train the link to its maximum speed and width, or the requested 
> speed
> - * and width, whichever is smaller, if supported.
> + * Enable the NTB/PCIe link on the local or remote (for bridge-to-bridge
> + * topology) side of the bridge. The ntb device should train the link to its
> + * maximum speed and width, or the requested speed and width, whichever is
> + * smaller, if supported.
>   *
>   * Return: Zero on success, otherwise an error number.
>   */
> @@ -597,14 +598,14 @@ static inline int ntb_link_enable(struct ntb_dev *ntb,
>  }
> 
>  /**
> - * ntb_link_disable() - disable the link on the secondary side of the ntb
> + * ntb_link_disable() - disable the link of the ntb
>   * @ntb: NTB device context.
>   *
> - * Disable the link on the secondary side of the ntb.  This can only be
> - * done from the primary side of the ntb in primary or b2b topology.  The ntb
> - * device should disable the link.  Returning from this call must indicate 
> that
> - * a barrier has passed, though with no more writes may pass in either
> - * direction across the link, except if this call returns an error number.
> + * Disable the link on the local or remote (for b2b topology) of the ntb.
> + * The ntb device should disable the link.  Returning from this call must
> + * indicate that a barrier has passed, though with no more writes may pass in
> + * either direction across the link, except if this call returns an error
> + * number.
>   *
>   * Return: Zero on success, otherwise an error number.
>   */
> --
> 2.6.6




RE: [PATCH 03/22] NTB: Alter NTB API to support both inbound and outbound MW based interfaces

2016-12-03 Thread Allen Hubbe
From: Serge Semin 
> Alter NTB API to support inbound and outbound MW based interfaces.
> Additionally I made it supporting multi-port devices as well. Useful
> infographics is added right before MW API is declared. It shall help to
> better understand how the new API really works and how it can be utilized
> within client drivers.
> 

This looks good.  I plan to ack.

See comments below on documentation.

> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 290 
> 
>  1 file changed, 245 insertions(+), 45 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 0941a43..4a150b5 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -171,9 +171,13 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   * @link_enable: See ntb_link_enable().
>   * @link_disable:See ntb_link_disable().
>   * @mw_count:See ntb_mw_count().
> - * @mw_get_range:See ntb_mw_get_range().
> + * @mw_get_align:See ntb_mw_get_align().
>   * @mw_set_trans:See ntb_mw_set_trans().
>   * @mw_clear_trans:  See ntb_mw_clear_trans().
> + * @peer_mw_count:   See ntb_peer_mw_count().
> + * @peer_mw_get_addr:See ntb_peer_mw_get_addr().
> + * @peer_mw_set_trans:   See ntb_peer_mw_set_trans().
> + * @peer_mw_clear_trans:See ntb_peer_mw_clear_trans().
>   * @db_is_unsafe:See ntb_db_is_unsafe().
>   * @db_valid_mask:   See ntb_db_valid_mask().
>   * @db_vector_count: See ntb_db_vector_count().
> @@ -211,13 +215,20 @@ struct ntb_dev_ops {
>  enum ntb_speed max_speed, enum ntb_width max_width);
>   int (*link_disable)(struct ntb_dev *ntb);
> 
> - int (*mw_count)(struct ntb_dev *ntb);
> - int (*mw_get_range)(struct ntb_dev *ntb, int idx,
> - phys_addr_t *base, resource_size_t *size,
> - resource_size_t *align, resource_size_t *align_size);
> - int (*mw_set_trans)(struct ntb_dev *ntb, int idx,
> + int (*mw_count)(struct ntb_dev *ntb, int pidx);
> + int (*mw_get_align)(struct ntb_dev *ntb, int pidx, int widx,
> + resource_size_t *addr_align,
> + resource_size_t *size_align,
> + resource_size_t *size_max);
> + int (*mw_set_trans)(struct ntb_dev *ntb, int pidx, int widx,
>   dma_addr_t addr, resource_size_t size);
> - int (*mw_clear_trans)(struct ntb_dev *ntb, int idx);
> + int (*mw_clear_trans)(struct ntb_dev *ntb, int pidx, int widx);
> + int (*peer_mw_count)(struct ntb_dev *ntb);
> + int (*peer_mw_get_addr)(struct ntb_dev *ntb, int widx,
> + phys_addr_t *base, resource_size_t *size);
> + int (*peer_mw_set_trans)(struct ntb_dev *ntb, int pidx, int widx,
> +  u64 addr, resource_size_t size);
> + int (*peer_mw_clear_trans)(struct ntb_dev *ntb, int pidx, int widx);
> 
>   int (*db_is_unsafe)(struct ntb_dev *ntb);
>   u64 (*db_valid_mask)(struct ntb_dev *ntb);
> @@ -266,9 +277,13 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   ops->link_enable&&
>   ops->link_disable   &&
>   ops->mw_count   &&
> - ops->mw_get_range   &&
> - ops->mw_set_trans   &&
> + ops->mw_get_align   &&
> + (ops->mw_set_trans  ||
> +  ops->peer_mw_set_trans)&&
>   /* ops->mw_clear_trans  && */
> + ops->peer_mw_count  &&
> + ops->peer_mw_get_addr   &&
> + /* ops->peer_mw_clear_trans && */
> 
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> @@ -555,79 +570,264 @@ static inline int ntb_link_disable(struct ntb_dev *ntb)
>  }
> 
>  /**
> - * ntb_mw_count() - get the number of memory windows
> + *   NTB Memory Windows description

The two variants could be more succinctly described as "inbound translation
configured on the local ntb port" and "outbound translation configured by the
peer, on the peer ntb port" for a locally allocated dma-mapped range of memory.

Please avoid confusing these concepts in the documentation:
- "Memory" on the system vs. "Memory Window" on the ntb
- "Physical" address vs. "dma-mapped" address of memory
- "Base Address" vs. "Translation Address"
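
To make the base address vs. translation address distinction concrete,
here is a minimal sketch of an inbound window.  Everything in it is
hypothetical (struct mw_sketch and both functions are invented for
illustration and are not part of ntb.h): the base address is where the
peer sees the window, the translation address is the dma-mapped address
of local memory that the window redirects to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one inbound memory window; not an ntb.h type. */
struct mw_sketch {
	uint64_t base_addr;	/* where the peer sees the window (MMIO) */
	uint64_t size;		/* window size in bytes */
	uint64_t xlat_addr;	/* dma-mapped address of local memory */
	int xlat_valid;		/* nonzero once a translation is set */
};

/* The local side programs the translation address of its own window. */
static int mw_sketch_set_xlat(struct mw_sketch *mw, uint64_t dma_addr,
			      uint64_t len)
{
	assert(mw);

	if (len > mw->size)
		return -1;

	mw->xlat_addr = dma_addr;
	mw->xlat_valid = 1;
	return 0;
}

/* A peer access at base_addr + off lands at xlat_addr + off. */
static int mw_sketch_translate(const struct mw_sketch *mw,
			       uint64_t peer_addr, uint64_t *local_addr)
{
	assert(mw && local_addr);

	if (!mw->xlat_valid)
		return -1;
	if (peer_addr < mw->base_addr ||
	    peer_addr - mw->base_addr >= mw->size)
		return -1;

	*local_addr = mw->xlat_addr + (peer_addr - mw->base_addr);
	return 0;
}
```

Note that neither address here is a "physical" address of system memory in
the CPU's sense: one is a bus address in the peer's MMIO space, the other
is whatever the dma mapping returned.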

Inbound translation:

  Memory:       Local NTB Port:           Peer NTB Port:      Peer MMIO:

  | dma-mapped |-ntb_set_xlat_addr(addr)  |
  | memory     |        v                 |
  | (addr)     |<==| MW xlat addr |<======| MW base addr |<== memory-m

RE: [PATCH 08/22] NTB: Add T-Platforms copyrights to NTB API

2016-12-03 Thread Allen Hubbe
> From: Serge Semin
> 
> Signed-off-by: Serge Semin 

This patch has no comment, but instead...

This can be squashed with your first commit of significant changes to each file.

> 
> ---
>  drivers/ntb/ntb.c   | 2 ++
>  include/linux/ntb.h | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
> index 4b2cc60..06574f8 100644
> --- a/drivers/ntb/ntb.c
> +++ b/drivers/ntb/ntb.c
> @@ -5,6 +5,7 @@
>   *   GPL LICENSE SUMMARY
>   *
>   *   Copyright (C) 2015 EMC Corporation. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   This program is free software; you can redistribute it and/or modify
>   *   it under the terms of version 2 of the GNU General Public License as
> @@ -18,6 +19,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright (C) 2015 EMC Corporation. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
>   *   modification, are permitted provided that the following conditions
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 8b19327..9edd9dc 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -5,6 +5,7 @@
>   *   GPL LICENSE SUMMARY
>   *
>   *   Copyright (C) 2015 EMC Corporation. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   This program is free software; you can redistribute it and/or modify
>   *   it under the terms of version 2 of the GNU General Public License as
> @@ -18,6 +19,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright (C) 2015 EMC Corporation. All Rights Reserved.
> + *   Copyright (C) 2016 T-Platforms. All Rights Reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
>   *   modification, are permitted provided that the following conditions
> --
> 2.6.6




RE: [PATCH 04/22] NTB: Add messaging NTB API

2016-12-03 Thread Allen Hubbe
From: Serge Semin 
> IDT PCIe-switches have message registers to communicate with peer devices.
> This patch adds new NTB API callback methods, which can be used to utilize
> these registers functionality.
> 

Please split: add msg api; make spads optional.

See comments below on ntb_dev_ops_is_valid.
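
A rough sketch of what "make spads optional" could look like in a
validity check: the device is accepted if it provides either a complete
scratchpad interface or a complete message interface.  The struct, the
stub callbacks, and the choice of which message callbacks count as
mandatory are all assumptions made for illustration; they are not copied
from the patch.

```c
#include <assert.h>

/* Hypothetical, trimmed-down ops table; not struct ntb_dev_ops. */
struct ops_sketch {
	int (*spad_count)(void);
	int (*spad_read)(int idx, unsigned int *val);
	int (*spad_write)(int idx, unsigned int val);
	int (*peer_spad_write)(int idx, unsigned int val);

	int (*msg_count)(void);
	int (*msg_read)(int midx, unsigned int *msg);
	int (*msg_write)(int midx, unsigned int msg);
};

/* Stub callbacks, present only so the sketch can be exercised. */
static int stub_count(void) { return 1; }
static int stub_read(int idx, unsigned int *val)
{
	(void)idx;
	*val = 0;
	return 0;
}
static int stub_write(int idx, unsigned int val)
{
	(void)idx;
	(void)val;
	return 0;
}

/* Valid if at least one of the two interfaces is fully implemented. */
static int ops_sketch_is_valid(const struct ops_sketch *ops)
{
	int have_spad, have_msg;

	assert(ops);

	have_spad = ops->spad_count && ops->spad_read &&
		    ops->spad_write && ops->peer_spad_write;
	have_msg = ops->msg_count && ops->msg_read && ops->msg_write;

	return have_spad || have_msg;
}
```

Keeping the two groups as separate named conditions makes it easier to
review than one long expression with nested parentheses and commented-out
members.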

> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/ntb.c   |  13 +++
>  include/linux/ntb.h | 236 ++--
>  2 files changed, 241 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
> index 2e25307..4b2cc60 100644
> --- a/drivers/ntb/ntb.c
> +++ b/drivers/ntb/ntb.c
> @@ -191,6 +191,19 @@ void ntb_db_event(struct ntb_dev *ntb, int vector)
>  }
>  EXPORT_SYMBOL(ntb_db_event);
> 
> +void ntb_msg_event(struct ntb_dev *ntb)
> +{
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&ntb->ctx_lock, irqflags);
> + {
> + if (ntb->ctx_ops && ntb->ctx_ops->msg_event)
> + ntb->ctx_ops->msg_event(ntb->ctx);
> + }
> + spin_unlock_irqrestore(&ntb->ctx_lock, irqflags);
> +}
> +EXPORT_SYMBOL(ntb_msg_event);
> +
>  static int ntb_probe(struct device *dev)
>  {
>   struct ntb_dev *ntb;
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 4a150b5..59de1f6 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -146,10 +146,12 @@ static inline int ntb_client_ops_is_valid(const struct ntb_client_ops *ops)
>   * struct ntb_ctx_ops - ntb driver context operations
>   * @link_event:  See ntb_link_event().
>   * @db_event:See ntb_db_event().
> + * @msg_event:   See ntb_msg_event().
>   */
>  struct ntb_ctx_ops {
>   void (*link_event)(void *ctx);
>   void (*db_event)(void *ctx, int db_vector);
> + void (*msg_event)(void *ctx);
>  };
> 
>  static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
> @@ -158,6 +160,7 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   return
>   /* ops->link_event  && */
>   /* ops->db_event&& */
> + /* ops->msg_event   && */
>   1;
>  }
> 
> @@ -202,6 +205,15 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   * @peer_spad_addr:  See ntb_peer_spad_addr().
>   * @peer_spad_read:  See ntb_peer_spad_read().
>   * @peer_spad_write: See ntb_peer_spad_write().
> + * @msg_count:   See ntb_msg_count().
> + * @msg_inbits:  See ntb_msg_inbits().
> + * @msg_outbits: See ntb_msg_outbits().
> + * @msg_read_sts:See ntb_msg_read_sts().
> + * @msg_clear_sts:   See ntb_msg_clear_sts().
> + * @msg_set_mask:See ntb_msg_set_mask().
> + * @msg_clear_mask:  See ntb_msg_clear_mask().
> + * @msg_read:See ntb_msg_read().
> + * @msg_write:   See ntb_msg_write().
>   */
>  struct ntb_dev_ops {
>   int (*port_number)(struct ntb_dev *ntb);
> @@ -263,6 +275,16 @@ struct ntb_dev_ops {
> phys_addr_t *spad_addr);
>   u32 (*peer_spad_read)(struct ntb_dev *ntb, int idx);
>   int (*peer_spad_write)(struct ntb_dev *ntb, int idx, u32 val);
> +
> + int (*msg_count)(struct ntb_dev *ntb);
> + u64 (*msg_inbits)(struct ntb_dev *ntb);
> + u64 (*msg_outbits)(struct ntb_dev *ntb);
> + u64 (*msg_read_sts)(struct ntb_dev *ntb);
> + int (*msg_clear_sts)(struct ntb_dev *ntb, u64 sts_bits);
> + int (*msg_set_mask)(struct ntb_dev *ntb, u64 mask_bits);
> + int (*msg_clear_mask)(struct ntb_dev *ntb, u64 mask_bits);
> + int (*msg_read)(struct ntb_dev *ntb, int midx, int *pidx, u32 *msg);
> + int (*msg_write)(struct ntb_dev *ntb, int midx, int pidx, u32 msg);
>  };
> 
>  static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
> @@ -304,13 +326,22 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* ops->peer_db_read_mask   && */
>   /* ops->peer_db_set_mask&& */
>   /* ops->peer_db_clear_mask  && */
> - /* ops->spad_is_unsafe  && */
> - ops->spad_count &&
> - ops->spad_read  &&
> - ops->spad_write &&
> - /* ops->peer_spad_addr  && */
> - /* ops->peer_spad_read  && */
> - ops->peer_spad_write&&
> + ((/* ops->spad_is_unsafe&& */
> +   ops->spad_count   &&
> +   ops->spad_read&&
> +   ops->spad_write   &&
> +   /* ops->peer_spad_addr&& */
> +   /* ops->peer_spad_read&& */
> +   ops->peer_spad_write) ||

RE: [PATCH 07/22] NTB: Fix a few ntb.h issues

2016-12-03 Thread Allen Hubbe
From: Serge Semin 
> Fix some minor issues found in ntb.h file.
> 

"Fix a few issues" is not a descriptive commit title or message.

Please split: add NTB_SPEED_GEN4, ntb.h comments.

Changes look good and I will ack.

> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index a59a155..8b19327 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -90,6 +90,7 @@ static inline char *ntb_topo_string(enum ntb_topo topo)
>   * @NTB_SPEED_GEN1:  Link is trained to gen1 speed.
>   * @NTB_SPEED_GEN2:  Link is trained to gen2 speed.
>   * @NTB_SPEED_GEN3:  Link is trained to gen3 speed.
> + * @NTB_SPEED_GEN4:  Link is trained to gen4 speed.
>   */
>  enum ntb_speed {
>   NTB_SPEED_AUTO = -1,
> @@ -97,6 +98,7 @@ enum ntb_speed {
>   NTB_SPEED_GEN1 = 1,
>   NTB_SPEED_GEN2 = 2,
>   NTB_SPEED_GEN3 = 3,
> + NTB_SPEED_GEN4 = 4
>  };
> 
>  /**
> @@ -292,13 +294,18 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + /* Port operations are required */
>   ops->port_number&&
>   ops->peer_port_count&&
>   ops->peer_port_number   &&
>   ops->peer_port_idx  &&
> +
> + /* Link operations are requiered */
>   ops->link_is_up &&
>   ops->link_enable&&
>   ops->link_disable   &&
> +
> + /* One or both MW interfaces should be developed */
>   ops->mw_count   &&
>   ops->mw_get_align   &&
>   (ops->mw_set_trans  ||
> @@ -308,12 +315,11 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   ops->peer_mw_get_addr   &&
>   /* ops->peer_mw_clear_trans && */
> 
> + /* Doorbell operations are mostly required */
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> -
>   /* both set, or both unset */
>   (!ops->db_vector_count == !ops->db_vector_mask) &&
> -
>   ops->db_read&&
>   /* ops->db_set  && */
>   ops->db_clear   &&
> @@ -327,6 +333,8 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>   /* ops->peer_db_read_mask   && */
>   /* ops->peer_db_set_mask&& */
>   /* ops->peer_db_clear_mask  && */
> +
> + /* Scrachpad or messaging interfaces should be developed */
>   ((/* ops->spad_is_unsafe&& */
> ops->spad_count   &&
> ops->spad_read&&
> @@ -355,13 +363,12 @@ struct ntb_client {
>   struct device_driverdrv;
>   const struct ntb_client_ops ops;
>  };
> -
>  #define drv_ntb_client(__drv) container_of((__drv), struct ntb_client, drv)
> 
>  /**
>   * struct ntb_device - ntb device
>   * @dev: Linux device object.
> - * @pdev:Pci device entry of the ntb.
> + * @pdev:PCI device entry of the ntb.
>   * @topo:Detected topology of the ntb.
>   * @port:Local port of the ntb.
>   * @ops: See &ntb_dev_ops.
> @@ -384,7 +391,6 @@ struct ntb_dev {
>   /* block unregister until device is fully released */
>   struct completion   released;
>  };
> -
>  #define dev_ntb(__dev) container_of((__dev), struct ntb_dev, dev)
> 
>  /**
> @@ -481,7 +487,7 @@ void ntb_link_event(struct ntb_dev *ntb);
>   * multiple interrupt vectors for doorbells, the vector number indicates which
>   * vector received the interrupt.  The vector number is relative to the first
>   * vector used for doorbells, starting at zero, and must be less than
> - ** ntb_db_vector_count().  The driver may call ntb_db_read() to check which
> + * ntb_db_vector_count().  The driver may call ntb_db_read() to check which
>   * doorbell bits need service, and ntb_db_vector_mask() to determine which of
>   * those bits are associated with the vector number.
>   */
> --
> 2.6.6




RE: [PATCH 01/22] NTB: Move link state API being first in sources

2016-12-03 Thread Allen Hubbe
From: Serge Semin
> Since link operations are usually performed before memory window access
> operations, it's logically better to declared link-related API before any
> other methods. Additionally it's good practice for readability to declare
> NTB device callback methods of hadrware drivers with the same order as it's
> done within ntb.h.

No harm.  Please spellcheck the description.

I think this and patch 7 can be squashed.

I plan to ack.

> Signed-off-by: Serge Semin 
>
> ---
>  include/linux/ntb.h | 137 ++--
>  1 file changed, 69 insertions(+), 68 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 6f47562..5d1f260 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -179,13 +179,13 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
> 
>  /**
>   * struct ntb_ctx_ops - ntb device operations
> + * @link_is_up:  See ntb_link_is_up().
> + * @link_enable: See ntb_link_enable().
> + * @link_disable:See ntb_link_disable().
>   * @mw_count:See ntb_mw_count().
>   * @mw_get_range:See ntb_mw_get_range().
>   * @mw_set_trans:See ntb_mw_set_trans().
>   * @mw_clear_trans:  See ntb_mw_clear_trans().
> - * @link_is_up:  See ntb_link_is_up().
> - * @link_enable: See ntb_link_enable().
> - * @link_disable:See ntb_link_disable().
>   * @db_is_unsafe:See ntb_db_is_unsafe().
>   * @db_valid_mask:   See ntb_db_valid_mask().
>   * @db_vector_count: See ntb_db_vector_count().
> @@ -212,6 +212,12 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
>   * @peer_spad_write: See ntb_peer_spad_write().
>   */
>  struct ntb_dev_ops {
> + int (*link_is_up)(struct ntb_dev *ntb,
> +   enum ntb_speed *speed, enum ntb_width *width);
> + int (*link_enable)(struct ntb_dev *ntb,
> +enum ntb_speed max_speed, enum ntb_width max_width);
> + int (*link_disable)(struct ntb_dev *ntb);
> +
>   int (*mw_count)(struct ntb_dev *ntb);
>   int (*mw_get_range)(struct ntb_dev *ntb, int idx,
>   phys_addr_t *base, resource_size_t *size,
> @@ -220,12 +226,6 @@ struct ntb_dev_ops {
>   dma_addr_t addr, resource_size_t size);
>   int (*mw_clear_trans)(struct ntb_dev *ntb, int idx);
> 
> - int (*link_is_up)(struct ntb_dev *ntb,
> -   enum ntb_speed *speed, enum ntb_width *width);
> - int (*link_enable)(struct ntb_dev *ntb,
> -enum ntb_speed max_speed, enum ntb_width max_width);
> - int (*link_disable)(struct ntb_dev *ntb);
> -
>   int (*db_is_unsafe)(struct ntb_dev *ntb);
>   u64 (*db_valid_mask)(struct ntb_dev *ntb);
>   int (*db_vector_count)(struct ntb_dev *ntb);
> @@ -265,13 +265,14 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
>  {
>   /* commented callbacks are not required: */
>   return
> + ops->link_is_up &&
> + ops->link_enable&&
> + ops->link_disable   &&
>   ops->mw_count   &&
>   ops->mw_get_range   &&
>   ops->mw_set_trans   &&
>   /* ops->mw_clear_trans  && */
> - ops->link_is_up &&
> - ops->link_enable&&
> - ops->link_disable   &&
> +
>   /* ops->db_is_unsafe&& */
>   ops->db_valid_mask  &&
> 
> @@ -441,6 +442,62 @@ void ntb_link_event(struct ntb_dev *ntb);
>  void ntb_db_event(struct ntb_dev *ntb, int vector);
> 
>  /**
> + * ntb_link_is_up() - get the current ntb link state
> + * @ntb: NTB device context.
> + * @speed:   OUT - The link speed expressed as PCIe generation number.
> + * @width:   OUT - The link width expressed as the number of PCIe lanes.
> + *
> + * Get the current state of the ntb link.  It is recommended to query the link
> + * state once after every link event.  It is safe to query the link state in
> + * the context of the link event callback.
> + *
> + * Return: One if the link is up, zero if the link is down, otherwise a
> + *   negative value indicating the error number.
> + */
> +static inline int ntb_link_is_up(struct ntb_dev *ntb,
> +  enum ntb_speed *speed, enum ntb_width *width)
> +{
> + return ntb->ops->link_is_up(ntb, speed, width);
> +}
> +
> +/**
> + * ntb_link_enable() - enable the link on the secondary side of the ntb
> + * @ntb: NTB device context.
> + * @max_speed:   The maximum link speed expressed as PCIe generation number.
> + * @max_width:   The maximum link width expressed as the number of PCIe lanes.
> + *
> +

RE: [PATCH 05/22] NTB: Alter Scratchpads NTB API to support multi-ports interface

2016-12-03 Thread Allen Hubbe
From: Serge Semin 
> Even though there is no any real NTB hardware, which would have both more
> than two ports and Scratchpad registers, it is logically correct to have
> Scratchpad API accepting a peer port index as well. Intel/AMD drivers used
> to utilize Primary and Secondary topology to split Scratchpad between
> connected root devices. Since port-index API replaced Primary and Secondary
> topology, Intel/AMD NTB hardware drivers can use device port to determine
> which Scratchpad registers actually belong to local and peer devices.
> The same approach can be used if some potential hardware in future will be
> multi-port and have some set of Scratchpads.
> 

I agree with addition of the peer port index, but I don't think s/idx/sidx/ for 
scratchpad index is necessary.
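
To illustrate the shape I mean, here is a small sketch that adds the peer
port index while keeping the plain "idx" name for the scratchpad index.
The struct, the sizes, and the function names are hypothetical and exist
only for this example; the all-ones return on a bad read mirrors the
~(u32)0 convention used by the existing ntb_spad_read() wrapper.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PEERS 2	/* assumed number of peer ports */
#define SKETCH_SPADS 4	/* assumed scratchpads per peer */

/* Hypothetical model of per-peer scratchpads; not an ntb.h type. */
struct spad_sketch {
	uint32_t peer_spad[SKETCH_PEERS][SKETCH_SPADS];
};

static int spad_sketch_in_range(int pidx, int idx)
{
	return pidx >= 0 && pidx < SKETCH_PEERS &&
	       idx >= 0 && idx < SKETCH_SPADS;
}

/* Write scratchpad idx of the peer at port index pidx. */
static int spad_sketch_peer_write(struct spad_sketch *s, int pidx, int idx,
				  uint32_t val)
{
	assert(s);

	if (!spad_sketch_in_range(pidx, idx))
		return -1;

	s->peer_spad[pidx][idx] = val;
	return 0;
}

/* Read scratchpad idx of the peer at port index pidx. */
static uint32_t spad_sketch_peer_read(const struct spad_sketch *s, int pidx,
				      int idx)
{
	assert(s);

	if (!spad_sketch_in_range(pidx, idx))
		return ~(uint32_t)0;

	return s->peer_spad[pidx][idx];
}
```

With pidx added, "idx" is still unambiguous at every call site, so the
rename only adds churn to the diff.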

> Signed-off-by: Serge Semin 
> 
> ---
>  include/linux/ntb.h | 46 ++
>  1 file changed, 26 insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/ntb.h b/include/linux/ntb.h
> index 59de1f6..fc9d034 100644
> --- a/include/linux/ntb.h
> +++ b/include/linux/ntb.h
> @@ -268,13 +268,14 @@ struct ntb_dev_ops {
>   int (*spad_is_unsafe)(struct ntb_dev *ntb);
>   int (*spad_count)(struct ntb_dev *ntb);
> 
> - u32 (*spad_read)(struct ntb_dev *ntb, int idx);
> - int (*spad_write)(struct ntb_dev *ntb, int idx, u32 val);
> + u32 (*spad_read)(struct ntb_dev *ntb, int sidx);
> + int (*spad_write)(struct ntb_dev *ntb, int sidx, u32 val);
> 
> - int (*peer_spad_addr)(struct ntb_dev *ntb, int idx,
> + int (*peer_spad_addr)(struct ntb_dev *ntb, int pidx, int sidx,
> phys_addr_t *spad_addr);
> - u32 (*peer_spad_read)(struct ntb_dev *ntb, int idx);
> - int (*peer_spad_write)(struct ntb_dev *ntb, int idx, u32 val);
> + u32 (*peer_spad_read)(struct ntb_dev *ntb, int pidx, int sidx);
> + int (*peer_spad_write)(struct ntb_dev *ntb, int pidx, int sidx,
> +u32 val);
> 
>   int (*msg_count)(struct ntb_dev *ntb);
>   u64 (*msg_inbits)(struct ntb_dev *ntb);
> @@ -1201,6 +1202,7 @@ static inline int ntb_spad_is_unsafe(struct ntb_dev *ntb)
>   * @ntb: NTB device context.
>   *
>   * Hardware and topology may support a different number of scratchpads.
> + * Although it must be the same for all ports per NTB device.
>   *
>   * Return: the number of scratchpads.
>   */
> @@ -1215,42 +1217,43 @@ static inline int ntb_spad_count(struct ntb_dev *ntb)
>  /**
>   * ntb_spad_read() - read the local scratchpad register
>   * @ntb: NTB device context.
> - * @idx: Scratchpad index.
> + * @sidx:Scratchpad index.
>   *
>   * Read the local scratchpad register, and return the value.
>   *
>   * Return: The value of the local scratchpad register.
>   */
> -static inline u32 ntb_spad_read(struct ntb_dev *ntb, int idx)
> +static inline u32 ntb_spad_read(struct ntb_dev *ntb, int sidx)
>  {
>   if (!ntb->ops->spad_read)
>   return ~(u32)0;
> 
> - return ntb->ops->spad_read(ntb, idx);
> + return ntb->ops->spad_read(ntb, sidx);
>  }
> 
>  /**
>   * ntb_spad_write() - write the local scratchpad register
>   * @ntb: NTB device context.
> - * @idx: Scratchpad index.
> + * @sidx:Scratchpad index.
>   * @val: Scratchpad value.
>   *
>   * Write the value to the local scratchpad register.
>   *
>   * Return: Zero on success, otherwise an error number.
>   */
> -static inline int ntb_spad_write(struct ntb_dev *ntb, int idx, u32 val)
> +static inline int ntb_spad_write(struct ntb_dev *ntb, int sidx, u32 val)
>  {
>   if (!ntb->ops->spad_write)
>   return -EINVAL;
> 
> - return ntb->ops->spad_write(ntb, idx, val);
> + return ntb->ops->spad_write(ntb, sidx, val);
>  }
> 
>  /**
>   * ntb_peer_spad_addr() - address of the peer scratchpad register
>   * @ntb: NTB device context.
> - * @idx: Scratchpad index.
> + * @pidx:Port index of peer device.
> + * @sidx:Scratchpad index.
>   * @spad_addr:   OUT - The address of the peer scratchpad register.
>   *
>   * Return the address of the peer doorbell register.  This may be used, for
> @@ -1258,48 +1261,51 @@ static inline int ntb_spad_write(struct ntb_dev *ntb, int idx, u32 val)
>   *
>   * Return: Zero on success, otherwise an error number.
>   */
> -static inline int ntb_peer_spad_addr(struct ntb_dev *ntb, int idx,
> +static inline int ntb_peer_spad_addr(struct ntb_dev *ntb, int pidx, int sidx,
>phys_addr_t *spad_addr)
>  {
>   if (!ntb->ops->peer_spad_addr)
>   return -EINVAL;
> 
> - return ntb->ops->peer_spad_addr(ntb, idx, spad_addr);
> + return ntb->ops->peer_spad_addr(ntb, pidx, sidx, spad_addr);
>  }
> 
>  /**
>   * ntb_peer_spad_read() - read the peer scratchpad register
>   * @ntb: NTB device context.
> - * @idx: Scratchpad index.
> + * @pidx:Port index of peer device.
> + * @sidx:Scratchpad index.

RE: [PATCH v2 1/3] ntb: Add asynchronous devices support to NTB-bus interface

2016-08-19 Thread Allen Hubbe
From: Serge Semin
> 3) IDT driver redevelopment will take a lot of time, since I don't have much free time to
> do it. It may be half of year or even more.
> 
> From my side, such an improvement will significantly complicate the NTB Kernel API. Since
> you are the subsystem maintainer it's your decision which design to choose, but I don't think
> I'll do the IDT driver suitable for this design anytime soon.

I'm sorry to have made you feel that way.

> > I hope we got it settled now. If not, We can have a Skype conversation, since writing
> such a long letters takes lot of time.

Come join irc.oftc.net #ntb



RE: [PATCH v2 1/3] ntb: Add asynchronous devices support to NTB-bus interface

2016-08-08 Thread Allen Hubbe
> lookup table is accessed. Alas It is accessed only by a pair of "Entry index/Data"
> registers. So a root complex must write an entry index to one registers, then read/write
> data from another. As you might realise, that weak point leads to a race condition of
> multiple root complexes accessing the lookup table of one shared peer. Alas I could not
> come up with a simple and strong solution of the race.

Right, multiple peers reaching across to some other peer's NTB configuration 
space is problematic.  I don't mean to suggest we should reach across to 
configure the lookup table (or anything else) on a remote NTB.
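
For anyone following along, the index/data race described above can be
shown with a deterministic sketch.  The register model and names below
are hypothetical and are not taken from any IDT documentation; the point
is only that "select entry, then write data" is two separate steps, so
two root complexes can interleave between them.

```c
#include <assert.h>
#include <stdint.h>

#define LUT_SKETCH_ENTRIES 4

/* Hypothetical lookup table reachable only through index + data regs. */
struct lut_sketch {
	uint32_t index_reg;			/* shared "entry index" */
	uint32_t table[LUT_SKETCH_ENTRIES];	/* entries behind "data" */
};

/* Step 1 of an access: select which entry the data register refers to. */
static void lut_sketch_write_index(struct lut_sketch *l, uint32_t idx)
{
	assert(l && idx < LUT_SKETCH_ENTRIES);
	l->index_reg = idx;
}

/* Step 2 of an access: write the currently selected entry. */
static void lut_sketch_write_data(struct lut_sketch *l, uint32_t val)
{
	assert(l);
	l->table[l->index_reg] = val;
}

/*
 * Root complex A wants table[0] = 0xA, root complex B wants
 * table[1] = 0xB.  B's index write lands between A's two steps.
 */
static void lut_sketch_race(struct lut_sketch *l)
{
	lut_sketch_write_index(l, 0);	/* A selects entry 0 */
	lut_sketch_write_index(l, 1);	/* B selects entry 1 */
	lut_sketch_write_data(l, 0xA);	/* A's data lands in entry 1 */
	lut_sketch_write_data(l, 0xB);	/* B overwrites entry 1 */
}
```

After this interleaving entry 0 is never written and A's value is lost,
which is why only the local side should program its own table.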

> That's why I've introduced the asynchronous hardware in the NTB bus kernel API. Since
> local root complex can't directly write a translated base address to a peer, it must wait
> until a peer asks him to allocate a memory and send the address back using some of a
> hardware mechanism. It can be anything: Scratchpad registers, Message registers or even
> "crazy" doorbells bingbanging. For instance, the IDT switches of the first group support:
> 1) Shared Memory windows. In particular local root complex can set a translated base
> address to BARs of local and peer NT-function using the cross-coupled PCIe/NTB
> configuration space, the same way as it can be done for AMD/Intel NTBs.
> 2) One Doorbell register.
> 3) Two Scratchpads.
> 4) Four message registers.
> As you can see the switches of the first group can be considered as both synchronous and
> asynchronous. All the NTB bus kernel API can be implemented for it including the changes
> introduced by this patch (I would do it if I had a corresponding hardware). AMD and Intel
> NTBs can be considered both synchronous and asynchronous as well, although they don't
> support messaging so Scratchpads can be used to send a data to a peer. Finally the
> switches of the second group lack of ability to initialize BARs translated base address of
> peers due to the race condition I described before.
> 
> To sum up I've spent a lot of time designing the IDT NTB driver. I've done my best to make
> the IDT driver as much compatible with current design as possible, nevertheless the NTB
> bus kernel API had to be slightly changed. You can find answers to the commentaries down
> below.
> 
> On Fri, Aug 05, 2016 at 11:31:58AM -0400, Allen Hubbe  wrote:
> > From: Serge Semin
> > > Currently supported AMD and Intel Non-transparent PCIe-bridges are synchronous
> > > devices, so translated base address of memory windows can be directly written
> > > to peer registers. But there are some IDT PCIe-switches which implement
> > > complex interfaces using Lookup Tables of translation addresses. Due to
> > > the way the table is accessed, it can not be done synchronously from different
> > > RCs, that's why the asynchronous interface should be developed.
> > >
> > > For these purpose the Memory Window related interface is correspondingly split
> > > as it is for Doorbell and Scratchpad registers. The definition of Memory Window
> > > is following: "It is a virtual memory region, which locally reflects a physical
> > > memory of peer device." So to speak the "ntb_peer_mw_"-prefixed methods control
> > > the peers memory windows, "ntb_mw_"-prefixed functions work with the local
> > > memory windows.
> > > Here is the description of the Memory Window related NTB-bus callback
> > > functions:
> > >  - ntb_mw_count() - number of local memory windows.
> > >  - ntb_mw_get_maprsc() - get the physical address and size of the local memory
> > >  window to map.
> > >  - ntb_mw_set_trans() - set translation address of local memory window (this
> > > address should be somehow retrieved from a peer).
> > >  - ntb_mw_get_trans() - get translation address of local memory window.
> > >  - ntb_mw_get_align() - get alignment of translated base address and size of
> > > local memory window. Additionally one can get the
> > > upper size limit of the memory window.
> > >  - ntb_peer_mw_count() - number of peer memory windows (it can differ from the
> > >  local number).
> > >  - ntb_peer_mw_set_trans() - set translation address of peer memory window
>

RE: [PATCH] NTB: ntb_hw_intel: Fix typo in module parameter descriptions

2016-08-08 Thread Allen Hubbe
From: Wei Yongjun
> Fix typo in module parameter descriptions.
> 
> Signed-off-by: Wei Yongjun 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 0d5c29a..1ee61d9 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -112,17 +112,17 @@ MODULE_PARM_DESC(xeon_b2b_usd_bar2_addr64,
> 
>  module_param_named(xeon_b2b_usd_bar4_addr64,
>  xeon_b2b_usd_addr.bar4_addr64, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_usd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_usd_bar4_addr64,
>"XEON B2B USD BAR 4 64-bit address");
> 
>  module_param_named(xeon_b2b_usd_bar4_addr32,
>  xeon_b2b_usd_addr.bar4_addr32, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_usd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_usd_bar4_addr32,
>"XEON B2B USD split-BAR 4 32-bit address");
> 
>  module_param_named(xeon_b2b_usd_bar5_addr32,
>  xeon_b2b_usd_addr.bar5_addr32, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_usd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_usd_bar5_addr32,
>"XEON B2B USD split-BAR 5 32-bit address");
> 
>  module_param_named(xeon_b2b_dsd_bar2_addr64,
> @@ -132,17 +132,17 @@ MODULE_PARM_DESC(xeon_b2b_dsd_bar2_addr64,
> 
>  module_param_named(xeon_b2b_dsd_bar4_addr64,
>  xeon_b2b_dsd_addr.bar4_addr64, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_dsd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_dsd_bar4_addr64,
>"XEON B2B DSD BAR 4 64-bit address");
> 
>  module_param_named(xeon_b2b_dsd_bar4_addr32,
>  xeon_b2b_dsd_addr.bar4_addr32, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_dsd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_dsd_bar4_addr32,
>"XEON B2B DSD split-BAR 4 32-bit address");
> 
>  module_param_named(xeon_b2b_dsd_bar5_addr32,
>  xeon_b2b_dsd_addr.bar5_addr32, ullong, 0644);
> -MODULE_PARM_DESC(xeon_b2b_dsd_bar2_addr64,
> +MODULE_PARM_DESC(xeon_b2b_dsd_bar5_addr32,
>"XEON B2B DSD split-BAR 5 32-bit address");
> 
>  #ifndef ioread64



RE: [PATCH] ntb_pingpong: Fix db_init parameter description

2016-08-08 Thread Allen Hubbe
From: Wei Yongjun
> Fix 'db_init' parameter description.
> 
> Signed-off-by: Wei Yongjun 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_pingpong.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
> index 7d31179..4358611 100644
> --- a/drivers/ntb/test/ntb_pingpong.c
> +++ b/drivers/ntb/test/ntb_pingpong.c
> @@ -88,7 +88,7 @@ MODULE_PARM_DESC(delay_ms, "Milliseconds to delay the response to peer");
> 
>  static unsigned long db_init = 0x7;
>  module_param(db_init, ulong, 0644);
> -MODULE_PARM_DESC(delay_ms, "Initial doorbell bits to ring on the peer");
> +MODULE_PARM_DESC(db_init, "Initial doorbell bits to ring on the peer");
> 
>  struct pp_ctx {
>   struct ntb_dev  *ntb;



RE: [PATCH v2 1/3] ntb: Add asynchronous devices support to NTB-bus interface

2016-08-05 Thread Allen Hubbe
From: Serge Semin
> Currently supported AMD and Intel Non-transparent PCIe-bridges are synchronous
> devices, so translated base address of memory windows can be directly written
> to peer registers. But there are some IDT PCIe-switches which implement
> complex interfaces using Lookup Tables of translation addresses. Due to
> the way the table is accessed, it can not be done synchronously from different
> RCs, that's why the asynchronous interface should be developed.
> 
> For these purpose the Memory Window related interface is correspondingly split
> as it is for Doorbell and Scratchpad registers. The definition of Memory Window
> is following: "It is a virtual memory region, which locally reflects a physical
> memory of peer device." So to speak the "ntb_peer_mw_"-prefixed methods control
> the peers memory windows, "ntb_mw_"-prefixed functions work with the local
> memory windows.
> Here is the description of the Memory Window related NTB-bus callback
> functions:
>  - ntb_mw_count() - number of local memory windows.
>  - ntb_mw_get_maprsc() - get the physical address and size of the local memory
>    window to map.
>  - ntb_mw_set_trans() - set translation address of local memory window (this
>    address should be somehow retrieved from a peer).
>  - ntb_mw_get_trans() - get translation address of local memory window.
>  - ntb_mw_get_align() - get alignment of translated base address and size of
>    local memory window. Additionally one can get the upper size limit of the
>    memory window.
>  - ntb_peer_mw_count() - number of peer memory windows (it can differ from the
>    local number).
>  - ntb_peer_mw_set_trans() - set translation address of peer memory window.
>  - ntb_peer_mw_get_trans() - get translation address of peer memory window.
>  - ntb_peer_mw_get_align() - get alignment of translated base address and size
>    of peer memory window. Additionally one can get the upper size limit of the
>    memory window.
> 
> As one can see current AMD and Intel NTB drivers mostly implement the
> "ntb_peer_mw_"-prefixed methods. So this patch correspondingly renames the
> driver functions. IDT NTB driver mostly exposes "ntb_mw_"-prefixed methods,
> since it doesn't have convenient access to the peer Lookup Table.
> 
> In order to pass information from one RC to another NTB functions of IDT
> PCIe-switch implement Messaging subsystem. They currently support four message
> registers to transfer DWORD sized data to a specified peer. So two new
> callback methods are introduced:
>  - ntb_msg_size() - get the number of DWORDs supported by NTB function to send
> and receive messages
>  - ntb_msg_post() - send message of size retrieved from ntb_msg_size()
> to a peer
> Additionally there is a new event function:
>  - ntb_msg_event() - it is invoked when either a new message was retrieved
>  (NTB_MSG_NEW), or last message was successfully sent
>  (NTB_MSG_SENT), or the last message failed to be sent
>  (NTB_MSG_FAIL).
> 
> The last change concerns the IDs (practically names) of NTB-devices on the
> NTB-bus. It is not good to have the devices with same names in the system
> and it prevents my IDT NTB driver from being loaded =) So I developed a simple
> algorithm of NTB devices naming. Particularly it generates names "ntbS{N}" for
> synchronous devices, "ntbA{N}" for asynchronous devices, and "ntbAS{N}" for
> devices supporting both interfaces.

Thanks for the work that went into writing this driver, and thanks for your
patience with the review.  Please read my initial comments inline.  I would
like to approach this from a top-down api perspective, and settle on that
before requesting any specific changes in the hardware driver.  My major
concern about these changes is that they introduce a distinct classification
for sync and async hardware, supported by different sets of methods in the api,
neither of which is a subset of the other.

You know the IDT hardware, so if any of my requests below are infeasible, I 
would like your constructive opinion (even if it means significant changes to 
existing drivers) on how to resolve the api so that new and existing hardware 
drivers can be unified under the same api, if possible.

> 
> Signed-off-by: Serge Semin 
> 
> ---
>  drivers/ntb/Kconfig                 |   4 +-
>  drivers/ntb/hw/amd/ntb_hw_amd.c     |  49 ++-
>  drivers/ntb/hw/intel/ntb_hw_intel.c |  59 +++-
>  drivers/ntb/ntb.c                   |  86 +-
>  drivers/ntb/ntb_transport.c         |  19 +-
>  drivers/ntb/test/ntb_perf.c         |  16 +-
>  drivers/ntb/test/ntb_pingpong.c     |   5 +
>  drivers/ntb/test/ntb_tool.c         |  25 +-
>  include/linux/ntb.h                 | 600 +---

RE: [PATCH 0807/1285] Replace numeric parameter like 0444 with macro

2016-08-02 Thread Allen Hubbe
From: Baole Ni
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 40d04ef..7aa1faa 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -92,56 +92,56 @@ static const struct file_operations intel_ntb_debugfs_info;
>  static struct dentry *debugfs_dir;
> 
>  static int b2b_mw_idx = -1;
> -module_param(b2b_mw_idx, int, 0644);
> +module_param(b2b_mw_idx, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

The result of this change should be simplified to S_IWUSR | S_IRUGO.

>  module_param_named(xeon_b2b_usd_bar2_addr64,
> -xeon_b2b_usd_addr.bar2_addr64, ullong, 0644);
> +xeon_b2b_usd_addr.bar2_addr64, ullong, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

This line is 97 columns long.

The result of this change should be simplified to S_IWUSR | S_IRUGO, and that 
might put it under 80 columns.  There are bound to still be other places where 
this kind of replacement will require additional changes to code formatting.
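
As a quick sanity check of the equivalence above, here is a minimal shell
sketch.  The octal values are the standard permission bits.  S_IRUGO is spelled
out by hand as the union of the three read bits, which is how the kernel header
defines it; the variable names long_form and short_form are only for
illustration.

```shell
# Standard permission bits (octal), as in <sys/stat.h>.
S_IRUSR=0400
S_IWUSR=0200
S_IRGRP=0040
S_IROTH=0004

# Kernel convenience macro: read permission for user, group, and others.
S_IRUGO=$(( $S_IRUSR | $S_IRGRP | $S_IROTH ))

# The spelling produced by the patch, and the shorter spelling suggested above.
long_form=$(( $S_IRUSR | $S_IWUSR | $S_IRGRP | $S_IROTH ))
short_form=$(( $S_IWUSR | $S_IRUGO ))

# Print both as four-digit octal modes.
printf 'long:  %04o\n' "$long_form"
printf 'short: %04o\n' "$short_form"
```

Both lines print 0644, so the shorter spelling keeps the original permissions.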



Re: [PATCH] checkpatch: fix perl warning about unescaped brace

2016-07-29 Thread Allen Hubbe
On Fri, Jul 29, 2016 at 7:57 PM, Allen Hubbe  wrote:
> On Fri, Jul 29, 2016 at 7:06 PM, Joe Perches  wrote:
>> On Fri, 2016-07-29 at 18:27 -0400, Allen Hubbe wrote:
>>> Perl warns:
>>>
>>> Unescaped left brace in regex is deprecated, passed through in regex
>>>
>>> This is explained under "Quantifiers" in perl doc:
>>> http://perldoc.perl.org/perlre.html#Quantifiers
>>
>> Hey Allen.
>>
>> I'm at 5.22 and don't see a warning.
>> For what version of perl does this warning get generated?
>> Only 5.24 and higher?
>
> This system has v5.22.1.  I don't know exactly how this perl was
> configured.  I didn't see the warning on other systems.  Strangely, I
> don't remember seeing it yesterday on this system, either.  I might
> try to reproduce the warning on another system and let you know.  The
> warning and this fix seem to be in line with the perl documentation,
> though.
>
> perl --version
>
> This is perl 5, version 22, subversion 1 (v5.22.1) built for
> x86_64-linux-thread-multi
> (with 15 registered patches, see perl -V for more detail)
>
> cat /etc/fedora-release
> Fedora release 23 (Twenty Three)

Mystery solved... and drop this patch.

My 'master' on this system was set to an old ref, v4.2-rc1.  There are
no perl warnings as of v4.7.

docker pull fedora
docker run -it --rm fedora /bin/bash
dnf install wget perl perl-Term-ANSIColor perl-Getopt-Long

wget https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/checkpatch.pl
perl -c checkpatch.pl

huh... no errors :-/

wget https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/checkpatch.pl?h=v4.2-rc1
perl -c checkpatch.pl\?h\=v4.2-rc1

Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/\#\s*define.*do\s{ <-- HERE / at
checkpatch.pl?h=v4.2-rc1 line 3523.
Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/\(.*\){ <-- HERE / at checkpatch.pl?h=v4.2-rc1
line 4035.
Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/do{ <-- HERE / at checkpatch.pl?h=v4.2-rc1
line 4036.
Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/^\({ <-- HERE / at checkpatch.pl?h=v4.2-rc1
line 4483.


Re: [PATCH] checkpatch: fix perl warning about unescaped brace

2016-07-29 Thread Allen Hubbe
This system has v5.22.1.  I don't know exactly how this perl was
configured.  I didn't see the warning on other systems.  Strangely, I
don't remember seeing it yesterday on this system, either.  I might
try to reproduce the warning on another system and let you know.  The
warning and this fix seem to be in line with the perl documentation,
though.

perl --version

This is perl 5, version 22, subversion 1 (v5.22.1) built for
x86_64-linux-thread-multi
(with 15 registered patches, see perl -V for more detail)

cat /etc/fedora-release
Fedora release 23 (Twenty Three)

On Fri, Jul 29, 2016 at 7:06 PM, Joe Perches  wrote:
> On Fri, 2016-07-29 at 18:27 -0400, Allen Hubbe wrote:
>> Perl warns:
>>
>> Unescaped left brace in regex is deprecated, passed through in regex
>>
>> This is explained under "Quantifiers" in perl doc:
>> http://perldoc.perl.org/perlre.html#Quantifiers
>
> Hey Allen.
>
> I'm at 5.22 and don't see a warning.
> For what version of perl does this warning get generated?
> Only 5.24 and higher?
>


[PATCH] checkpatch: if no filenames then read stdin

2016-07-29 Thread Allen Hubbe
If no filenames are given, then read the patch from stdin.

Signed-off-by: Allen Hubbe 
---
 scripts/checkpatch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 90e1edc8dd42..b0659f1e9b09 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -178,9 +178,9 @@ if ($^V && $^V lt $minimum_perl_version) {
}
 }
 
+#if no filenames are given, push '-' to read patch from stdin
 if ($#ARGV < 0) {
-   print "$P: no input files\n";
-   exit(1);
+   push(@ARGV, '-');
 }
 
 sub hash_save_array_words {
-- 
2.9.1
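
For illustration only, the same convention (no arguments means read stdin via
'-') can be sketched in shell.  count_lines is a made-up helper, not part of
checkpatch, and it only counts lines rather than checking a patch.

```shell
# count_lines [FILE...]
# With no arguments, behave as if '-' had been given and read stdin.
count_lines() {
	if [ "$#" -eq 0 ]; then
		set -- -
	fi
	for name in "$@"; do
		if [ "$name" = "-" ]; then
			# '-' stands for stdin
			wc -l
		else
			wc -l < "$name"
		fi
	done
}

# Example:
#   git diff | count_lines
#   git diff | count_lines -
```

Both example invocations take the same code path once '-' has been pushed onto
the argument list.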



[PATCH] checkpatch: fix perl warning about unescaped brace

2016-07-29 Thread Allen Hubbe
Perl warns:

Unescaped left brace in regex is deprecated, passed through in regex

This is explained under "Quantifiers" in perl doc:
http://perldoc.perl.org/perlre.html#Quantifiers

(If a curly bracket occurs in a context other than one of the
quantifiers listed above, where it does not form part of a backslashed
sequence like \x{...} , it is treated as a regular character. However, a
deprecation warning is raised for these occurrences, and in Perl v5.26,
literal uses of a curly bracket will be required to be escaped, say by
preceding them with a backslash ("\{" ) or enclosing them within square
brackets ("[{]" ). This change will allow for future syntax extensions
(like making the lower bound of a quantifier optional), and better error
checking of quantifiers.)

Signed-off-by: Allen Hubbe 
---
 scripts/checkpatch.pl | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index b0659f1e9b09..7d40f4c8666a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3520,7 +3520,7 @@ sub process {
 # function brace can't be on same line, except for #defines of do while,
 # or if closed on same line
if (($line=~/$Type\s*$Ident\(.*\).*\s*{/) and
-   !($line=~/\#\s*define.*do\s{/) and !($line=~/}/)) {
+   !($line=~/\#\s*define.*do\s\{/) and !($line=~/}/)) {
if (ERROR("OPEN_BRACE",
  "open brace '{' following function declarations go on the next line\n" . $herecurr) &&
$fix) {
@@ -4032,12 +4032,12 @@ sub process {
 ## }
 
 #need space before brace following if, while, etc
-   if (($line =~ /\(.*\){/ && $line !~ /\($Type\){/) ||
-   $line =~ /do{/) {
+   if (($line =~ /\(.*\)\{/ && $line !~ /\($Type\)\{/) ||
+   $line =~ /do\{/) {
if (ERROR("SPACING",
  "space required before the open brace '{'\n" . $herecurr) &&
$fix) {
-   $fixed[$fixlinenr] =~ s/^(\+.*(?:do|\))){/$1 {/;
+   $fixed[$fixlinenr] =~ s/^(\+.*(?:do|\)))\{/$1 {/;
}
}
 
@@ -4480,7 +4480,7 @@ sub process {
$dstat !~ /^for\s*$Constant$/ &&
# for (...)
$dstat !~ /^for\s*$Constant\s+(?:$Ident|-?$Constant)$/ &&   # for (...) bar()
$dstat !~ /^do\s*{/ &&  # do {...
-   $dstat !~ /^\({/ && # ({...
+   $dstat !~ /^\(\{/ &&    # ({...
$ctx !~ /^.\s*#\s*define\s+TRACE_(?:SYSTEM|INCLUDE_FILE|INCLUDE_PATH)\b/)
{
$ctx =~ s/\n*$//;
-- 
2.9.1



Re: [PATCH] checkpatch: check signoff when reading stdin

2016-07-29 Thread Allen Hubbe
On Wed, Jul 27, 2016 at 8:41 PM, Joe Perches  wrote:
> I think this is not a great idea because the most likely
> use case is piping git diff output ala:
>
> $ git diff  | ./scripts/checkpatch.pl -

Thanks for the review.  Has v2 addressed your concern?


RE: [PATCH v2 0/3] ntb: Asynchronous NTB devices support

2016-07-28 Thread Allen Hubbe
From: Serge Semin
> Please, find the general patchset description in the cover letter of the first
> patchset (see the very first message in thread).
> 
> Changes in v2:
>  - Fix sparc64 compilation warning in drivers/ntb/hw/idt/ntb_hw_idt.c :
>warning: right shift count >= width of type
>  - Fix sparc64 compilation warnings in drivers/ntb/test/ntb_mw_test.c :
>warning: right shift count >= width of type
>warning: cast to pointer from integer of different size

Thanks for reacting to the test robot so quickly.  Since nobody else has 
responded yet, I would like to assure you that the patches are not being 
ignored.  Please be patient.  The IDT driver will be a valuable contribution to 
the ntb subsystem.  I am working carefully through patch 1/3 first, since it 
affects existing drivers and interface.

A word of caution regarding your statement, "There are a some types of 
checkpatch warnings I left unfixed."  Coding style can be a touchy subject, 
leading to some recent rants^H^H^H^H^Hdiscussion on some of the same topics 
that are included in that list of unfixed warnings.  Be prepared to adhere to 
the style guide, even if it is inconvenient and against your own logic, because 
that is almost always the easier and more practical approach than asking for 
changes or exceptions, and better for your mental health not to be on the To: 
list of something like https://lkml.org/lkml/2016/7/8/625.

"Of course all of these warnings are discussable, except the last one."  Be 
prepared, even if it will require significant changes to the code.  For really 
inconvenient changes, we can talk about other more readily acceptable 
approaches to keep the code short and elegant, as is obviously your intent.  
Please be patient with the review.



[PATCH v2] checkpatch: check signoff when reading stdin

2016-07-27 Thread Allen Hubbe
Signoff was not checked if the filename is '-', indicating reading the
patch from stdin.  Commands such as the below would not warn about a
missing signoff, because the patch filename is '-'.  This change allows
checkpatch to warn about a missing signoff, even if the input filename
is '-', but only if the patch has a commit message.

git show --pretty=email | scripts/checkpatch.pl -

A more common use of checkpatch with stdin is for piping git diff
through checkpatch.  The diff output would not contain a commit message,
and therefore it would not contain a signoff line.  For this common use
case, a warning should not be printed about the missing signoff.  With
this change we will only warn about a missing signoff if the input
contains a commit message.

git diff | scripts/checkpatch.pl -

Before this patch, a workaround for the first command was to refer to
stdin by a name other than '-'.  The workaround is not an elegant
solution, because elsewhere checkpatch uses the fact that filename
equals '-', such as in setting '$vname' to 'Your patch' for stdin.  The
command below would report "/dev/stdin has style problems" instead of
"Your patch has style problems."

git show --pretty=email | scripts/checkpatch.pl /dev/stdin

Signed-off-by: Allen Hubbe 
---
 scripts/checkpatch.pl | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4904ced676d4..9be0343baa77 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2064,6 +2064,7 @@ sub process {
my $is_patch = 0;
my $in_header_lines = $file ? 0 : 1;
my $in_commit_log = 0;  #Scanning lines before patch
+   my $has_commit_log = 0; #Encountered lines before patch
my $commit_log_possible_stack_dump = 0;
my $commit_log_long_line = 0;
my $commit_log_has_diff = 0;
@@ -2561,6 +2562,7 @@ sub process {
  $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
$in_header_lines = 0;
$in_commit_log = 1;
+   $has_commit_log = 1;
}
 
 # Check if there is UTF-8 in a commit log when a mail header has explicitly
@@ -6045,7 +6047,7 @@ sub process {
ERROR("NOT_UNIFIED_DIFF",
  "Does not appear to be a unified-diff format patch\n");
}
-   if ($is_patch && $filename ne '-' && $chk_signoff && $signoff == 0) {
+   if ($is_patch && $has_commit_log && $chk_signoff && $signoff == 0) {
ERROR("MISSING_SIGN_OFF",
  "Missing Signed-off-by: line(s)\n");
}
-- 
2.9.1
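
The rule in the commit message above (warn about a missing signoff only when
the input carries a commit message) can be sketched roughly in shell.  This is
a simplified illustration, not how checkpatch decides: check_signoff is a
hypothetical helper, and "has a commit message" is approximated here as "has
any non-blank line before the first 'diff --git' line".

```shell
# check_signoff: read a patch on stdin.  Print a warning only when the input
# has text before the first "diff --git" line (treated as a commit message)
# and none of that text is a "Signed-off-by:" line.
check_signoff() {
	awk '
		/^diff --git /                { in_diff = 1 }
		!in_diff && NF                { has_log = 1 }
		!in_diff && /^Signed-off-by:/ { signoff = 1 }
		END {
			if (has_log && !signoff)
				print "Missing Signed-off-by: line(s)"
		}
	'
}

# Example:
#   git show --pretty=email | check_signoff   # may warn
#   git diff | check_signoff                  # no commit message, stays quiet
```

Plain git diff output starts directly with the diff, so under this
approximation it never triggers the warning.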



[PATCH] checkpatch: check signoff when reading stdin

2016-07-27 Thread Allen Hubbe
Before, signoff was not checked if the filename is '-', indicating
reading the patch from stdin.  This causes commands such as below not to
warn about a missing signoff.

  git show --pretty=email | scripts/checkpatch.pl -

As a workaround, the command could be modified to refer to stdin by a
name other than '-'.  The workaround is not an elegant solution, because
elsewhere checkpatch treats the filename '-' as special for stdin, such
as when setting '$vname' to 'Your patch'.

  git show --pretty=email | scripts/checkpatch.pl /dev/stdin

This change causes checkpatch to check for a missing signoff line, even
if the filename is '-', as in the first variation of the command.

Signed-off-by: Allen Hubbe 
---
 scripts/checkpatch.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4904ced676d4..83acbac10705 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6045,7 +6045,7 @@ sub process {
ERROR("NOT_UNIFIED_DIFF",
  "Does not appear to be a unified-diff format patch\n");
}
-   if ($is_patch && $filename ne '-' && $chk_signoff && $signoff == 0) {
+   if ($is_patch && $chk_signoff && $signoff == 0) {
ERROR("MISSING_SIGN_OFF",
  "Missing Signed-off-by: line(s)\n");
}
-- 
2.9.1



RE: [PATCH v5 06/10] ntb_tool: Postpone memory window initialization for the user

2016-06-20 Thread Allen Hubbe
From: Logan Gunthorpe
> In order to make the interface closer to the raw NTB API, this commit
> changes memory windows so they are not initialized on link up.
> Instead, the 'peer_trans*' debugfs files are introduced. When read,
> they return information provided by ntb_mw_get_range. When written,
> they create a buffer and initialize the memory window. The
> value written is taken as the requested size of the buffer (which
> is then rounded for alignment). Writing a value of zero frees the buffer
> and tears down the memory window translation. The 'peer_mw*' file is
> only created once the memory window translation is setup by the user.
> 
> Additionally, it was noticed that the read and write functions for the
> 'peer_mw*' files should have checked for a NULL pointer.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 366 +++-
>  1 file changed, 228 insertions(+), 138 deletions(-)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index cba31fd..1509b4c 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -79,6 +79,13 @@
>   * root@self# cat $DBG_DIR/spad
>   *
>   * Observe that spad 0 and 1 have the values set by the peer.
> + *
> + * # Check the memory window translation info
> + * cat $DBG_DIR/peer_trans0
> + *
> + * # Setup a 16k memory window buffer
> + * echo 16384 > $DBG_DIR/peer_trans0
> + *
>   */
> 
>  #include 
> @@ -108,25 +115,22 @@ MODULE_DESCRIPTION(DRIVER_DESCRIPTION);
> 
>  #define MAX_MWS 16
> 
> -static unsigned long mw_size = 16;
> -module_param(mw_size, ulong, 0644);
> -MODULE_PARM_DESC(mw_size, "size order [n^2] of the memory window for testing");
> -
>  static struct dentry *tool_dbgfs;
> 
>  struct tool_mw {
> + int idx;
> + struct tool_ctx *tc;
> + resource_size_t win_size;
>   resource_size_t size;
>   u8 __iomem *local;
>   u8 *peer;
>   dma_addr_t peer_dma;
> + struct dentry *peer_dbg_file;
>  };
> 
>  struct tool_ctx {
>   struct ntb_dev *ntb;
>   struct dentry *dbgfs;
> - struct work_struct link_cleanup;
> - bool link_is_up;
> - struct delayed_work link_work;
>   int mw_count;
>   struct tool_mw mws[MAX_MWS];
>  };
> @@ -143,111 +147,6 @@ struct tool_ctx {
>   .write = __write,   \
>   }
> 
> -static int tool_setup_mw(struct tool_ctx *tc, int idx)
> -{
> - int rc;
> - struct tool_mw *mw = &tc->mws[idx];
> - phys_addr_t base;
> - resource_size_t size, align, align_size;
> -
> - if (mw->local)
> - return 0;
> -
> - rc = ntb_mw_get_range(tc->ntb, idx, &base, &size, &align,
> -   &align_size);
> - if (rc)
> - return rc;
> -
> - mw->size = min_t(resource_size_t, 1 << mw_size, size);
> - mw->size = round_up(mw->size, align);
> - mw->size = round_up(mw->size, align_size);
> -
> - mw->local = ioremap_wc(base, size);
> - if (mw->local == NULL)
> - return -EFAULT;
> -
> - mw->peer = dma_alloc_coherent(&tc->ntb->pdev->dev, mw->size,
> -   &mw->peer_dma, GFP_KERNEL);
> -
> - if (mw->peer == NULL)
> - return -ENOMEM;
> -
> - rc = ntb_mw_set_trans(tc->ntb, idx, mw->peer_dma, mw->size);
> - if (rc)
> - return rc;
> -
> - return 0;
> -}
> -
> -static void tool_free_mws(struct tool_ctx *tc)
> -{
> - int i;
> -
> - for (i = 0; i < tc->mw_count; i++) {
> - if (tc->mws[i].peer) {
> - ntb_mw_clear_trans(tc->ntb, i);
> - dma_free_coherent(&tc->ntb->pdev->dev, tc->mws[i].size,
> -   tc->mws[i].peer,
> -   tc->mws[i].peer_dma);
> -
> - }
> -
> - tc->mws[i].peer = NULL;
> - tc->mws[i].peer_dma = 0;
> -
> - if (tc->mws[i].local)
> - iounmap(tc->mws[i].local);
> -
> - tc->mws[i].local = NULL;
> - }
> -
> - tc->mw_count = 0;
> -}
> -
> -static int tool_setup_mws(struct tool_ctx *tc)
> -{
> - int i;
> - int rc;
> -
> - tc->mw_count = min(ntb_mw_count(tc->ntb), MAX_MWS);
> -
> - for (i = 0; i < tc->mw_count; i++) {
> -   

RE: [PATCH v4 07/10] ntb_tool: Add link status and files to debugfs

2016-06-16 Thread Allen Hubbe
From: Logan Gunthorpe
> In order to more successfully script with ntb_tool it's useful to
> have a link file to check the link status so that the script
> doesn't use the other files until the link is up.
> 
> This commit adds a 'link' file to the debugfs directory which reads
> boolean (Y or N) depending on the link status. Writing to the file
> change the link state using ntb_link_enable or ntb_link_disable.
> 
> A 'link_event' file is also provided so an application can block until
> the link changes to the desired state. If the user writes a 1, it will
> block until the link is up. If the user writes a 0, it will block until
> the link is down.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 92 +
>  1 file changed, 92 insertions(+)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 031723d..b7f4f4e 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -59,6 +59,12 @@
>   *
>   * Eg: check if clearing the doorbell mask generates an interrupt.
>   *
> + * # Check the link status
> + * root@self# cat $DBG_DIR/link
> + *
> + * # Block until the link is up
> + * root@self# echo Y > $DBG_DIR/link_event
> + *
>   * # Set the doorbell mask
>   * root@self# echo 's 1' > $DBG_DIR/mask
>   *
> @@ -131,6 +137,7 @@ struct tool_mw {
>  struct tool_ctx {
>   struct ntb_dev *ntb;
>   struct dentry *dbgfs;
> + wait_queue_head_t link_wq;
>   int mw_count;
>   struct tool_mw mws[MAX_MWS];
>  };
> @@ -159,6 +166,7 @@ static void tool_link_event(void *ctx)
>   dev_dbg(&tc->ntb->dev, "link is %s speed %d width %d\n",
>   up ? "up" : "down", speed, width);
> 
> + wake_up(&tc->link_wq);
>  }
> 
>  static void tool_db_event(void *ctx, int vec)
> @@ -473,6 +481,83 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
> tool_peer_spad_read,
> tool_peer_spad_write);
> 
> +static ssize_t tool_link_read(struct file *filep, char __user *ubuf,
> +   size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[3];
> +
> + buf[0] = ntb_link_is_up(tc->ntb, NULL, NULL) ? 'Y' : 'N';
> + buf[1] = '\n';
> + buf[2] = '\0';
> +
> + return simple_read_from_buffer(ubuf, size, offp, buf, 2);
> +}
> +
> +static ssize_t tool_link_write(struct file *filep, const char __user *ubuf,
> +size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[32];
> + size_t buf_size;
> + bool val;
> + int rc;
> +
> + buf_size = min(size, (sizeof(buf) - 1));
> + if (copy_from_user(buf, ubuf, buf_size))
> + return -EFAULT;
> +
> + buf[buf_size] = '\0';
> +
> + rc = strtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (val)
> + rc = ntb_link_enable(tc->ntb, NTB_SPEED_AUTO, NTB_WIDTH_AUTO);
> + else
> + rc = ntb_link_disable(tc->ntb);
> +
> + if (rc)
> + return rc;
> +
> + return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_fops,
> +   tool_link_read,
> +   tool_link_write);
> +
> +static ssize_t tool_link_event_write(struct file *filep,
> +  const char __user *ubuf,
> +  size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[32];
> + size_t buf_size;
> + bool val;
> + int rc;
> +
> + buf_size = min(size, (sizeof(buf) - 1));
> + if (copy_from_user(buf, ubuf, buf_size))
> + return -EFAULT;
> +
> + buf[buf_size] = '\0';
> +
> + rc = strtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (wait_event_interruptible(tc->link_wq,
> + ntb_link_is_up(tc->ntb, NULL, NULL) == val))
> + return -ERESTART;
> +
> + return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_event_fops,
> +   NULL,
> +   tool_link_event_write);
> 
>  static ssize_t tool_mw_read(struct file *filep, char __user *ubuf,
>   size_t size, loff_t *offp)
> @@ -793,6 +878,12 @@ static void tool_setup_dbgfs(struct tool_ctx *tc)
>   debugfs_cre

RE: [PATCH v4 09/10] ntb_test: Add a selftest script for the NTB subsystem

2016-06-16 Thread Allen Hubbe
From: Logan Gunthorpe
> This script automates testing doorbells, scratchpads and memory windows
> for an NTB device. It can be run locally, with the NTB looped
> back to the same host or use SSH to remotely control the second host.
> 
> In the single host case, the script just needs to be passed two
> arguments: a PCI ID for each side of the link. In the two host case
> the -r option must be used to specify the remote hostname (which must
> be SSH accessible and should probably have ssh-keys exchanged).
> 
> A sample run looks like this:
> 
> $ sudo ./ntb_test.sh :03:00.1 :83:00.1 -p 29
> Starting ntb_tool tests...
> Running link tests on: :03:00.1 / :83:00.1
>   Passed
> Running link tests on: :83:00.1 / :03:00.1
>   Passed
> Running db tests on: :03:00.1 / :83:00.1
>   Passed
> Running db tests on: :83:00.1 / :03:00.1
>   Passed
> Running spad tests on: :03:00.1 / :83:00.1
>   Passed
> Running spad tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw0 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw0 tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw1 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw1 tests on: :83:00.1 / :03:00.1
>   Passed
> 
> Starting ntb_pingpong tests...
> Running ping pong tests on: :03:00.1 / :83:00.1
>   Passed
> 
> Starting ntb_perf tests...
> Running local perf test without DMA
>   0: copied 536870912 bytes in 164453 usecs, 3264 MBytes/s
>   Passed
> Running remote perf test without DMA
>   0: copied 536870912 bytes in 164453 usecs, 3264 MBytes/s
>   Passed
> 
> Signed-off-by: Logan Gunthorpe 
> Acked-by: Shuah Khan 

Acked-by: Allen Hubbe 

> ---
>  MAINTAINERS |   1 +
>  tools/testing/selftests/ntb/ntb_test.sh | 422 
> 
>  2 files changed, 423 insertions(+)
>  create mode 100755 tools/testing/selftests/ntb/ntb_test.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9c567a4..f178e7e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7846,6 +7846,7 @@ F:  drivers/ntb/
>  F:   drivers/net/ntb_netdev.c
>  F:   include/linux/ntb.h
>  F:   include/linux/ntb_transport.h
> +F:   tools/testing/selftests/ntb/
> 
>  NTB INTEL DRIVER
>  M:   Jon Mason 
> diff --git a/tools/testing/selftests/ntb/ntb_test.sh b/tools/testing/selftests/ntb/ntb_test.sh
> new file mode 100755
> index 000..a676d3e
> --- /dev/null
> +++ b/tools/testing/selftests/ntb/ntb_test.sh
> @@ -0,0 +1,422 @@
> +#!/bin/bash
> +# Copyright (c) 2016 Microsemi. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation; either version 2 of
> +# the License, or (at your option) any later version.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# Author: Logan Gunthorpe 
> +
> +REMOTE_HOST=
> +LIST_DEVS=FALSE
> +
> +DEBUGFS=${DEBUGFS-/sys/kernel/debug}
> +
> +PERF_RUN_ORDER=32
> +MAX_MW_SIZE=0
> +RUN_DMA_TESTS=
> +DONT_CLEANUP=
> +MW_SIZE=65536
> +
> +function show_help()
> +{
> + echo "Usage: $0 [OPTIONS] LOCAL_DEV REMOTE_DEV"
> + echo "Run tests on a pair of NTB endpoints."
> + echo
> + echo "If the NTB device loops back to the same host then,"
> + echo "just specifying the two PCI ids on the command line is"
> + echo "sufficient. Otherwise, if the NTB link spans two hosts"
> + echo "use the -r option to specify the hostname for the remote"
> + echo "device. SSH will then be used to test the remote side."
> + echo "An SSH key between the root users of the host would then"
> + echo "be highly recommended."
> + echo
> + echo "Options:"
> + echo "  -C  don't cleanup ntb modules on exit"
> + echo "  -d  run dma tests"
> + echo "  -h  show this help message"
> + echo "  -l  list available local and remote PCI ids"
> + echo "  -r REMOTE_HOST  specify the remote's hostname to connect"
> +echo "  to for the test (using ssh)"
> + echo "  -p NUM  ntb_perf run order (default: $PERF_RUN_ORDER

RE: [PATCH v3 09/10] ntb_test: Add a selftest script for the NTB subsystem

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> On 15/06/16 03:49 PM, Allen Hubbe wrote:
> >> +function link_test()
> >> +{
> >> +  LOC=$1
> >> +  REM=$2
> >> +  EXP=0
> >> +
> >> +  echo "Running link tests on: $(basename $LOC) / $(basename $REM)"
> >> +
> >> +  write_file "N" "$LOC/link"
> >> +  write_file "N" "$LOC/link_event"
> >
> > If it fails to bring down the link, won't it just block waiting on link_event
> > and never make it to the next step of the test?
> >
> >> +  if [[ $(read_file "$REM/link") != "N" ]]; then
> >> +  echo "Expected remote link to be down in $REM/link" >&2
> >> +  exit -1
> >> +  fi
> >> +
> >> +  write_file "Y" "$LOC/link"
> >> +  write_file "Y" "$LOC/link_event"
> >> +
> >> +  echo "  Passed"
> >> +}
> 
> Well, the test is really intended to ensure both sides of the link see
> changes to the link status. If the driver is somehow buggy and the link
> never goes down/up when requested there's little I can do here except
> block forever. Unless we want to add a timeout to the link_event file
> (which I'd rather not).
> 
> You'd have the same issue if, when bringing the link up for the first
> time, the link does not come back.

The link might come up, but this test checks if the link can be forced down.

This test should fail on Intel RP/TB topology (two cpu sharing one ntb).  The 
link state is the link state of the secondary side pcie bus connected to the 
secondary side cpu.  The link must be up in order for the secondary side cpu to 
discover the ntb device, so the driver does not allow the link to be disabled 
in such topology.

A simple thing to do here might be:

write_file "N" "$LOC/link"
sleep 1
read_file "$REM/link"

You already have my Ack.  This minor issue can be fixed later if anyone cares.
I don't think it is a big deal, just worth pointing out that the script will
hang here instead of reporting a failure.  If it is worth fixing later, at that
point we might also want to change this script to continue with other tests
instead of exiting on the first failure.



RE: [PATCH v3 06/10] ntb_tool: Postpone memory window initialization for the user

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> In order to make the interface closer to the raw NTB API, this commit
> changes memory windows so they are not initialized on link up.
> Instead, the 'peer_trans*' debugfs files are introduced. When read,
> they return information provided by ntb_mw_get_range. When written,
> they create a buffer and initialize the memory window. The
> value written is taken as the requested size of the buffer (which
> is then rounded for alignment). Writing a value of zero frees the buffer
> and tears down the memory window translation. The 'peer_mw*' file is
> only created once the memory window translation is setup by the user.
> 
> Additionally, it was noticed that the read and write functions for the
> 'peer_mw*' files should have checked for a NULL pointer.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

I am happy to see all the link_work complexity go away with this patch.

> ---
>  drivers/ntb/test/ntb_tool.c | 356 +++-
>  1 file changed, 218 insertions(+), 138 deletions(-)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index cba31fd..031723d 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -79,6 +79,13 @@
>   * root@self# cat $DBG_DIR/spad
>   *
>   * Observe that spad 0 and 1 have the values set by the peer.
> + *
> + * # Check the memory window translation info
> + * cat $DBG_DIR/peer_trans0
> + *
> + * # Setup a 16k memory window buffer
> + * echo 16384 > $DBG_DIR/peer_trans0
> + *
>   */
> 
>  #include 
> @@ -108,25 +115,22 @@ MODULE_DESCRIPTION(DRIVER_DESCRIPTION);
> 
>  #define MAX_MWS 16
> 
> -static unsigned long mw_size = 16;
> -module_param(mw_size, ulong, 0644);
> -MODULE_PARM_DESC(mw_size, "size order [n^2] of the memory window for 
> testing");
> -
>  static struct dentry *tool_dbgfs;
> 
>  struct tool_mw {
> + int idx;
> + struct tool_ctx *tc;
> + resource_size_t win_size;
>   resource_size_t size;
>   u8 __iomem *local;
>   u8 *peer;
>   dma_addr_t peer_dma;
> + struct dentry *peer_dbg_file;
>  };
> 
>  struct tool_ctx {
>   struct ntb_dev *ntb;
>   struct dentry *dbgfs;
> - struct work_struct link_cleanup;
> - bool link_is_up;
> - struct delayed_work link_work;
>   int mw_count;
>   struct tool_mw mws[MAX_MWS];
>  };
> @@ -143,111 +147,6 @@ struct tool_ctx {
>   .write = __write,   \
>   }
> 
> -static int tool_setup_mw(struct tool_ctx *tc, int idx)
> -{
> - int rc;
> - struct tool_mw *mw = &tc->mws[idx];
> - phys_addr_t base;
> - resource_size_t size, align, align_size;
> -
> - if (mw->local)
> - return 0;
> -
> - rc = ntb_mw_get_range(tc->ntb, idx, &base, &size, &align,
> -   &align_size);
> - if (rc)
> - return rc;
> -
> - mw->size = min_t(resource_size_t, 1 << mw_size, size);
> - mw->size = round_up(mw->size, align);
> - mw->size = round_up(mw->size, align_size);
> -
> - mw->local = ioremap_wc(base, size);
> - if (mw->local == NULL)
> - return -EFAULT;
> -
> - mw->peer = dma_alloc_coherent(&tc->ntb->pdev->dev, mw->size,
> -   &mw->peer_dma, GFP_KERNEL);
> -
> - if (mw->peer == NULL)
> - return -ENOMEM;
> -
> - rc = ntb_mw_set_trans(tc->ntb, idx, mw->peer_dma, mw->size);
> - if (rc)
> - return rc;
> -
> - return 0;
> -}
> -
> -static void tool_free_mws(struct tool_ctx *tc)
> -{
> - int i;
> -
> - for (i = 0; i < tc->mw_count; i++) {
> - if (tc->mws[i].peer) {
> - ntb_mw_clear_trans(tc->ntb, i);
> - dma_free_coherent(&tc->ntb->pdev->dev, tc->mws[i].size,
> -   tc->mws[i].peer,
> -   tc->mws[i].peer_dma);
> -
> - }
> -
> - tc->mws[i].peer = NULL;
> - tc->mws[i].peer_dma = 0;
> -
> - if (tc->mws[i].local)
> - iounmap(tc->mws[i].local);
> -
> - tc->mws[i].local = NULL;
> - }
> -
> - tc->mw_count = 0;
> -}
> -
> -static int tool_setup_mws(struct tool_ctx *tc)
> -{
> - int i;
> - int rc;
> -
> - tc->mw_count = min(ntb_mw_count(tc->ntb), MAX_MWS);
> -

RE: [PATCH v3 09/10] ntb_test: Add a selftest script for the NTB subsystem

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> This script automates testing doorbells, scratchpads and memory windows
> for an NTB device. It can be run locally, with the NTB looped
> back to the same host or use SSH to remotely control the second host.
> 
> In the single host case, the script just needs to be passed two
> arguments: a PCI ID for each side of the link. In the two host case
> the -r option must be used to specify the remote hostname (which must
> be SSH accessible and should probably have ssh-keys exchanged).
> 
> A sample run looks like this:
> 
> $ sudo ./ntb_test.sh :03:00.1 :83:00.1 -p 29
> Starting ntb_tool tests...
> Running link tests on: :03:00.1 / :83:00.1
>   Passed
> Running link tests on: :83:00.1 / :03:00.1
>   Passed
> Running db tests on: :03:00.1 / :83:00.1
>   Passed
> Running db tests on: :83:00.1 / :03:00.1
>   Passed
> Running spad tests on: :03:00.1 / :83:00.1
>   Passed
> Running spad tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw0 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw0 tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw1 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw1 tests on: :83:00.1 / :03:00.1
>   Passed
> 
> Starting ntb_pingpong tests...
> Running ping pong tests on: :03:00.1 / :83:00.1
>   Passed
> 
> Starting ntb_perf tests...
> Running local perf test without DMA
>   0: copied 536870912 bytes in 164453 usecs, 3264 MBytes/s
>   Passed
> Running remote perf test without DMA
>   0: copied 536870912 bytes in 164453 usecs, 3264 MBytes/s
>   Passed
> 
> Signed-off-by: Logan Gunthorpe 
> Acked-by: Shuah Khan 

Acked-by: Allen Hubbe 

Note one comment below, in link_test.

> ---
>  MAINTAINERS |   1 +
>  tools/testing/selftests/ntb/ntb_test.sh | 417 
> 
>  2 files changed, 418 insertions(+)
>  create mode 100755 tools/testing/selftests/ntb/ntb_test.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9c567a4..f178e7e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7846,6 +7846,7 @@ F:  drivers/ntb/
>  F:   drivers/net/ntb_netdev.c
>  F:   include/linux/ntb.h
>  F:   include/linux/ntb_transport.h
> +F:   tools/testing/selftests/ntb/
> 
>  NTB INTEL DRIVER
>  M:   Jon Mason 
> diff --git a/tools/testing/selftests/ntb/ntb_test.sh
> b/tools/testing/selftests/ntb/ntb_test.sh
> new file mode 100755
> index 000..2b7bf81
> --- /dev/null
> +++ b/tools/testing/selftests/ntb/ntb_test.sh
> @@ -0,0 +1,417 @@
> +#!/bin/bash
> +# Copyright (c) 2016 Microsemi. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation; either version 2 of
> +# the License, or (at your option) any later version.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# Author: Logan Gunthorpe 
> +
> +REMOTE_HOST=
> +LIST_DEVS=FALSE
> +
> +DEBUGFS=${DEBUGFS-/sys/kernel/debug}
> +
> +PERF_RUN_ORDER=32
> +MAX_MW_SIZE=0
> +RUN_DMA_TESTS=
> +DONT_CLEANUP=
> +MW_SIZE=65536
> +
> +function show_help()
> +{
> + echo "Usage: $0 [OPTIONS] LOCAL_DEV REMOTE_DEV"
> + echo "Run tests on a pair of NTB endpoints."
> + echo
> + echo "If the NTB device loops back to the same host then,"
> + echo "just specifying the two PCI ids on the command line is"
> + echo "sufficient. Otherwise, if the NTB link spans two hosts"
> + echo "use the -r option to specify the hostname for the remote"
> + echo "device. SSH will then be used to test the remote side."
> + echo "An SSH key between the root users of the host would then"
> + echo "be highly recommended."
> + echo
> + echo "Options:"
> + echo "  -C  don't cleanup ntb modules on exit"
> + echo "  -d  run dma tests"
> + echo "  -h  show this help message"
> + echo "  -l  list available local and remote PCI ids"
> + echo "  -r REMOTE_HOST  specify the remote's hostname to connect"
> +echo "  to for the test (using ssh)"
> + echo "  -p NUM  ntb_perf run order (d

RE: [PATCH v3 07/10] ntb_tool: Add link status and files to debugfs

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> In order to more successfully script with ntb_tool it's useful to
> have a link file to check the link status so that the script
> doesn't use the other files until the link is up.
> 
> This commit adds a 'link' file to the debugfs directory which reads a
> boolean (Y or N) depending on the link status. Writing to the file
> changes the link state using ntb_link_enable or ntb_link_disable.
> 
> A 'link_event' file is also provided so an application can block until
> the link changes to the desired state. If the user writes a 1, it will
> block until the link is up. If the user writes a 0, it will block until
> the link is down.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 89 
> +
>  1 file changed, 89 insertions(+)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 031723d..caef74c 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -59,6 +59,12 @@
>   *
>   * Eg: check if clearing the doorbell mask generates an interrupt.
>   *
> + * # Check the link status
> + * root@self# cat $DBG_DIR/link
> + *
> + * # Block until the link is up
> + * root@self# echo Y > $DBG_DIR/link_event
> + *
>   * # Set the doorbell mask
>   * root@self# echo 's 1' > $DBG_DIR/mask
>   *
> @@ -131,6 +137,7 @@ struct tool_mw {
>  struct tool_ctx {
>   struct ntb_dev *ntb;
>   struct dentry *dbgfs;
> + wait_queue_head_t link_wq;
>   int mw_count;
>   struct tool_mw mws[MAX_MWS];
>  };
> @@ -159,6 +166,7 @@ static void tool_link_event(void *ctx)
>   dev_dbg(&tc->ntb->dev, "link is %s speed %d width %d\n",
>   up ? "up" : "down", speed, width);
> 
> + wake_up(&tc->link_wq);
>  }
> 
>  static void tool_db_event(void *ctx, int vec)
> @@ -473,6 +481,80 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
> tool_peer_spad_read,
> tool_peer_spad_write);
> 
> +static ssize_t tool_link_read(struct file *filep, char __user *ubuf,
> +   size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[3];
> +
> + buf[0] = ntb_link_is_up(tc->ntb, NULL, NULL) ? 'Y' : 'N';
> + buf[1] = '\n';
> + buf[2] = '\0';
> +
> + return simple_read_from_buffer(ubuf, size, offp, buf, 2);
> +}
> +
> +static ssize_t tool_link_write(struct file *filep, const char __user *ubuf,
> +size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[32];
> + size_t buf_size;
> + bool val;
> + int rc;
> +
> + buf_size = min(size, (sizeof(buf) - 1));
> + if (copy_from_user(buf, ubuf, buf_size))
> + return -EFAULT;
> +
> + buf[buf_size] = '\0';
> +
> + rc = strtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (val)
> + ntb_link_enable(tc->ntb, NTB_SPEED_AUTO, NTB_WIDTH_AUTO);
> + else
> + ntb_link_disable(tc->ntb);
> +
> + return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_fops,
> +   tool_link_read,
> +   tool_link_write);
> +
> +static ssize_t tool_link_event_write(struct file *filep,
> +  const char __user *ubuf,
> +  size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[32];
> + size_t buf_size;
> + bool val;
> + int rc;
> +
> + buf_size = min(size, (sizeof(buf) - 1));
> + if (copy_from_user(buf, ubuf, buf_size))
> + return -EFAULT;
> +
> + buf[buf_size] = '\0';
> +
> + rc = strtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (wait_event_interruptible(tc->link_wq,
> + ntb_link_is_up(tc->ntb, NULL, NULL) == val))
> + return -ERESTART;
> +
> + return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_event_fops,
> +   NULL,
> +   tool_link_event_write);
> 
>  static ssize_t tool_mw_read(struct file *filep, char __user *ubuf,
>   size_t size, loff_t *offp)
> @@ -793,6 +875,12 @@ static void tool_setup_dbgfs(struct tool_ctx *tc)
>   debugfs_create_file("peer_spad", S_IRUSR | S_IWUSR, tc->dbgf

RE: [PATCH v3 08/10] ntb_pingpong: Add a debugfs file to get the ping count

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> This commit adds a debugfs 'count' file to ntb_pingpong. This is so
> testing with ntb_pingpong can be automated beyond just checking the
> logs for pong messages.
> 
> The count file returns a number which increments every pong. The
> counter can be cleared by writing a zero.
> 
> Signed-off-by: Logan Gunthorpe 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_pingpong.c | 62 
> -
>  1 file changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
> index fe16005..7d31179 100644
> --- a/drivers/ntb/test/ntb_pingpong.c
> +++ b/drivers/ntb/test/ntb_pingpong.c
> @@ -61,6 +61,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
> 
> @@ -96,8 +97,13 @@ struct pp_ctx {
>   spinlock_t  db_lock;
>   struct timer_list   db_timer;
>   unsigned long   db_delay;
> + struct dentry   *debugfs_node_dir;
> + struct dentry   *debugfs_count;
> + atomic_t        count;
>  };
> 
> +static struct dentry *pp_debugfs_dir;
> +
>  static void pp_ping(unsigned long ctx)
>  {
>   struct pp_ctx *pp = (void *)ctx;
> @@ -171,10 +177,32 @@ static void pp_db_event(void *ctx, int vec)
>   dev_dbg(&pp->ntb->dev,
>   "Pong vec %d bits %#llx\n",
>   vec, db_bits);
> + atomic_inc(&pp->count);
>   }
>   spin_unlock_irqrestore(&pp->db_lock, irqflags);
>  }
> 
> +static int pp_debugfs_setup(struct pp_ctx *pp)
> +{
> + struct pci_dev *pdev = pp->ntb->pdev;
> +
> + if (!pp_debugfs_dir)
> + return -ENODEV;
> +
> + pp->debugfs_node_dir = debugfs_create_dir(pci_name(pdev),
> +   pp_debugfs_dir);
> + if (!pp->debugfs_node_dir)
> + return -ENODEV;
> +
> + pp->debugfs_count = debugfs_create_atomic_t("count", S_IRUSR | S_IWUSR,
> + pp->debugfs_node_dir,
> + &pp->count);
> + if (!pp->debugfs_count)
> + return -ENODEV;
> +
> + return 0;
> +}
> +
>  static const struct ntb_ctx_ops pp_ops = {
>   .link_event = pp_link_event,
>   .db_event = pp_db_event,
> @@ -210,6 +238,7 @@ static int pp_probe(struct ntb_client *client,
> 
>   pp->ntb = ntb;
>   pp->db_bits = 0;
> + atomic_set(&pp->count, 0);
>   spin_lock_init(&pp->db_lock);
>   setup_timer(&pp->db_timer, pp_ping, (unsigned long)pp);
>   pp->db_delay = msecs_to_jiffies(delay_ms);
> @@ -218,6 +247,10 @@ static int pp_probe(struct ntb_client *client,
>   if (rc)
>   goto err_ctx;
> 
> + rc = pp_debugfs_setup(pp);
> + if (rc)
> + goto err_ctx;
> +
>   ntb_link_enable(ntb, NTB_SPEED_AUTO, NTB_WIDTH_AUTO);
>   ntb_link_event(ntb);
> 
> @@ -234,6 +267,8 @@ static void pp_remove(struct ntb_client *client,
>  {
>   struct pp_ctx *pp = ntb->ctx;
> 
> + debugfs_remove_recursive(pp->debugfs_node_dir);
> +
>   ntb_clear_ctx(ntb);
>   del_timer_sync(&pp->db_timer);
>   ntb_link_disable(ntb);
> @@ -247,4 +282,29 @@ static struct ntb_client pp_client = {
>   .remove = pp_remove,
>   },
>  };
> -module_ntb_client(pp_client);
> +
> +static int __init pp_init(void)
> +{
> + int rc;
> +
> + if (debugfs_initialized())
> + pp_debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
> +
> + rc = ntb_register_client(&pp_client);
> + if (rc)
> + goto err_client;
> +
> + return 0;
> +
> +err_client:
> + debugfs_remove_recursive(pp_debugfs_dir);
> + return rc;
> +}
> +module_init(pp_init);
> +
> +static void __exit pp_exit(void)
> +{
> + ntb_unregister_client(&pp_client);
> + debugfs_remove_recursive(pp_debugfs_dir);
> +}
> +module_exit(pp_exit);
> --
> 2.1.4




RE: [PATCH v2 6/8] ntb_tool: Add link status and files to debugfs

2016-06-15 Thread Allen Hubbe
From: Logan Gunthorpe
> On 14/06/16 03:46 PM, Allen Hubbe wrote:
> > The ntb_tool is intended to be a simple low level access to the ntb.h api.
> > As much as possible, I think ntb_tool should directly expose the ntb.h api
> > through debugfs, and not invent higher level concepts.
> 
> I really think practical concerns should override this. If we do it that
> way then my ntb_test script wouldn't necessarily work reliably and we'd
> just be asking for race conditions. (Especially if I moved the memory
> window tests earlier.) Anyone else trying to script with ntb_tool would
> run into the same problem.
> 
> Additionally, the link is up _and_ the hardware is configured/usable
> isn't really that high level a concept or anything a user wouldn't
> expect already.

If the user is debugging some issue in their hardware or driver, they may care 
to know that the link is reported up by the driver, even if some other 
configuration didn't work as expected.  Debugging the api-level behaviors of 
hardware and hardware drivers is the primary purpose of ntb_tool.  As you note 
below, ntb_tool is not intended to support real applications.

> 
> My understanding is that ntb_tool is really just a test client to verify
> the API and the hardware. I personally would not recommend it for any
> real applications. As such, I don't think this philosophical argument
> really matches that goal.

The purpose is to "verify the API and the hardware", not to support "real 
applications."

The link status reported by the tool should be the link status reported by "the 
API and the hardware," and not something else that might be convenient for "my 
ntb_test script" or "anyone else trying to script with ntb_tool."  The primary 
purpose of ntb_tool is api-level debugging of hardware and drivers, not 
scripting.

The problem with races in ntb_tool is due to auto-configuration of memory 
windows in ntb_tool.  Instead of having ntb_tool setup the memory windows 
automatically, maybe instead it should provide a file to control the memory 
windows via debugfs.  Reading the file can format what is returned by 
ntb_mw_get_range(), and writing the file can allocate a buffer and call 
ntb_mw_set_trans(), or ntb_mw_clear_trans() and free the buffer.  Then, the 
test script can wait for link up, then setup the memory windows, and then 
finally proceed with the rest of the tests, and there would be no race.  There 
would be no confusion about what "link up" means, and ntb_tool would more 
closely resemble the ntb.h api for memory windows.
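
Roughly, the ordering in the test script would then look like the sketch
below.  The name "mw_trans0" is only a placeholder for whatever the control
file ends up being called, and DBG_DIR points at a scratch directory here so
the sequence can be run without hardware; on real debugfs the read of
link_event is the step that would block.

```shell
# Sketch of the ordering only.  "mw_trans0" is a placeholder name for the
# proposed control file, and DBG_DIR is a scratch directory standing in
# for the ntb_tool debugfs directory.
DBG_DIR=$(mktemp -d)

setup_sequence()
{
	# 1. Wait for link up: prime link_event, then read it.  With the
	#    real driver the read would not return until the link is up.
	echo Y > "$DBG_DIR/link_event" || return 1
	cat "$DBG_DIR/link_event" > /dev/null || return 1

	# 2. Only now ask for a 16k buffer and translation on window 0.
	echo 16384 > "$DBG_DIR/mw_trans0" || return 1

	# 3. Read back the window info before starting the mw tests.
	cat "$DBG_DIR/mw_trans0"
}

setup_sequence
```

Since the window is set up by the script after it has seen link up, the tool
itself has nothing to race against.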

> 
> 
> >>> If this was never set false anywhere in the patch that added memory
> >>> windows, I wonder if there is a bug.
> >> Yup, this looks like an oversight on my part. However, I don't think it
> >> resulted in any noticeable bug seeing, at the time, the only way to
> >> bring the link back down was to remove the module or the device. It is
> >> only strictly necessary now that we have the 'link' file which can
> >> control the link.
> >
> > Even without a file to control the link, any one side could be unloaded and
> > reloaded.  That also affects the link state on the side that stays loaded.
> > The side that stays loaded still needs to be sane when the link comes back up.
> 
> Yup, you're correct. If the other side of link goes down then
> tc->link_is_up would be incorrect. So, yes, there may be a corner case
> bug there. Though, seeing tc->link_is_up was only previously used to
> cancel potentially queued delayed work it's probably pretty minor.
> 
> This was copied from ntb_perf which looks like it has the same issue.
> I'll make a patch for that in v3.
> 
> >>> I think tc->link_is_up should instead be ntb_link_is_up(tc->ntb).
> >>
> >> I disagree. Bad things will happen if the user waits on the event and
> >> then immediately uses the memory windows. It will just be buggy and
> >> racy. I can't see a situation where the user would want to wait for the
> >> link to come up and not have everything in ntb_tool ready and usable.
> >
> > The memory windows can be configured prior to link up.  They can be
> > configured when probing the device instead of waiting for link up.  Doing
> > memory window configuration in probe would simplify the driver, and there
> > would be no race.
> 
> I'm not sure this is true, especially considering all possible hardware.
> It's certainly not true with the hardware I'm working with and I'd
> assume that all the existing NTB clients configured their memory windows
> on link up and not in 

RE: [PATCH v2 6/8] ntb_tool: Add link status and files to debugfs

2016-06-14 Thread Allen Hubbe
From: Logan Gunthorpe
> On 14/06/16 01:33 PM, Allen Hubbe wrote:
> >> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> >> index cba31fd..9bebd0d 100644
> >> --- a/drivers/ntb/test/ntb_tool.c
> >> +++ b/drivers/ntb/test/ntb_tool.c
> >> @@ -59,6 +59,13 @@
> >>   *
> >>   * Eg: check if clearing the doorbell mask generates an interrupt.
> >>   *
> >> + * # Check the link status
> >> + * root@self# cat $DBG_DIR/link
> >> + *
> >> + * # Block until the link is up
> >> + * root@self# echo Y > $DBG_DIR/link_event
> >> + * root@self# cat $DBG_DIR/link_event
> >> + *
> >>   * # Set the doorbell mask
> >>   * root@self# echo 's 1' > $DBG_DIR/mask
> >>   *
> >> @@ -126,7 +133,9 @@ struct tool_ctx {
> >>struct dentry *dbgfs;
> >>struct work_struct link_cleanup;
> >>bool link_is_up;
> >
> > Really, link_is_up means "memory windows are configured."  This comes from
> > your earlier patch that introduced memory windows to ntb_tool.
> 
> Yes, this is technically true. However, I don't think the distinction is
> necessary. The user only really cares whether everything is up and
> usable -- not whether the link is just physically up or not.
> 

The ntb_tool is intended to be a simple low level access to the ntb.h api.  As 
much as possible, I think ntb_tool should directly expose the ntb.h api through 
debugfs, and not invent higher level concepts.

> 
> >> +  bool link_event;
> >>struct delayed_work link_work;
> >> +  wait_queue_head_t link_wq;
> >>int mw_count;
> >>struct tool_mw mws[MAX_MWS];
> >>  };
> >> @@ -237,6 +246,7 @@ static void tool_link_work(struct work_struct *work)
> >>"Error setting up memory windows: %d\n", rc);
> >>
> >>tc->link_is_up = true;
> >
> > In other words, "memory windows are configured" = true.
> 
> Technically, yes.
> 
> >> +  wake_up(&tc->link_wq);
> >>  }
> >>
> >>  static void tool_link_cleanup(struct work_struct *work)
> >> @@ -246,6 +256,9 @@ static void tool_link_cleanup(struct work_struct *work)
> >>
> >>if (!tc->link_is_up)
> >>cancel_delayed_work_sync(&tc->link_work);
> >> +
> >> +  tc->link_is_up = false;
> >
> > If this was never set false anywhere in the patch that added memory
> > windows, I wonder if there is a bug.
> 
> Yup, this looks like an oversight on my part. However, I don't think it
> resulted in any noticeable bug seeing, at the time, the only way to
> bring the link back down was to remove the module or the device. It is
> only strictly necessary now that we have the 'link' file which can
> control the link.

Even without a file to control the link, any one side could be unloaded and 
reloaded.  That also affects the link state on the side that stays loaded.  The 
side that stays loaded still needs to be sane when the link comes back up.

> 
> >> +  wake_up(&tc->link_wq);
> >>  }
> >>
> >>  static void tool_link_event(void *ctx)
> >> @@ -578,6 +591,95 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
> >>  tool_peer_spad_read,
> >>  tool_peer_spad_write);
> >>
> >> +static ssize_t tool_link_read(struct file *filep, char __user *ubuf,
> >> +size_t size, loff_t *offp)
> >> +{
> >> +  struct tool_ctx *tc = filep->private_data;
> >> +  char buf[3];
> >> +
> >> +  buf[0] = tc->link_is_up ? 'Y' : 'N';
> >
> > I think tc->link_is_up should instead be ntb_link_is_up(tc->ntb).
> 
> I disagree. Bad things will happen if the user waits on the event and
> then immediately uses the memory windows. It will just be buggy and
> racy. I can't see a situation where the user would want to wait for the
> link to come up and not have everything in ntb_tool ready and usable.

The memory windows can be configured prior to link up.  They can be configured 
when probing the device instead of waiting for link up.  Doing memory window 
configuration in probe would simplify the driver, and there would be no race.

> 
> >> +  buf[1] = '\n';
> >> +  buf[2] = '\0';
> >> +
> >> +  return simple_read_from_buffer(ubuf, size, offp, buf, 2);
> >> +}
> >> +
> >> +static ssize_t tool_link_write(struct file *filep

RE: [PATCH v2 6/8] ntb_tool: Add link status and files to debugfs

2016-06-14 Thread Allen Hubbe
From: Logan Gunthorpe
> In order to more successfully script with ntb_tool it's useful to
> have a link file to check the link status so that the script
> doesn't use the other files until the link is up.
> 
> This commit adds a 'link' file to the debugfs directory which reads a
> boolean (Y or N) depending on the link status. Writing to the file will
> change the link state using ntb_link_enable or ntb_link_disable.
> 
> A 'link_event' file is also provided so an application can block until
> the link changes to a desired state. This file is primed by writing a
> boolean. If the user writes a 1, the next read of link_event will
> block until the link is up. If the user writes a 0, the next read
> will block until the link is down. Besides blocking, reads return the
> same value as the 'link' file.
> 
> Signed-off-by: Logan Gunthorpe 
> ---
>  drivers/ntb/test/ntb_tool.c | 111 +++-
>  1 file changed, 110 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index cba31fd..9bebd0d 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -59,6 +59,13 @@
>   *
>   * Eg: check if clearing the doorbell mask generates an interrupt.
>   *
> + * # Check the link status
> + * root@self# cat $DBG_DIR/link
> + *
> + * # Block until the link is up
> + * root@self# echo Y > $DBG_DIR/link_event
> + * root@self# cat $DBG_DIR/link_event
> + *
>   * # Set the doorbell mask
>   * root@self# echo 's 1' > $DBG_DIR/mask
>   *
> @@ -126,7 +133,9 @@ struct tool_ctx {
>   struct dentry *dbgfs;
>   struct work_struct link_cleanup;
>   bool link_is_up;

Really, link_is_up means "memory windows are configured."  This comes from your 
earlier patch that introduced memory windows to ntb_tool.

> + bool link_event;
>   struct delayed_work link_work;
> + wait_queue_head_t link_wq;
>   int mw_count;
>   struct tool_mw mws[MAX_MWS];
>  };
> @@ -237,6 +246,7 @@ static void tool_link_work(struct work_struct *work)
>   "Error setting up memory windows: %d\n", rc);
> 
>   tc->link_is_up = true;

In other words, "memory windows are configured" = true.

> + wake_up(&tc->link_wq);
>  }
> 
>  static void tool_link_cleanup(struct work_struct *work)
> @@ -246,6 +256,9 @@ static void tool_link_cleanup(struct work_struct *work)
> 
>   if (!tc->link_is_up)
>   cancel_delayed_work_sync(&tc->link_work);
> +
> + tc->link_is_up = false;

If this was never set false anywhere in the patch that added memory windows, I 
wonder if there is a bug.

> + wake_up(&tc->link_wq);
>  }
> 
>  static void tool_link_event(void *ctx)
> @@ -578,6 +591,95 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
> tool_peer_spad_read,
> tool_peer_spad_write);
> 
> +static ssize_t tool_link_read(struct file *filep, char __user *ubuf,
> +   size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[3];
> +
> + buf[0] = tc->link_is_up ? 'Y' : 'N';

I think tc->link_is_up should instead be ntb_link_is_up(tc->ntb).

> + buf[1] = '\n';
> + buf[2] = '\0';
> +
> + return simple_read_from_buffer(ubuf, size, offp, buf, 2);
> +}
> +
> +static ssize_t tool_link_write(struct file *filep, const char __user *ubuf,
> +size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[32];
> + size_t buf_size;
> + bool val;
> + int rc;
> +
> + buf_size = min(size, (sizeof(buf) - 1));
> + if (copy_from_user(buf, ubuf, buf_size))
> + return -EFAULT;
> +
> + buf[buf_size] = '\0';
> +
> + rc = strtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (val)
> + ntb_link_enable(tc->ntb, NTB_SPEED_AUTO, NTB_WIDTH_AUTO);
> + else
> + ntb_link_disable(tc->ntb);
> +
> + return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_fops,
> +   tool_link_read,
> +   tool_link_write);
> +
> +static ssize_t tool_link_event_read(struct file *filep, char __user *ubuf,
> + size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = filep->private_data;
> + char buf[3];
> +
> + if (wait_event_interruptible(tc->link_wq,
> +  tc->link_is_up == tc->link_event))

I think tc->link_is_up should instead be ntb_link_is_up(tc->ntb).

> + return -ERESTART;
> +
> + buf[0] = tc->link_is_up ? 'Y' : 'N';
> + buf[1] = '\n';
> + buf[2] = '\0';
> +
> + return simple_read_from_buffer(ubuf, size, offp, buf, 2);
> +}
> +
> +static ssize_t tool_link_event_write(struct file *filep,
> +  const char __user *ubuf,
> +  size_t size, loff_t *offp)
> +{
> + struct tool_ctx *tc = 

RE: [PATCH 6/8] ntb_tool: Add link status file to debugfs

2016-06-14 Thread Allen Hubbe
From: Logan Gunthorpe
> On 14/06/16 09:45 AM, Allen Hubbe wrote:
> >
> > Feel free to disregard my suggestion above.  I hope my comment has not cost
> > you too much time.
> >
> > The way you have written it already, and used it in the self-test script is
> > much more concise.
> >
> >>> + * root@self# echo > $DBG_DIR/link
> >
> > Acked-by: allen.hu...@emc.com
> >
> >
> >
> > Eventually, I think it would be useful to let ntb_tool enable and disable
> > the link.  In that case, it might also be useful in a test script to wait
> > for link down, not just link up.
> >
> > What about this:
> >
> > # Wait for the link to be up or down
> > root@self# echo 1 > $DBG_DIR/link
> > root@self# echo 0 > $DBG_DIR/link
> >
> > It need not be a part of this patch, but eventually:
> >
> > # Enable or disable the link
> > root@self# echo 1 > $DBG_DIR/link_ctrl
> > root@self# echo 0 > $DBG_DIR/link_ctrl
> >
> > # Reading the link_ctrl file can also give the link status
> > root@self# cat $DBG_DIR/link_ctrl
> >
> > Finally, I wonder if the file called "link" in this patch should be called
> > "link_wait" or similar, so its purpose is obviously not for enabling and
> > disabling the link.
> >
> 
> Actually I've already implemented something similar to your original
> suggestion. I'll be submitting a v2 of this set shortly.

Ok.  Thanks.  I'll accept the blame if anyone doesn't like it.



RE: [PATCH 8/8] ntb_test: Add a selftest script for the NTB subsystem

2016-06-14 Thread Allen Hubbe
From: Logan Gunthorpe
> This script automates testing doorbells, scratchpads and memory windows
> for an NTB device. It can be run locally, with the NTB looped
> back to the same host or use SSH to remotely control the second host.
> 
> In the single host case, the script just needs to be passed two
> arguments: a PCI ID for each side of the link. In the two host case
> the -r option must be used to specify the remote hostname (which must
> be SSH accessible and should probably have ssh-keys exchanged).
> 
> A sample run looks like this:
> 
> $ sudo ./ntb_test.sh :03:00.1 :83:00.1 -p 29
> Starting ntb_tool tests...
> Running db tests on: :03:00.1 / :83:00.1
>   Passed
> Running db tests on: :83:00.1 / :03:00.1
>   Passed
> Running spad tests on: :03:00.1 / :83:00.1
>   Passed
> Running spad tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw0 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw0 tests on: :83:00.1 / :03:00.1
>   Passed
> Running mw1 tests on: :03:00.1 / :83:00.1
>   Passed
> Running mw1 tests on: :83:00.1 / :03:00.1
>   Passed
> 
> Starting ntb_pingpong tests...
> Running ping pong tests on: :03:00.1 / :83:00.1
>   Passed
> 
> Starting ntb_perf tests...
> Running local perf test without DMA
>   0: copied 536870912 bytes in 238205 usecs, 2253 MBytes/s
>   Passed
> Running remote perf test without DMA
>   0: copied 536870912 bytes in 238205 usecs, 2253 MBytes/s
>   Passed
> 
> Signed-off-by: Logan Gunthorpe 
> ---
>  MAINTAINERS |   1 +
>  tools/testing/selftests/ntb/ntb_test.sh | 386 
> 
>  2 files changed, 387 insertions(+)
>  create mode 100755 tools/testing/selftests/ntb/ntb_test.sh
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9c567a4..f178e7e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7846,6 +7846,7 @@ F:  drivers/ntb/
>  F:   drivers/net/ntb_netdev.c
>  F:   include/linux/ntb.h
>  F:   include/linux/ntb_transport.h
> +F:   tools/testing/selftests/ntb/
> 
>  NTB INTEL DRIVER
>  M:   Jon Mason 
> diff --git a/tools/testing/selftests/ntb/ntb_test.sh
> b/tools/testing/selftests/ntb/ntb_test.sh
> new file mode 100755
> index 000..e4a89e9
> --- /dev/null
> +++ b/tools/testing/selftests/ntb/ntb_test.sh
> @@ -0,0 +1,386 @@
> +#!/bin/bash
> +# Copyright (c) 2016 Microsemi. All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation; either version 2 of
> +# the License, or (at your option) any later version.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# Author: Logan Gunthorpe 
> +
> +REMOTE_HOST=
> +LIST_DEVS=FALSE
> +
> +DEBUGFS=${DEBUGFS-/sys/kernel/debug}
> +
> +PERF_RUN_ORDER=32
> +MAX_MW_SIZE=0
> +RUN_DMA_TESTS=
> +DONT_CLEANUP=
> +
> +function show_help()
> +{
> + echo "Usage: $0 [OPTIONS] LOCAL_DEV REMOTE_DEV"
> + echo "Run tests on a pair of NTB endpoints."
> + echo
> + echo "If the NTB device loops back to the same host then,"
> + echo "just specifying the two PCI ids on the command line is"
> + echo "sufficient. Otherwise, if the NTB link spans two hosts"
> + echo "use the -r option to specify the hostname for the remote"
> + echo "device. SSH will then be used to test the remote side."
> + echo "An SSH key between the root users of the host would then"
> + echo "be highly recommended."
> + echo
> + echo "Options:"
> + echo "  -C  don't cleanup ntb modules on exit"
> + echo "  -d  run dma tests"
> + echo "  -h  show this help message"
> + echo "  -l  list available local and remote PCI ids"
> + echo "  -r REMOTE_HOST  specify the remote's hostname to connect"
> + echo "  to for the test (using ssh)"
> + echo "  -p NUM  ntb_perf run order (default: $PERF_RUN_ORDER)"
> + echo "  -w max_mw_size  maximum memory window size"
> + echo
> +}
> +
> +function parse_args()
> +{
> + OPTIND=0
> + while getopts "Cdhlr:p:w:" opt; do
> + case "$opt" in
> + C)  DONT_CLEANUP=1 ;;
> + d)  RUN_DMA_TESTS=1 ;;
> + h)  show_help; exit 0 ;;
> + l)  LIST_DEVS=TRUE ;;
> + r)  REMOTE_HOST=${OPTARG} ;;
> + p)  PERF_RUN_ORDER=${OPTARG} ;;
> + w)  MAX_MW_SIZE=${OPTARG} ;;
> + \?)
> + echo "Invalid option: -$OPTARG" >&2
> + exit 1
> + ;;
> + esac
> + done
> +}
> +
> +parse_args "$@"
> +shift $((OPTIND-1))
> +LOCAL_DEV=$1
> +shift
> +parse_args "$@"
> +

RE: [PATCH 6/8] ntb_tool: Add link status file to debugfs

2016-06-14 Thread Allen Hubbe
From: Allen Hubbe
> On Sat, Jun 11, 2016 at 11:28 AM, Logan Gunthorpe  wrote:
> > Hey Allen,
> >
> > Thanks for the feedback it's a bit more complicated but I don't object to
> > that. I'll work something up on Monday.
> >
> > I was trying to avoid adding link controls, but if we do, would you say the
> > module should still enable the link when it's installed? Or would we have
> > the user explicitly have to enable the link before using it?
> 
> I would vote to keep the current behavior and enable the link when the
> module loads.
> 
> >
> > Thanks,
> >
> > Logan
> >
> >
> > On 10/06/16 08:27 PM, Allen Hubbe wrote:
> >>
> >> On Fri, Jun 10, 2016 at 6:54 PM, Logan Gunthorpe 
> >> wrote:
> >>>
> >>> In order to more successfully script with ntb_tool it's useful to
> >>> have a link file to check the link status so that the script
> >>> doesn't use the other files until the link is up.
> >>>
> >>> This commit adds a 'link' file to the debugfs directory which reads
> >>> 0 or 1 depending on the link status. For scripting convenience, writing
> >>> will block until the link is up (discarding anything that was written).
> >>>
> >>> Signed-off-by: Logan Gunthorpe 
> >>> ---
> >>>   drivers/ntb/test/ntb_tool.c | 45
> >>> +
> >>>   1 file changed, 45 insertions(+)
> >>>
> >>> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> >>> index 954e1d5..116352e 100644
> >>> --- a/drivers/ntb/test/ntb_tool.c
> >>> +++ b/drivers/ntb/test/ntb_tool.c
> >>> @@ -59,6 +59,12 @@
> >>>*
> >>>* Eg: check if clearing the doorbell mask generates an interrupt.
> >>>*
> >>> + * # Check the link status
> >>> + * root@self# cat $DBG_DIR/link
> >>> + *
> >>> + * # Block until the link is up
> >>> + * root@self# echo > $DBG_DIR/link
> >>
> >>
> >> I think a file to get and set the link status is a good idea, but the
> >> way it is done as proposed here is not in a similar style to other
> >> ntb_tool operations.  Other operations simply read a register and
> >> format the value, or scan a value and write a register.  Similarly, I
> >> think the link status could be done in the same way: use the read file
> >> operation to get the current status with ntb_link_is_up(), and use the
> >> file write operation to enable or disable the link with
> >> ntb_link_enable() and ntb_link_disable().
> >>
> >> Waiting for link status is an interesting concept, too.  Really, one
> >> might be interested in a change in link status, whether up or down.
> >> What about a link event file that supports write to arm the event, and
> >> read to block for the event.  Consider an implementation based on
> >> .  It would be used in combination with the link
> >> status file, above, as follows.
> >>
> >> 1: Write 1 to the event file.  This arms the event.
> >>- The event will be disarmed by the next tool_link_event().
> >>
> >> 2: The application may read the link status file if it is interested
> >> in waiting for a particular event.
> >>
> >> 3. The application may wait for an event by reading the event file
> >>- The application will wait as long as the event is still armed.
> >>- If the event was disarmed before waiting, the application will not
> >> block.
> >>
> >> 4. The application should read the link status again.
> >>
> >> In any case, I think it would be more expected and natural to block
> >> while reading a file versus writing it.

Feel free to disregard my suggestion above.  I hope my comment has not cost you 
too much time.

The way you have written it already, and used it in the self-test script is 
much more concise.

> > + * root@self# echo > $DBG_DIR/link

Acked-by: allen.hu...@emc.com



Eventually, I think it would be useful to let ntb_tool enable and disable the 
link.  In that case, it might also be useful in a test script to wait for link 
down, not just link up.

What about this:

# Wait for the link to be up or down
root@self# echo 1 > $DBG_DIR/link
root@self# echo 0 > $DBG_DIR/link

It need not be a part of this patch, but eventually:

# Enable or disable the link
root@self# echo 1 > $DBG_DIR/link_ctrl
root@self# echo 0 > $DBG_DIR/link_

Re: [PATCH 6/8] ntb_tool: Add link status file to debugfs

2016-06-11 Thread Allen Hubbe
On Sat, Jun 11, 2016 at 11:28 AM, Logan Gunthorpe  wrote:
> Hey Allen,
>
> Thanks for the feedback it's a bit more complicated but I don't object to
> that. I'll work something up on Monday.
>
> I was trying to avoid adding link controls, but if we do, would you say the
> module should still enable the link when it's installed? Or would we have
> the user explicitly have to enable the link before using it?

I would vote to keep the current behavior and enable the link when the
module loads.

>
> Thanks,
>
> Logan
>
>
> On 10/06/16 08:27 PM, Allen Hubbe wrote:
>>
>> On Fri, Jun 10, 2016 at 6:54 PM, Logan Gunthorpe 
>> wrote:
>>>
>>> In order to more successfully script with ntb_tool it's useful to
>>> have a link file to check the link status so that the script
>>> doesn't use the other files until the link is up.
>>>
>>> This commit adds a 'link' file to the debugfs directory which reads
>>> 0 or 1 depending on the link status. For scripting convenience, writing
>>> will block until the link is up (discarding anything that was written).
>>>
>>> Signed-off-by: Logan Gunthorpe 
>>> ---
>>>   drivers/ntb/test/ntb_tool.c | 45
>>> +
>>>   1 file changed, 45 insertions(+)
>>>
>>> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
>>> index 954e1d5..116352e 100644
>>> --- a/drivers/ntb/test/ntb_tool.c
>>> +++ b/drivers/ntb/test/ntb_tool.c
>>> @@ -59,6 +59,12 @@
>>>*
>>>* Eg: check if clearing the doorbell mask generates an interrupt.
>>>*
>>> + * # Check the link status
>>> + * root@self# cat $DBG_DIR/link
>>> + *
>>> + * # Block until the link is up
>>> + * root@self# echo > $DBG_DIR/link
>>
>>
>> I think a file to get and set the link status is a good idea, but the
>> way it is done as proposed here is not in a similar style to other
>> ntb_tool operations.  Other operations simply read a register and
>> format the value, or scan a value and write a register.  Similarly, I
>> think the link status could be done in the same way: use the read file
>> operation to get the current status with ntb_link_is_up(), and use the
>> file write operation to enable or disable the link with
>> ntb_link_enable() and ntb_link_disable().
>>
>> Waiting for link status is an interesting concept, too.  Really, one
>> might be interested in a change in link status, whether up or down.
>> What about a link event file that supports write to arm the event, and
>> read to block for the event.  Consider an implementation based on
>> .  It would be used in combination with the link
>> status file, above, as follows.
>>
>> 1: Write 1 to the event file.  This arms the event.
>>- The event will be disarmed by the next tool_link_event().
>>
>> 2: The application may read the link status file if it is interested
>> in waiting for a particular event.
>>
>> 3. The application may wait for an event by reading the event file
>>- The application will wait as long as the event is still armed.
>>- If the event was disarmed before waiting, the application will not
>> block.
>>
>> 4. The application should read the link status again.
>>
>> In any case, I think it would be more expected and natural to block
>> while reading a file versus writing it.
>>
>>> + *
>>>* # Set the doorbell mask
>>>* root@self# echo 's 1' > $DBG_DIR/mask
>>>*
>>> @@ -127,6 +133,7 @@ struct tool_ctx {
>>>  struct work_struct link_cleanup;
>>>  bool link_is_up;
>>>  struct delayed_work link_work;
>>> +   wait_queue_head_t link_wq;
>>>  int mw_count;
>>>  struct tool_mw mws[MAX_MWS];
>>>   };
>>> @@ -237,6 +244,7 @@ static void tool_link_work(struct work_struct *work)
>>>  "Error setting up memory windows: %d\n", rc);
>>>
>>>  tc->link_is_up = true;
>>> +   wake_up(&tc->link_wq);
>>>   }
>>>
>>>   static void tool_link_cleanup(struct work_struct *work)
>>> @@ -573,6 +581,39 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
>>>tool_peer_spad_read,
>>>tool_peer_spad_write);
>>>
>>> +static ssize_t tool_link_read(struct file *filep, char __user *

Re: [PATCH 7/8] ntb_pingpong: Add a debugfs file to get the ping count

2016-06-10 Thread Allen Hubbe
On Fri, Jun 10, 2016 at 6:54 PM, Logan Gunthorpe  wrote:
> This commit adds a debugfs 'count' file to ntb_pingpong. This is so
> testing with ntb_pingpong can be automated beyond just checking the
> logs for pong messages.
>
> The count file returns a number which increments every pong. The
> counter can be cleared by writing a zero.
>
> Signed-off-by: Logan Gunthorpe 
> ---
>  drivers/ntb/test/ntb_pingpong.c | 68 
> -
>  1 file changed, 67 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
> index fe16005..34bbf5a 100644
> --- a/drivers/ntb/test/ntb_pingpong.c
> +++ b/drivers/ntb/test/ntb_pingpong.c
> @@ -61,6 +61,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>
> @@ -96,8 +97,13 @@ struct pp_ctx {
> spinlock_t  db_lock;
> struct timer_list   db_timer;
> unsigned long   db_delay;
> +   struct dentry   *debugfs_node_dir;
> +   struct dentry   *debugfs_count;
> +   atomic_t        count;
>  };
>
> +static struct dentry *pp_debugfs_dir;
> +
>  static void pp_ping(unsigned long ctx)
>  {
> struct pp_ctx *pp = (void *)ctx;
> @@ -171,10 +177,38 @@ static void pp_db_event(void *ctx, int vec)
> dev_dbg(&pp->ntb->dev,
> "Pong vec %d bits %#llx\n",
> vec, db_bits);
> +   atomic_inc(&pp->count);
> }
> spin_unlock_irqrestore(&pp->db_lock, irqflags);
>  }
>
> +static int pp_debugfs_setup(struct pp_ctx *pp)
> +{
> +   struct pci_dev *pdev = pp->ntb->pdev;
> +
> +   if (!debugfs_initialized())
> +   return -ENODEV;
> +
> +   if (!pp_debugfs_dir) {
> +   pp_debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);

The pp_debugfs_dir is already initialized by the module init function.
If it doesn't exist here, I think we should just return instead of
trying again.  It's also worth noting, though it is probably no harm,
the code here does not check debugfs_initialized().

> +   if (!pp_debugfs_dir)
> +   return -ENODEV;
> +   }
> +
> +   pp->debugfs_node_dir = debugfs_create_dir(pci_name(pdev),
> + pp_debugfs_dir);
> +   if (!pp->debugfs_node_dir)
> +   return -ENODEV;
> +
> +   pp->debugfs_count = debugfs_create_atomic_t("count", S_IRUSR | 
> S_IWUSR,
> +   pp->debugfs_node_dir,
> +   &pp->count);
> +   if (!pp->debugfs_count)
> +   return -ENODEV;
> +
> +   return 0;
> +}
> +
>  static const struct ntb_ctx_ops pp_ops = {
> .link_event = pp_link_event,
> .db_event = pp_db_event,
> @@ -210,6 +244,7 @@ static int pp_probe(struct ntb_client *client,
>
> pp->ntb = ntb;
> pp->db_bits = 0;
> +   atomic_set(&pp->count, 0);
> spin_lock_init(&pp->db_lock);
> setup_timer(&pp->db_timer, pp_ping, (unsigned long)pp);
> pp->db_delay = msecs_to_jiffies(delay_ms);
> @@ -218,6 +253,10 @@ static int pp_probe(struct ntb_client *client,
> if (rc)
> goto err_ctx;
>
> +   rc = pp_debugfs_setup(pp);
> +   if (rc)
> +   goto err_ctx;
> +
> ntb_link_enable(ntb, NTB_SPEED_AUTO, NTB_WIDTH_AUTO);
> ntb_link_event(ntb);
>
> @@ -234,6 +273,8 @@ static void pp_remove(struct ntb_client *client,
>  {
> struct pp_ctx *pp = ntb->ctx;
>
> +   debugfs_remove_recursive(pp->debugfs_node_dir);
> +
> ntb_clear_ctx(ntb);
> del_timer_sync(&pp->db_timer);
> ntb_link_disable(ntb);
> @@ -247,4 +288,29 @@ static struct ntb_client pp_client = {
> .remove = pp_remove,
> },
>  };
> -module_ntb_client(pp_client);
> +
> +static int __init tool_init(void)

This should be pp_init() not tool_init().

> +{
> +   int rc;
> +
> +   if (debugfs_initialized())
> +   pp_debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
> +
> +   rc = ntb_register_client(&pp_client);
> +   if (rc)
> +   goto err_client;
> +
> +   return 0;
> +
> +err_client:
> +   debugfs_remove_recursive(pp_debugfs_dir);
> +   return rc;
> +}
> +module_init(tool_init);
> +
> +static void __exit tool_exit(void)
> +{
> +   ntb_unregister_client(&pp_client);
> +   debugfs_remove_recursive(pp_debugfs_dir);
> +}
> +module_exit(tool_exit);


Re: [PATCH 5/8] ntb_tool: BUG: Ensure the buffer size is large enough to return all spads

2016-06-10 Thread Allen Hubbe
On Fri, Jun 10, 2016 at 6:54 PM, Logan Gunthorpe  wrote:
> On hardware with 32 scratchpad registers the spad field in ntb tool
> could chop off the end. The maximum buffer size is increased from
> 256 to 15 times the number of scratchpads.
>
> Signed-off-by: Logan Gunthorpe 

It could be marginally better if there was an explanation to accompany
the magic number 15, but it's not a big deal.  One might guess it has
something to do with the expected length of the formatted string.

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 4c01057..954e1d5 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -368,7 +368,9 @@ static ssize_t tool_spadfn_read(struct tool_ctx *tc, char 
> __user *ubuf,
> if (!spad_read_fn)
> return -EINVAL;
>
> -   buf_size = min_t(size_t, size, 0x100);
> +   spad_count = ntb_spad_count(tc->ntb);
> +
> +   buf_size = min_t(size_t, size, spad_count * 15);
>
> buf = kmalloc(buf_size, GFP_KERNEL);
> if (!buf)
> @@ -376,7 +378,6 @@ static ssize_t tool_spadfn_read(struct tool_ctx *tc, char 
> __user *ubuf,
>
> pos = 0;
>
> -   spad_count = ntb_spad_count(tc->ntb);
> for (i = 0; i < spad_count; ++i) {
> pos += scnprintf(buf + pos, buf_size - pos, "%d\t%#x\n",
>  i, spad_read_fn(tc->ntb, i));
> --
> 2.1.4
>


Re: [PATCH 6/8] ntb_tool: Add link status file to debugfs

2016-06-10 Thread Allen Hubbe
On Fri, Jun 10, 2016 at 6:54 PM, Logan Gunthorpe  wrote:
> In order to more successfully script with ntb_tool it's useful to
> have a link file to check the link status so that the script
> doesn't use the other files until the link is up.
>
> This commit adds a 'link' file to the debugfs directory which reads
> 0 or 1 depending on the link status. For scripting convenience, writing
> will block until the link is up (discarding anything that was written).
>
> Signed-off-by: Logan Gunthorpe 
> ---
>  drivers/ntb/test/ntb_tool.c | 45 
> +
>  1 file changed, 45 insertions(+)
>
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 954e1d5..116352e 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -59,6 +59,12 @@
>   *
>   * Eg: check if clearing the doorbell mask generates an interrupt.
>   *
> + * # Check the link status
> + * root@self# cat $DBG_DIR/link
> + *
> + * # Block until the link is up
> + * root@self# echo > $DBG_DIR/link

I think a file to get and set the link status is a good idea, but the
way it is done as proposed here is not in a similar style to other
ntb_tool operations.  Other operations simply read a register and
format the value, or scan a value and write a register.  Similarly, I
think the link status could be done in the same way: use the read file
operation to get the current status with ntb_link_is_up(), and use the
file write operation to enable or disable the link with
ntb_link_enable() and ntb_link_disable().

Waiting for link status is an interesting concept, too.  Really, one
might be interested in a change in link status, whether up or down.
What about a link event file that supports write to arm the event, and
read to block for the event.  Consider an implementation based on
.  It would be used in combination with the link
status file, above, as follows.

1: Write 1 to the event file.  This arms the event.
  - The event will be disarmed by the next tool_link_event().

2: The application may read the link status file if it is interested
in waiting for a particular event.

3. The application may wait for an event by reading the event file
  - The application will wait as long as the event is still armed.
  - If the event was disarmed before waiting, the application will not block.

4. The application should read the link status again.

In any case, I think it would be more expected and natural to block
while reading a file versus writing it.
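
The arm/wait steps above can be modeled as a small state machine. This is only a single-threaded userspace sketch of the proposed semantics, under stated assumptions: struct link_event and the link_event_*() helpers are hypothetical names, and a real ntb_tool implementation would need a kernel wait primitive rather than a plain flag.

```c
#include <stdbool.h>

/* Hypothetical model of the proposed link event file. */
struct link_event {
    bool armed;
};

/* Step 1: writing 1 to the event file arms the event. */
static void link_event_arm(struct link_event *ev)
{
    ev->armed = true;
}

/* The next link event (up or down) disarms it. */
static void link_event_fire(struct link_event *ev)
{
    ev->armed = false;
}

/* Step 3: a read of the event file blocks only while still armed. */
static bool link_event_would_block(const struct link_event *ev)
{
    return ev->armed;
}
```

Because the event is armed before the status is read in step 2, a link change that lands between steps 2 and 3 disarms the event, so the reader does not block and goes straight on to re-read the status in step 4.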

> + *
>   * # Set the doorbell mask
>   * root@self# echo 's 1' > $DBG_DIR/mask
>   *
> @@ -127,6 +133,7 @@ struct tool_ctx {
> struct work_struct link_cleanup;
> bool link_is_up;
> struct delayed_work link_work;
> +   wait_queue_head_t link_wq;
> int mw_count;
> struct tool_mw mws[MAX_MWS];
>  };
> @@ -237,6 +244,7 @@ static void tool_link_work(struct work_struct *work)
> "Error setting up memory windows: %d\n", rc);
>
> tc->link_is_up = true;
> +   wake_up(&tc->link_wq);
>  }
>
>  static void tool_link_cleanup(struct work_struct *work)
> @@ -573,6 +581,39 @@ static TOOL_FOPS_RDWR(tool_peer_spad_fops,
>   tool_peer_spad_read,
>   tool_peer_spad_write);
>
> +static ssize_t tool_link_read(struct file *filep, char __user *ubuf,
> + size_t size, loff_t *offp)
> +{
> +   struct tool_ctx *tc = filep->private_data;
> +   char *buf;
> +   ssize_t pos, rc;
> +
> +   buf = kmalloc(64, GFP_KERNEL);
> +   if (!buf)
> +   return -ENOMEM;
> +
> +   pos = scnprintf(buf, 64, "%d\n", tc->link_is_up);
> +   rc = simple_read_from_buffer(ubuf, size, offp, buf, pos);
> +
> +   kfree(buf);
> +
> +   return rc;
> +}
> +
> +static ssize_t tool_link_write(struct file *filep, const char __user *ubuf,
> +  size_t size, loff_t *offp)
> +{
> +   struct tool_ctx *tc = filep->private_data;
> +
> +   if (wait_event_interruptible(tc->link_wq, tc->link_is_up))
> +   return -ERESTART;
> +
> +   return size;
> +}
> +
> +static TOOL_FOPS_RDWR(tool_link_fops,
> + tool_link_read,
> + tool_link_write);
>
>  static ssize_t tool_mw_read(struct file *filep, char __user *ubuf,
> size_t size, loff_t *offp)
> @@ -708,6 +749,9 @@ static void tool_setup_dbgfs(struct tool_ctx *tc)
> debugfs_create_file("peer_spad", S_IRUSR | S_IWUSR, tc->dbgfs,
> tc, &tool_peer_spad_fops);
>
> +   debugfs_create_file("link", S_IRUSR | S_IWUSR, tc->dbgfs,
> +   tc, &tool_link_fops);
> +
> mw_count = min(ntb_mw_count(tc->ntb), MAX_MWS);
> for (i = 0; i < mw_count; i++) {
> char buf[30];
> @@ -741,6 +785,7 @@ static int tool_probe(struct ntb_client *self, struct 
> ntb_dev *ntb)
> }
>
>   

RE: [PATCH] documentation: ntb.txt correct grammar "however"

2016-06-06 Thread Allen Hubbe
From: Austin S. Hemmelgarn
> On 2016-06-04 21:36, Ken Moffat wrote:
> > On Sat, Jun 04, 2016 at 03:34:01PM -0400, Justin Keller wrote:
> >> Correct the grammar around the word however.

> >> -besides Netdev, however no other applications have yet been written.
> >> +besides Netdev; however, no other applications have yet been written.
> >
> > As a user of British English, the original looks fine.  Your change,
> > however, looks odd - a semi-colon seems out of place.  If you
> > replaced it by a full-stop it would look acceptable to me - but not
> > in any sense better than what is there at the moment.
> FWIW, the existing usage in the file is common enough in at least
> British, American, and Australian English to be borderline idiomatic
> syntax, but is technically not correct based on traditional punctuation
> rules in any of them.
> 
> Personally, I'd leave it as is, especially considering that usage is
> also used by most translation services, and that proper usage of
> semicolons isn't taught much anymore even in collegiate English courses,
> so many younger individuals who speak English natively will think it
> looks odd.

As the however-challenged author, I have no preference one way or any other 
about the phrasing of this line.

To address others' concerns, perhaps I might suggest a change that resolves the 
correctness issue without appearing odd: change however to but.



RE: [PATCH 3/3] ntb_tool: Add memory window debug support

2016-06-03 Thread Allen Hubbe
From: Logan Gunthorpe 
> We allocate some memory window buffers when the link comes up, then we
> provide debugfs files to read/write each side of the link.
> 
> This is useful for debugging the mapping when writing new drivers.
> 
> Signed-off-by: Logan Gunthorpe 

Thanks! This was on my wish list.

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 258 
> +++-
>  1 file changed, 257 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 209ef7c..4c01057 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -89,6 +89,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
> 
> @@ -105,11 +106,29 @@ MODULE_VERSION(DRIVER_VERSION);
>  MODULE_AUTHOR(DRIVER_AUTHOR);
>  MODULE_DESCRIPTION(DRIVER_DESCRIPTION);
> 
> +#define MAX_MWS 16
> +
> +static unsigned long mw_size = 16;
> +module_param(mw_size, ulong, 0644);
> +MODULE_PARM_DESC(mw_size, "size order [2^n] of the memory window for testing");
> +
>  static struct dentry *tool_dbgfs;
> 
> +struct tool_mw {
> + resource_size_t size;
> + u8 __iomem *local;
> + u8 *peer;
> + dma_addr_t peer_dma;
> +};
> +
>  struct tool_ctx {
>   struct ntb_dev *ntb;
>   struct dentry *dbgfs;
> + struct work_struct link_cleanup;
> + bool link_is_up;
> + struct delayed_work link_work;
> + int mw_count;
> + struct tool_mw mws[MAX_MWS];
>  };
> 
>  #define SPAD_FNAME_SIZE 0x10
> @@ -124,6 +143,111 @@ struct tool_ctx {
>   .write = __write,   \
>   }
> 
> +static int tool_setup_mw(struct tool_ctx *tc, int idx)
> +{
> + int rc;
> + struct tool_mw *mw = &tc->mws[idx];
> + phys_addr_t base;
> + resource_size_t size, align, align_size;
> +
> + if (mw->local)
> + return 0;
> +
> + rc = ntb_mw_get_range(tc->ntb, idx, &base, &size, &align,
> +   &align_size);
> + if (rc)
> + return rc;
> +
> + mw->size = min_t(resource_size_t, 1 << mw_size, size);
> + mw->size = round_up(mw->size, align);
> + mw->size = round_up(mw->size, align_size);
> +
> + mw->local = ioremap_wc(base, size);
> + if (mw->local == NULL)
> + return -EFAULT;
> +
> + mw->peer = dma_alloc_coherent(&tc->ntb->pdev->dev, mw->size,
> +   &mw->peer_dma, GFP_KERNEL);
> +
> + if (mw->peer == NULL)
> + return -ENOMEM;
> +
> + rc = ntb_mw_set_trans(tc->ntb, idx, mw->peer_dma, mw->size);
> + if (rc)
> + return rc;
> +
> + return 0;
> +}
> +
> +static void tool_free_mws(struct tool_ctx *tc)
> +{
> + int i;
> +
> + for (i = 0; i < tc->mw_count; i++) {
> + if (tc->mws[i].peer) {
> + ntb_mw_clear_trans(tc->ntb, i);
> + dma_free_coherent(&tc->ntb->pdev->dev, tc->mws[i].size,
> +   tc->mws[i].peer,
> +   tc->mws[i].peer_dma);
> +
> + }
> +
> + tc->mws[i].peer = NULL;
> + tc->mws[i].peer_dma = 0;
> +
> + if (tc->mws[i].local)
> + iounmap(tc->mws[i].local);
> +
> + tc->mws[i].local = NULL;
> + }
> +
> + tc->mw_count = 0;
> +}
> +
> +static int tool_setup_mws(struct tool_ctx *tc)
> +{
> + int i;
> + int rc;
> +
> + tc->mw_count = min(ntb_mw_count(tc->ntb), MAX_MWS);
> +
> + for (i = 0; i < tc->mw_count; i++) {
> + rc = tool_setup_mw(tc, i);
> + if (rc)
> + goto err_out;
> + }
> +
> + return 0;
> +
> +err_out:
> + tool_free_mws(tc);
> + return rc;
> +}
> +
> +static void tool_link_work(struct work_struct *work)
> +{
> + int rc;
> + struct tool_ctx *tc = container_of(work, struct tool_ctx,
> +link_work.work);
> +
> + tool_free_mws(tc);
> + rc = tool_setup_mws(tc);
> + if (rc)
> + dev_err(&tc->ntb->dev,
> + "Error setting up memory windows: %d\n", rc);
> +
> + tc->link_is_up = true;
> +}
> +
> +static void tool_link_cleanup(struct work_struct *work)
> +{
> + struct tool_ctx *tc = container_of(wor

Re: [PATCH] ntb_tool: Fix infinite loop bug when writing spad/peer_spad file

2016-05-28 Thread Allen Hubbe
On Fri, May 27, 2016 at 4:38 PM, Logan Gunthorpe  wrote:
> If you tried to write two spads in one line, as per the example:
>
> root@peer# echo '0 0x01010101 1 0x7f7f7f7f' > $DBG_DIR/peer_spad
>
> then the CPU would freeze in an infinite loop.
>
> This wasn't immediately obvious but 'pos' was not incrementing the
> buffer, so after reading the second pair of values, 'pos' would once
> again be 3 and it would re-read the second pair of values ad infinitum.
>
> Signed-off-by: Logan Gunthorpe 

Good catch.  Thanks Logan.

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/test/ntb_tool.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
> index 6f5dc6c..209ef7c 100644
> --- a/drivers/ntb/test/ntb_tool.c
> +++ b/drivers/ntb/test/ntb_tool.c
> @@ -268,7 +268,7 @@ static ssize_t tool_spadfn_write(struct tool_ctx *tc,
>  {
> int spad_idx;
> u32 spad_val;
> -   char *buf;
> +   char *buf, *buf_ptr;
> int pos, n;
> ssize_t rc;
>
> @@ -288,14 +288,15 @@ static ssize_t tool_spadfn_write(struct tool_ctx *tc,
> }
>
> buf[size] = 0;
> -
> -   n = sscanf(buf, "%d %i%n", &spad_idx, &spad_val, &pos);
> +   buf_ptr = buf;
> +   n = sscanf(buf_ptr, "%d %i%n", &spad_idx, &spad_val, &pos);
> while (n == 2) {
> +   buf_ptr += pos;
> rc = spad_write_fn(tc->ntb, spad_idx, spad_val);
> if (rc)
> break;
>
> -   n = sscanf(buf + pos, "%d %i%n", &spad_idx, &spad_val, &pos);
> +   n = sscanf(buf_ptr, "%d %i%n", &spad_idx, &spad_val, &pos);
> }
>
> if (n < 0)
> --
> 2.1.4
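
The fix quoted above can be sketched in userspace as follows. This is an illustration only, assuming a hypothetical parse_spad_pairs() helper that stores the parsed pairs instead of calling spad_write_fn(); the value is scanned into a plain int here because userspace "%i" expects one, so the sketch only suits values that fit in an int.

```c
#include <stdio.h>

/*
 * Userspace sketch of the fixed parsing loop: read "index value" pairs
 * from buf and store them instead of writing scratchpad registers.
 * Returns the number of pairs parsed (at most max).
 */
static int parse_spad_pairs(const char *buf, int *idx, unsigned int *val,
                            int max)
{
    const char *buf_ptr = buf;
    int spad_idx, spad_val, pos, count = 0;

    while (count < max &&
           sscanf(buf_ptr, "%d %i%n", &spad_idx, &spad_val, &pos) == 2) {
        /* %n is relative to buf_ptr, so accumulate it on the pointer. */
        buf_ptr += pos;
        idx[count] = spad_idx;
        val[count] = (unsigned int)spad_val;
        count++;
    }

    return count;
}
```

The point is that "%n" reports the offset consumed by the current call only, so it has to be added to a running pointer; indexing from the start of the buffer each time can revisit input that was already parsed.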


RE: drivers/ntb/hw/amd/ntb_hw_amd.c:367:29: sparse: cast removes address space of expression

2016-03-21 Thread Allen Hubbe
> >> drivers/ntb/hw/amd/ntb_hw_amd.c:367:29: sparse: cast removes address space of expression
>    drivers/ntb/hw/amd/ntb_hw_amd.c:427:31: sparse: cast removes address space of expression

>    360  static int amd_ntb_peer_db_addr(struct ntb_dev *ntb,
>    361                                  phys_addr_t *db_addr,
>    362                                  resource_size_t *db_size)
>    363  {
>    364          struct amd_ntb_dev *ndev = ntb_ndev(ntb);
>    365
>    366          if (db_addr)
>  > 367                  *db_addr = (phys_addr_t)(ndev->peer_mmio + AMD_DBREQ_OFFSET);

This is a good warning.  The code here does look wrong (both lines 367 and 427).

Instead of peer_mmio, the code here should use pci_resource_start(pcidev, 0) + 
AMD_PEER_OFFSET.  For comparison, in ntb_hw_intel the offset is based on a 
physical address stored in peer_addr, not the virtual mapping of that address 
in peer_mmio.

Look at ntb_hw_amd around line 1016, where self_mmio and peer_mmio are set up.
The register physical address is never saved in the ndev object; only the
virtual address is saved. The physical address would therefore need to be
retrieved with pci_resource_start wherever it is used in the ntb_hw_amd driver.

Alternatively, the offending functions could be deleted until they are added
later, after they have been tested.  Note that the ntb_foo_addr functions are
an optional part of the API for drivers to implement.

Allen



RE: [PATCH V5 1/1] NTB: Add support for AMD PCI-Express Non-Transparent Bridge

2016-01-20 Thread Allen Hubbe
From: Yu, Xiangliang [mailto:xiangliang...@amd.com]
> > > Signed-off-by: Jon Mason 
> > > Signed-off-by: Allen Hubbe 
> >
> > NO.
> 
> Ok, I'll change it if you doesn't want to change it.

Nah, just remember it for next time...

I'm satisfied with this v5.

Reviewed-by: Allen Hubbe 

> I don’t think so. In here, the i/o memory is only happened when
> pci_iomap return
> Success, so the register can't be accessed through IO port way. And
> ioread* will
> Check if the memory type is mmio type or IO port type (please see the
> definition).
>  I don’t think we need to check It, so I use read* because It can make
> more efficient.
> I think we need to think about actual usage, not only follow book.
> And, I have said it in previous version, I don’t like explain it again,
> and again.
> If you have any concern, please tell me after my comment.

It's not more efficient, on this platform it's the same.

If it were my driver I would change it... but you can keep it this way.

> > This is different from v4.  It used to be:
> 
> Because peer_sta is change to 0, so amd_link_is_up will return 0
> (offline)
> And will not check hardware link status. So It maybe make it offline
> forever

It fixed a bug?  Great!

> > I'm nervous about ndev->peer_sta, the behavior of link_is_up,
> > timers...
>
> Actually, the code is designed according to Atom NTB, except for the
> peer_sta.

Except for peer_sta, and that's a pretty critical design change.  I'm still
nervous, but I'll trust that you have been able to test this behavior
thoroughly.

> I'll add the explaination when having changes.

Thanks.

Allen



RE: [PATCH V5 1/1] NTB: Add support for AMD PCI-Express Non-Transparent Bridge

2016-01-20 Thread Allen Hubbe
From: Xiangliang Yu 
> This adds support for AMD's PCI-Express Non-Transparent Bridge
> (NTB) device on the Zeppelin platform. The driver connects to the
> standard NTB sub-system interface, with modification to add hooks
> for power management in a separate patch. The AMD NTB device has 3
> memory windows, 16 doorbells, 16 scratch-pad registers, and supports
> up to 16 PCIe lanes running at Gen3 speeds.
> 
> Signed-off-by: Xiangliang Yu 

> Signed-off-by: Jon Mason 
> Signed-off-by: Allen Hubbe 

NO.


> + /* set and verify setting the translation address */
> + write64(addr, peer_mmio + xlat_reg);
> + reg_val = read64(peer_mmio + xlat_reg);
> + if (reg_val != addr) {
> + write64(0, peer_mmio + xlat_reg);
> + return -EIO;
> + }
> +
> + /* set and verify setting the limit */
> + writel(limit, mmio + limit_reg);
> + reg_val = readl(mmio + limit_reg);
> + if (reg_val != limit) {
> + writel(base_addr, mmio + limit_reg);
> + writel(0, peer_mmio + xlat_reg);
> + return -EIO;
> + }

I see what you did there: you changed iowrite64 to write64.

What I meant was:
 - change readl to ioread32.
 - change writel to iowrite32.
 - change readb, readw, writeb, writew (if there are any) to ioread8,
   ioread16, iowrite8, iowrite16.
 - leave ioread64 and iowrite64 as they were.

Why: http://www.makelinux.net/ldd3/chp-9-sect-4

Quote: "If you read through the kernel source, you see many calls to an older 
set of functions when I/O memory is being used. These functions still work, but 
their use in new code is discouraged. Among other things, they are less safe 
because they do not perform the same sort of type checking."

The "older set of functions" are read[bwl], write[bwl].  This is a new driver, 
with all new code.  Please use the ioread/iowrite variants.
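
For anyone following along, here is a rough userspace sketch of the set-and-verify pattern in the hunk quoted above: write a register, read it back, and undo the earlier writes if the value did not stick. Everything in it is an assumption made for illustration. The mock_* accessors are hypothetical stand-ins for the ioread/iowrite calls, struct mock_regs replaces the mapped MMIO windows, and the mask fields only exist so the failure path can be exercised. It is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Mock register file. The *_mask fields are an assumption of this sketch:
 * they model hardware that latches only some of the written bits. */
struct mock_regs {
	uint64_t xlat;		/* stands in for peer_mmio + xlat_reg */
	uint64_t xlat_mask;
	uint32_t limit;		/* stands in for mmio + limit_reg */
	uint32_t limit_mask;
};

/* Hypothetical stand-ins for iowrite64/ioread64 and iowrite32/ioread32. */
static void mock_iowrite64(uint64_t val, struct mock_regs *r)
{
	r->xlat = val & r->xlat_mask;
}

static uint64_t mock_ioread64(const struct mock_regs *r)
{
	return r->xlat;
}

static void mock_iowrite32(uint32_t val, struct mock_regs *r)
{
	r->limit = val & r->limit_mask;
}

static uint32_t mock_ioread32(const struct mock_regs *r)
{
	return r->limit;
}

/* Returns 0 on success, -EIO if a register did not read back as written. */
int mock_set_trans(struct mock_regs *r, uint64_t addr,
		   uint32_t limit, uint32_t base_addr)
{
	assert(r);

	/* set and verify the translation address */
	mock_iowrite64(addr, r);
	if (mock_ioread64(r) != addr) {
		mock_iowrite64(0, r);
		return -EIO;
	}

	/* set and verify the limit; undo both registers on mismatch */
	mock_iowrite32(limit, r);
	if (mock_ioread32(r) != limit) {
		mock_iowrite32(base_addr, r);
		mock_iowrite64(0, r);
		return -EIO;
	}

	return 0;
}
```

The point of the read-back is that a failed limit write never leaves a translation address programmed behind it; which accessor family does the reads and writes does not change that logic.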

> +static int amd_link_is_up(struct amd_ntb_dev *ndev)
> +{
> + if (!ndev->peer_sta)
> + return NTB_LNK_STA_ACTIVE(ndev->cntl_sta);
> +
> + /* If peer_sta is reset or D0 event, the ISR has
> +  * started a timer to check link status of hardware.
> +  * So here just clear status bit. And if peer_sta is
> +  * D3 or PME_TO, D0/reset event will be happened when
> +  * system wakeup/poweron, so do nothing here.
> +  */
> + if (ndev->peer_sta & AMD_PEER_RESET_EVENT)
> + ndev->peer_sta &= ~AMD_PEER_RESET_EVENT;
> + else if (ndev->peer_sta & AMD_PEER_D0_EVENT)
> + ndev->peer_sta = 0;
> +
> + return 0;
> +}

Thanks.  This is much better.
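
As I read the function above, the state handling can be modelled in userspace roughly like this. It is only a sketch: the bit values for the peer events and the link-active bit are assumptions picked for illustration (the real definitions live in the driver header), and struct link_model is a hypothetical stand-in for the two fields of struct amd_ntb_dev that the function touches.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values, for illustration only. */
#define PEER_RESET_EVENT	(1u << 0)
#define PEER_D3_EVENT		(1u << 1)
#define PEER_PMETO_EVENT	(1u << 2)
#define PEER_D0_EVENT		(1u << 3)
#define LNK_STA_ACTIVE_BIT	0x2000u

/* Hypothetical stand-in for the relevant fields of struct amd_ntb_dev. */
struct link_model {
	uint32_t peer_sta;	/* pending peer events */
	uint32_t cntl_sta;	/* cached link control/status */
};

int model_link_is_up(struct link_model *m)
{
	assert(m);

	/* No pending peer event: report the cached hardware link status. */
	if (!m->peer_sta)
		return !!(m->cntl_sta & LNK_STA_ACTIVE_BIT);

	/* Reset or D0: a timer is assumed to be polling the hardware, so
	 * only clear the status bit. D3 and PME_TO are left alone until a
	 * later D0/reset event arrives. */
	if (m->peer_sta & PEER_RESET_EVENT)
		m->peer_sta &= ~PEER_RESET_EVENT;
	else if (m->peer_sta & PEER_D0_EVENT)
		m->peer_sta = 0;

	return 0;
}
```

In this model, the call that clears a pending D0 event still reports the link as down; only the next call, with peer_sta at zero, looks at cntl_sta. A pending D3 keeps the link reported as down no matter what cntl_sta says.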

> +static void amd_handle_event(struct amd_ntb_dev *ndev, int vec)
...
> + case AMD_PEER_D0_EVENT:
...
> + /* start a timer to poll link status */
> + schedule_delayed_work(&ndev->hb_timer,
> +   AMD_LINK_HB_TIMEOUT);

This is different from v4.  It used to be:

	if (amd_link_is_up())
		ntb_link_event();
	else
		schedule_delayed_work();

Why is v5 correct?
Why was v4 incorrect?

I'm nervous about ndev->peer_sta, the behavior of link_is_up, timers... 
unexplained changes to a fragile bit of code - not just this code, but any code 
that deals with parallel or asynchronous behaviors.  With the comment in 
link_is_up, this code is much better, but any changes to this whole link state 
mechanism need to be explained.


Allen


