Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Sean Hefty
> Maybe -- I wonder how scalable our RMPP implementation is though.
> What will happen on the SM node with 256 RMPP requests returning 256
> paths each?  How about 1024 * 1024?

The RMPP implementation doesn't expect to receive lots of simultaneous RMPP 
transactions, and expects them to be reasonably behaved.  It uses linear 
lookups, so there may be room for improvement here.  However, I would expect 
most scalability issues to be on the end receiving an RMPP message, rather than 
on the send side.

- Sean

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Sean Hefty
Michael S. Tsirkin wrote:
> This won't help you much.
> With 256 nodes all to all already gives you 65000 requests
> which is the same order of magnitude as the reported 13.

A cache for 256 nodes only generates 256 requests.  Each request is a get table 
from a given sgid.  The all to all connection model generates n^2 requests 
because each request is a get for a given sgid/dgid pair.  Additionally, cached 
requests can be done when the application isn't running, with a fairly long or 
infinite update time.
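
The difference in query counts can be sketched in C. This is only an illustration of the arithmetic above: the function names are hypothetical and not part of any OpenIB interface, and it assumes one get-table query per node for the cache and one query per sgid/dgid pair for the all-to-all model.

```c
#include <stdio.h>

/* Hypothetical helper: each node issues one get-table query for its own sgid. */
static unsigned long cached_queries(unsigned long nodes)
{
	return nodes;
}

/* Hypothetical helper: one path query per (sgid, dgid) pair. */
static unsigned long all_to_all_queries(unsigned long nodes)
{
	return nodes * nodes;
}

/* Print both counts side by side for a given cluster size. */
static void print_query_counts(unsigned long nodes)
{
	printf("%lu nodes: cache %lu queries, all-to-all %lu queries\n",
	       nodes, cached_queries(nodes), all_to_all_queries(nodes));
}
```

Calling print_query_counts(256) reports 256 queries for the cache against 65536 for all-to-all, which is the "65000" figure quoted above.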

Arlin and I have discussed some caching options, including having multiple cache 
service daemons running on the subnet.  If more than one service is running, a node 
can select a particular service to communicate with.  Communication can be done 
using RC to reduce the MAD overhead.

- Sean




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Todd Rimmer
> From: Michael S. Tsirkin
> Sent: Thursday, November 02, 2006 6:15 PM
> To: Hal Rosenstock
> Cc: Or Gerlitz; openib-general; Arlin R Davis
> Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support
> for address and route retries, call disconnect when recving dreq
> 
> Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > Subject: Re: scaling issues, was: uDAPL cma: add support for address and
> > route retries, call disconnect when recving dreq
> >
> > On Thu, 2006-11-02 at 17:54, Michael S. Tsirkin wrote:
> > > Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> > > > Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add
> > > > support for address and route retries, call disconnect when recving dreq
> > > >
> > > > Sean Hefty wrote:
> > > >
> > > > >One option is having the SA (or ib_umad?) return a busy status in
> > > > >response to a MAD, but we'd still have to be able to send this
> > > > >response as quickly as requests are being received.  We could then
> > > > >limit the number of requests that would be queued in the kernel
> > > > >for a user.
> > > > >
> > > >
> > > > Another great option would be to have path record caching.
> > > > Unfortunately OFED 1.1 did not include ib_local_sa in the release.
> > > >
> > >
> > > This won't help you much.
> > > With 256 nodes all to all already gives you 65000 requests
> > > which is the same order of magnitude as the reported 13.
> >
> > The requests might occur at a different time so they could be spread out
> > rather than synchronized.
> 
> I don't see how caching does this.
> 
> 
If all the queries are made at app startup, there will be one huge batch
of queries to the SA, especially for a many-process MPI job.

In contrast, if SA caching is building its own replica of the relevant
subset of the SA, the pace can be more controlled.  It can even be
purposely randomized by the SA cache code itself (e.g. don't just do it
every 10 minutes, do it every 10 minutes +/- a random number, etc.).
This way, if all nodes are powered on at a similar time, you won't have a
pattern of everyone asking the SM at the same time.
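
A minimal sketch of such a randomized refresh interval, in C. The helper name and its parameters are hypothetical, and rand() stands in for whatever random source a real cache daemon would use; it assumes jitter_sec is no larger than base_sec.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return the delay until the next cache refresh,
 * chosen from [base_sec - jitter_sec, base_sec + jitter_sec].
 * Assumes jitter_sec <= base_sec.  The modulo introduces a slight bias,
 * which does not matter for spreading load.
 */
static unsigned int next_refresh_delay(unsigned int base_sec,
				       unsigned int jitter_sec)
{
	unsigned int span = 2 * jitter_sec + 1;

	return base_sec - jitter_sec + (unsigned int)rand() % span;
}
```

With base_sec = 600 and jitter_sec = 60, each refresh lands somewhere between 9 and 11 minutes after the previous one. Each node would need to seed the generator differently (for example from something node-specific), otherwise all nodes would draw the same sequence and stay synchronized.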

Todd Rimmer




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Todd Rimmer

> From: Michael S. Tsirkin
> Sent: Thursday, November 02, 2006 5:55 PM
> To: Arlin Davis
> Cc: Or Gerlitz; openib-general; Arlin Davis
> Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support
> for address and route retries, call disconnect when recving dreq
> 
> Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> > Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add
> > support for address and route retries, call disconnect when recving dreq
> >
> > Sean Hefty wrote:
> >
> > >One option is having the SA (or ib_umad?) return a busy status in
> > >response to a MAD, but we'd still have to be able to send this
> > >response as quickly as requests are being received.  We could then
> > >limit the number of requests that would be queued in the kernel
> > >for a user.
> > >
> >
> > Another great option would be to have path record caching.
> > Unfortunately OFED 1.1 did not include ib_local_sa in the release.
> >
> 
> This won't help you much.
> With 256 nodes all to all already gives you 65000 requests
> which is the same order of magnitude as the reported 13.

We have SA caching working quite well with very large clusters.  Here
are some techniques which make it much more efficient:

1. A given node only cares about path records relevant to it.  So only
   ask for path records where it is the source.
2. Use SA notices for GID in/out of service to trigger cache updates,
   and only then for the specific GID which has changed.
   - As background, refresh all cache entries slowly and infrequently
     just in case the notice was lost.  However, IBTA does allow retries
     and Acks of notices, so this will be infrequent.
3. Limit the number of outstanding SA queries from a given node; this
   avoids one node blasting the SM.

There is a little more to it, but those should be the main points relevant
to this discussion.
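
Technique 3 can be sketched as a simple in-flight counter. This is an illustration only: the structure and function names are hypothetical, not an existing OpenIB interface, and the sketch is single-threaded (real code would need locking or atomics around the counter).

```c
#include <stdbool.h>

/* Hypothetical per-node throttle for SA queries. */
struct sa_query_throttle {
	unsigned int max_outstanding;	/* cap on queries in flight */
	unsigned int outstanding;	/* queries currently in flight */
};

/* Take a slot if one is free; returns false if the caller must wait. */
static bool sa_throttle_try_acquire(struct sa_query_throttle *t)
{
	if (t->outstanding >= t->max_outstanding)
		return false;
	t->outstanding++;
	return true;
}

/* Give the slot back when a response arrives or the query times out. */
static void sa_throttle_release(struct sa_query_throttle *t)
{
	if (t->outstanding > 0)
		t->outstanding--;
}
```

A caller that fails to acquire a slot would queue the query locally and retry after the next release, so the SM never sees more than max_outstanding queries from that node at once.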

Todd Rimmer




Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Sean Hefty
Venkatesh Babu wrote:
> I made the change you suggested.
> On Active node I got the event IB_EVENT_PATH_MIG and then send failed 
> with IB_WC_RETRY_EXC_ERR.

Ok - I will continue debugging this once I complete a test program.  Thanks for 
the assistance.

> On Passive node I got 100 IB_EVENT_PATH_MIG_ERR events.

This sounds like a bug in the stack.

- Sean




Re: [openib-general] [PATCH 1/7] IB/core - Add DMA mapping functions to allow device drivers to interpose

2006-11-02 Thread Ralph Campbell
On Fri, 2006-11-03 at 01:44 +0200, Michael S. Tsirkin wrote:
> Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > Subject: Re: [PATCH 1/7] IB/core - Add DMA mapping functions to allow 
> > device drivers to interpose
> > 
> >  > However, this means that the API must give the HCA the choice of
> >  > what to keep inside the mapping. This could mean, for example, returning
> >  > a structure that can include dma_addr_t, void*, or both, and a flag to
> >  > distinguish between the two.
> > 
> > It's an interesting idea.  However I think it may be more trouble than it's
> > worth, for at least two reasons.  First, the wrapper for dma_map_sg() will
> > probably become really ugly, although maybe there's a clever idea.
> 
> Oh, my guess is s/g is usually for long messages so we can just always do dma
> in that case.
> 
> > Second,
> > the consumer right now only gets to pass a 64-bit address into the work
> > request posting functions.  I don't think we really want to change that
> > interface, so the driver would have to encode the flag in the address
> > somehow anyway.
> 
> But how?
> Wait, work request posting functions actually get a virtual
> address and a key, not a dma address. Maybe something can be done with this?
> Say, we have get_dma_mr at the moment - maybe we could have a special
> mr, and let the dma functions also select which mr to use?
> 
> 
> > Also handling highmem is a problem.  ipath just depends on 64BIT so it
> > avoids the problem.  I guess mthca could only return a kernel virtual
> > address if one exists, and always use DMA for highmem pages.  So that
> > isn't really a serious objection.
> 
> Right.

I'm open to suggestions if you have a proposal for making the interface
more usable.





Re: [openib-general] [PATCH 1/7] IB/core - Add DMA mapping functions to allow device drivers to interpose

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH 1/7] IB/core - Add DMA mapping functions to allow device 
> drivers to interpose
> 
>  > However, this means that the API must give the HCA the choice of
>  > what to keep inside the mapping. This could mean, for example, returning
>  > a structure that can include dma_addr_t, void*, or both, and a flag to
>  > distinguish between the two.
> 
> It's an interesting idea.  However I think it may be more trouble than it's
> worth, for at least two reasons.  First, the wrapper for dma_map_sg() will
> probably become really ugly, although maybe there's a clever idea.

Oh, my guess is s/g is usually for long messages so we can just always do dma in
that case.

> Second,
> the consumer right now only gets to pass a 64-bit address into the work
> request posting functions.  I don't think we really want to change that
> interface, so the driver would have to encode the flag in the address somehow
> anyway.

But how?
Wait, work request posting functions actually get a virtual
address and a key, not a dma address. Maybe something can be done with this?
Say, we have get_dma_mr at the moment - maybe we could have a special
mr, and let the dma functions also select which mr to use?


> Also handling highmem is a problem.  ipath just depends on 64BIT so it
> avoids the problem.  I guess mthca could only return a kernel virtual
> address if one exists, and always use DMA for highmem pages.  So that
> isn't really a serious objection.

Right.

-- 
MST




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Roland Dreier
 > Actually, this sounds like an excellent idea - this immediately makes the
 > number of queries linear with cluster size.

Maybe -- I wonder how scalable our RMPP implementation is though.
What will happen on the SM node with 256 RMPP requests returning 256
paths each?  How about 1024 * 1024?

 - R.




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Roland Dreier
 > Am I missing something here? 65,000 requests every 15 minutes (current 
 > default) for the entire cluster versus 100-13 every time I start an 
 > application is a big help. 

Depends on how long your app takes to run -- if your app only starts
once every day or something then the cache refreshing is worse.

 - R.




Re: [openib-general] [PATCH 1/7] IB/core - Add DMA mapping functions to allow device drivers to interpose

2006-11-02 Thread Roland Dreier
 > However, this means that the API must give the HCA the choice of
 > what to keep inside the mapping. This could mean, for example, returning
 > a structure that can include dma_addr_t, void*, or both, and a flag to
 > distinguish between the two.

It's an interesting idea.  However I think it may be more trouble than
it's worth, for at least two reasons.  First, the wrapper for
dma_map_sg() will probably become really ugly, although maybe there's
a clever idea.  Second, the consumer right now only gets to pass a
64-bit address into the work request posting functions.  I don't think
we really want to change that interface, so the driver would have to
encode the flag in the address somehow anyway.

Also handling highmem is a problem.  ipath just depends on 64BIT so it
avoids the problem.  I guess mthca could only return a kernel virtual
address if one exists, and always use DMA for highmem pages.  So that
isn't really a serious objection.

 - R.




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support for 
> address and route retries, call disconnect when recving dreq
> 
> > > With 256 nodes all to all already gives you 65000 requests
> > > which is the same order of magnitude as the reported 13.
> 
> I think the only advantage of caching is if you start your app twice.
> 
> But maybe you can fill the cache more efficiently by doing a single
> get table to find all the paths at once, rather than having to let the
> cma query each path after arping for the GID...

Actually, this sounds like an excellent idea - this immediately makes the number
of queries linear with cluster size.

-- 
MST




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Roland Dreier
> > With 256 nodes all to all already gives you 65000 requests
> > which is the same order of magnitude as the reported 13.

I think the only advantage of caching is if you start your app twice.

But maybe you can fill the cache more efficiently by doing a single
get table to find all the paths at once, rather than having to let the
cma query each path after arping for the GID...

 - R.




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Arlin Davis
Michael S. Tsirkin wrote:

>>Another great option would be to have path record caching. Unfortunately 
>>OFED 1.1 did not include ib_local_sa in the release.
>>
>>
>>
>
>This won't help you much.
>With 256 nodes all to all already gives you 65000 requests
>which is the same order of magnitude as the reported 13.
>
>  
>
Am I missing something here? 65,000 requests every 15 minutes (current 
default) for the entire cluster versus 100-13 every time I start an 
application is a big help. Especially on a very large cluster that is 
batching up smaller independent jobs sharing a single SA and fabric. We 
either need caching or SA capabilities that can scale up with large 
clusters. A single service running at 6000 requests/second will not succeed.

-arlin.




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> Subject: Re: scaling issues, was: uDAPL cma: add support for address and 
> route retries, call disconnect when recving dreq
> 
> On Thu, 2006-11-02 at 17:54, Michael S. Tsirkin wrote:
> > Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> > > Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support 
> > > for address and route retries, call disconnect when recving dreq
> > > 
> > > Sean Hefty wrote:
> > > 
> > > >One option is having the SA (or ib_umad?) return a busy status in 
> > > >response to a 
> > > >MAD, but we'd still have to be able to send this response as quickly as 
> > > >requests 
> > > >are being received.  We could then limit the number of requests that 
> > > >would be 
> > > >queued in the kernel for a user.
> > > >  
> > > >
> > > 
> > > Another great option would be to have path record caching. Unfortunately 
> > > OFED 1.1 did not include ib_local_sa in the release.
> > > 
> > 
> > This won't help you much.
> > With 256 nodes all to all already gives you 65000 requests
> > which is the same order of magnitude as the reported 13.
> 
> The requests might occur at a different time so they could be spread out
> rather than synchronized.

I don't see how caching does this.

-- 
MST




Re: [openib-general] [PATCH 1/7] IB/core - Add DMA mapping functions to allow device drivers to interpose

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Ralph Campbell <[EMAIL PROTECTED]>:
> Subject: [PATCH 1/7] IB/core - Add DMA mapping functions to allow device 
> drivers to interpose
> 
> IB/core - Add DMA mapping functions to allow device drivers to interpose
> 
> The QLogic InfiniPath HCAs use programmed I/O instead of HW DMA.
> This patch allows a verbs device driver to interpose on DMA mapping
> function calls in order to avoid relying on bus_to_virt() and
> phys_to_virt() to undo the mappings created by dma_map_single(),
> dma_map_sg(), etc.
> 
> From: Ralph Campbell <[EMAIL PROTECTED]>
> 
> diff -r f37bd0e41fec include/rdma/ib_verbs.h
> --- a/include/rdma/ib_verbs.h Thu Oct 26 21:44:41 2006 +0700
> +++ b/include/rdma/ib_verbs.h Thu Oct 26 16:10:04 2006 -0800
> @@ -43,6 +43,8 @@
>  
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include 
>  #include 
> @@ -846,6 +848,42 @@ struct ib_cache {
>   struct ib_pkey_cache  **pkey_cache;
>   struct ib_gid_cache   **gid_cache;
>   u8 *lmc_cache;
> +};
> +
> +struct ib_dma_mapping_ops {
> +        int             (*mapping_error)(struct ib_device *dev,
> +                                         dma_addr_t dma_addr);
> +        dma_addr_t      (*map_single)(struct ib_device *dev,
> +                                      void *ptr, size_t size,
> +                                      enum dma_data_direction direction);
> +        void            (*unmap_single)(struct ib_device *dev,
> +                                        dma_addr_t addr, size_t size,
> +                                        enum dma_data_direction direction);
> +        dma_addr_t      (*map_page)(struct ib_device *dev,
> +                                    struct page *page, unsigned long offset,
> +                                    size_t size,
> +                                    enum dma_data_direction direction);
> +        void            (*unmap_page)(struct ib_device *dev,
> +                                      dma_addr_t addr, size_t size,
> +                                      enum dma_data_direction direction);
> +        int             (*map_sg)(struct ib_device *dev,
> +                                  struct scatterlist *sg, int nents,
> +                                  enum dma_data_direction direction);
> +        void            (*unmap_sg)(struct ib_device *dev,
> +                                    struct scatterlist *sg, int nents,
> +                                    enum dma_data_direction direction);
> +        dma_addr_t      (*dma_address)(struct ib_device *dev,
> +                                       struct scatterlist *sg);
> +        unsigned int    (*dma_len)(struct ib_device *dev,
> +                                   struct scatterlist *sg);
> +        void            (*sync_single_for_cpu)(struct ib_device *dev,
> +                                               dma_addr_t dma_handle,
> +                                               size_t size,
> +                                               enum dma_data_direction dir);
> +        void            (*sync_single_for_device)(struct ib_device *dev,
> +                                                  dma_addr_t dma_handle,
> +                                                  size_t size,
> +                                                  enum dma_data_direction dir);
>  };

Maybe we should make the API a bit more generic than just matching what ipath
needs. Specifically mellanox HCAs (and I expect others) can support *both* dma
(in use today) and pushing data "inline" directly into HCA.

And this actually might make more sense than DMA for small messages.

However, this means that the API must give the HCA the choice of
what to keep inside the mapping. This could mean, for example, returning
a structure that can include dma_addr_t, void*, or both, and a flag to
distinguish between the two.
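
One possible shape for such a return structure, sketched in userspace C. Everything here is an assumption for illustration: the ex_ names are hypothetical, dma_addr_t is replaced by a stand-in typedef, and the size-based policy is just one way a driver might choose between the two.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;	/* stand-in for the kernel type */

/* Hypothetical flags saying which members of the mapping are valid. */
#define EX_MAPPING_HAS_DMA	0x1u	/* HCA may DMA from dma_addr */
#define EX_MAPPING_HAS_VADDR	0x2u	/* driver may copy from vaddr */

/* Hypothetical result of a map_single-style call. */
struct ex_mapping {
	unsigned int flags;
	dma_addr_t dma_addr;
	void *vaddr;
};

/* Illustrative policy: push small buffers inline, DMA everything else. */
static struct ex_mapping ex_mapping_choose(void *ptr, dma_addr_t dma,
					   size_t size, size_t inline_max)
{
	struct ex_mapping m = { 0, 0, NULL };

	if (size <= inline_max) {
		m.flags = EX_MAPPING_HAS_VADDR;
		m.vaddr = ptr;
	} else {
		m.flags = EX_MAPPING_HAS_DMA;
		m.dma_addr = dma;
	}
	return m;
}
```

Using separate flag bits rather than a union leaves room for the "or both" case, where a driver sets both bits and decides at post time.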

Does this make sense?

-- 
MST




Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 17:54, Michael S. Tsirkin wrote:
> Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> > Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support 
> > for address and route retries, call disconnect when recving dreq
> > 
> > Sean Hefty wrote:
> > 
> > >One option is having the SA (or ib_umad?) return a busy status in response 
> > >to a 
> > >MAD, but we'd still have to be able to send this response as quickly as 
> > >requests 
> > >are being received.  We could then limit the number of requests that would 
> > >be 
> > >queued in the kernel for a user.
> > >  
> > >
> > 
> > Another great option would be to have path record caching. Unfortunately 
> > OFED 1.1 did not include ib_local_sa in the release.
> > 
> 
> This won't help you much.
> With 256 nodes all to all already gives you 65000 requests
> which is the same order of magnitude as the reported 13.

The requests might occur at a different time so they could be spread out
rather than synchronized.

-- Hal






Re: [openib-general] [PATCH v2] opensm: strict osm_log arguments/format check

2006-11-02 Thread Yevgeny Kliteynik
Looks ok, great.

-- Yevgeny
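
For reference, the gcc format attribute used in this patch can be tried in a standalone sketch. The ex_log function and EX_PRINTF_FORMAT macro are hypothetical and only mirror the pattern; they are not OpenSM code. As with osm_log(), the format string is argument 3 and the variable arguments start at argument 4.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#ifdef __GNUC__
#define EX_PRINTF_FORMAT(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define EX_PRINTF_FORMAT(fmt, args)
#endif

/* Hypothetical logger: the attribute goes on the declaration. */
int ex_log(char *buf, size_t len, const char *fmt, ...) EX_PRINTF_FORMAT(3, 4);

/* Format into buf and return the number of characters that were needed. */
int ex_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place and -Wformat enabled (it is part of -Wall), gcc warns about a call that passes "%s" with no matching argument, which is the kind of mistake the fixes in this patch address.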

Sasha Khapyorsky wrote:
> This adds gcc attribute to osm_log() which causes the compiler to check
> argument types against a format string. And also there are related fixes
> in osm_log() usage.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> ---
>  osm/include/opensm/osm_log.h |8 +++-
>  osm/libvendor/osm_vendor_ibumad_sa.c |2 +-
>  osm/opensm/main.c|3 ++-
>  osm/opensm/osm_pkey_mgr.c|1 +
>  osm/opensm/osm_port_info_rcv.c   |5 +++--
>  osm/opensm/osm_sa_informinfo.c   |4 ++--
>  osm/opensm/osm_sa_link_record.c  |8 
>  osm/opensm/osm_sa_mad_ctrl.c |3 ++-
>  osm/opensm/osm_sa_response.c |2 +-
>  osm/opensm/osm_sm_state_mgr.c|3 ++-
>  osm/opensm/osm_sminfo_rcv.c  |9 +
>  osm/opensm/osm_state_mgr.c   |8 
>  osm/osmtest/osmt_multicast.c |   12 +++-
>  osm/osmtest/osmt_service.c   |6 +++---
>  osm/osmtest/osmtest.c|8 
>  15 files changed, 48 insertions(+), 34 deletions(-)
> 
> diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
> index 6a1a93f..f51a1c8 100644
> --- a/osm/include/opensm/osm_log.h
> +++ b/osm/include/opensm/osm_log.h
> @@ -60,6 +60,12 @@ #include 
>  #include 
>  #include 
>  
> +#ifdef __GNUC__
> +#define STRICT_OSM_LOG_FORMAT __attribute__((format(printf, 3, 4)))
> +#else
> +#define STRICT_OSM_LOG_FORMAT
> +#endif
> +
>  #ifdef __cplusplus
>  #  define BEGIN_C_DECLS extern "C" {
>  #  define END_C_DECLS   }
> @@ -377,7 +383,7 @@ void
>  osm_log(
>   IN osm_log_t* const p_log,
>   IN const osm_log_level_t verbosity,
> - IN const char *p_str, ... );
> + IN const char *p_str, ... ) STRICT_OSM_LOG_FORMAT;
>  
>  void
>  osm_log_raw(
> diff --git a/osm/libvendor/osm_vendor_ibumad_sa.c b/osm/libvendor/osm_vendor_ibumad_sa.c
> index 7fd0655..7c4a2f7 100644
> --- a/osm/libvendor/osm_vendor_ibumad_sa.c
> +++ b/osm/libvendor/osm_vendor_ibumad_sa.c
> @@ -853,7 +853,7 @@ #ifdef DUAL_SIDED_RMPP
>  if ( p_mpr_req->sgid_count + p_mpr_req->dgid_count > 
> IB_MULTIPATH_MAX_GIDS )
>  {
>osm_log( p_log, OSM_LOG_ERROR,
> -   "osmv_query_sa DBG:001 MULTIPATH_REC ",
> +   "osmv_query_sa DBG:001 MULTIPATH_REC "
> "SGID count %d DGID count %d max count %d\n",
>  p_mpr_req->sgid_count, p_mpr_req->dgid_count,
>  IB_MULTIPATH_MAX_GIDS );
> diff --git a/osm/opensm/main.c b/osm/opensm/main.c
> index 729702a..752b546 100644
> --- a/osm/opensm/main.c
> +++ b/osm/opensm/main.c
> @@ -460,7 +460,8 @@ parse_ignore_guids_file(IN char *guids_f
>{
>  osm_log( &p_osm->log, OSM_LOG_ERROR,
>   "parse_ignore_guids_file: ERR 0601: "
> - "Unable to open ignore guids file (%s)\n" );
> + "Unable to open ignore guids file (%s)\n",
> + guids_file_name );
>  status = IB_ERROR;
>  goto Exit;
>}
> diff --git a/osm/opensm/osm_pkey_mgr.c b/osm/opensm/osm_pkey_mgr.c
> index f2cb221..735dc14 100644
> --- a/osm/opensm/osm_pkey_mgr.c
> +++ b/osm/opensm/osm_pkey_mgr.c
> @@ -139,6 +139,7 @@ pkey_mgr_process_physical_port(
>  "pkey_mgr_process_physical_port: ERR 0503: "
>  "Failed to obtain P_Key 0x%04x block and index for node "
>  "0x%016" PRIx64 " port %u\n",
> +ib_pkey_get_base( pkey ),
>  cl_ntoh64( osm_node_get_node_guid( p_node ) ),
>  osm_physp_get_port_num( p_physp ) );
>return;
> diff --git a/osm/opensm/osm_port_info_rcv.c b/osm/opensm/osm_port_info_rcv.c
> index 95112dc..f6d3595 100644
> --- a/osm/opensm/osm_port_info_rcv.c
> +++ b/osm/opensm/osm_port_info_rcv.c
> @@ -724,8 +724,9 @@ osm_pi_rcv_process(
>{
>  osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
>   "osm_pi_rcv_process: "
> - "Got light sweep response from remote port of parent node GUID = 0x%" PRIx64
> - " port = %u, Commencing heavy sweep\n",
> + "Got light sweep response from remote port of parent node "
> + "GUID = 0x%" PRIx64 " port = 0x%016" PRIx64
> + ", Commencing heavy sweep\n",
>   cl_ntoh64( node_guid ),
>   cl_ntoh64( port_guid ) );
>  osm_state_mgr_process( p_rcv->p_state_mgr,
> diff --git a/osm/opensm/osm_sa_informinfo.c b/osm/opensm/osm_sa_informinfo.c
> index 69dca1d..0cec307 100644
> --- a/osm/opensm/osm_sa_informinfo.c
> +++ b/osm/opensm/osm_sa_informinfo.c
> @@ -163,8 +163,8 @@ __validate_ports_access_rights(
>  {
>osm_log( p_rcv->p_log, OSM_LOG_ERROR,
> "__validate_ports_access_rights: ERR 4301: "
> -   "Invalid port guid: 0x%016\n",
> -   portguid );
> +   "Invalid port guid: 0x%016" PRIx64 "\n",
> +   cl_ntoh64(portguid) );
>valid = FALSE;
> 

Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support for 
> address and route retries, call disconnect when recving dreq
> 
> Sean Hefty wrote:
> 
> >One option is having the SA (or ib_umad?) return a busy status in response 
> >to a 
> >MAD, but we'd still have to be able to send this response as quickly as 
> >requests 
> >are being received.  We could then limit the number of requests that would 
> >be 
> >queued in the kernel for a user.
> >  
> >
> 
> Another great option would be to have path record caching. Unfortunately 
> OFED 1.1 did not include ib_local_sa in the release.
> 

This won't help you much.
With 256 nodes all to all already gives you 65000 requests
which is the same order of magnitude as the reported 13.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
>  > By the way, what's up with this project?
>  > It's still planned for libibverbs 1.1, isn't it?
> 
> I'm working on it along with other things.
> 
> Where are your patches for using multiple EQs for CQ events? :)

Sorry, that was not an attempt to pressure you.
I was just organising my plans for the next month or so
and wanted to check whether this needs my attention.
I actually forgot about the multiple EQ idea - thanks for the reminder,
I need to look into how the API would look.

-- 
MST




[openib-general] [PATCH 7/7] IB/srp - Use new verbs IB DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/srp - Use new verbs IB DMA mapping functions

This patch converts SRP to use the new verbs DMA mapping functions
for kernel verbs consumers.

From: Ralph Campbell <[EMAIL PROTECTED]>

diff -r f37bd0e41fec drivers/infiniband/ulp/srp/ib_srp.c
--- a/drivers/infiniband/ulp/srp/ib_srp.c   Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/ulp/srp/ib_srp.c   Thu Oct 26 12:33:28 2006 -0800
@@ -122,9 +122,8 @@ static struct srp_iu *srp_alloc_iu(struc
if (!iu->buf)
goto out_free_iu;
 
-   iu->dma = dma_map_single(host->dev->dev->dma_device,
-iu->buf, size, direction);
-   if (dma_mapping_error(iu->dma))
+   iu->dma = ib_dma_map_single(host->dev->dev, iu->buf, size, direction);
+   if (ib_dma_mapping_error(host->dev->dev, iu->dma))
goto out_free_buf;
 
iu->size  = size;
@@ -145,8 +144,7 @@ static void srp_free_iu(struct srp_host 
if (!iu)
return;
 
-   dma_unmap_single(host->dev->dev->dma_device,
-iu->dma, iu->size, iu->direction);
+   ib_dma_unmap_single(host->dev->dev, iu->dma, iu->size, iu->direction);
kfree(iu->buf);
kfree(iu);
 }
@@ -481,8 +479,8 @@ static void srp_unmap_data(struct scsi_c
scat  = &req->fake_sg;
}
 
-   dma_unmap_sg(target->srp_host->dev->dev->dma_device, scat, nents,
-scmnd->sc_data_direction);
+   ib_dma_unmap_sg(target->srp_host->dev->dev, scat, nents,
+   scmnd->sc_data_direction);
 }
 
 static void srp_remove_req(struct srp_target_port *target, struct srp_request *req)
@@ -594,23 +592,26 @@ static int srp_map_fmr(struct srp_target
int i, j;
int ret;
struct srp_device *dev = target->srp_host->dev;
+   struct ib_device *ibdev = dev->dev;
 
if (!dev->fmr_pool)
return -ENODEV;
 
-   if ((sg_dma_address(&scat[0]) & ~dev->fmr_page_mask) &&
+   if ((ib_sg_dma_address(ibdev, &scat[0]) & ~dev->fmr_page_mask) &&
mellanox_workarounds && !memcmp(&target->ioc_guid, mellanox_oui, 3))
return -EINVAL;
 
len = page_cnt = 0;
for (i = 0; i < sg_cnt; ++i) {
-   if (sg_dma_address(&scat[i]) & ~dev->fmr_page_mask) {
+   unsigned int dma_len = ib_sg_dma_len(ibdev, &scat[i]);
+
+   if (ib_sg_dma_address(ibdev, &scat[i]) & ~dev->fmr_page_mask) {
if (i > 0)
return -EINVAL;
else
++page_cnt;
}
-   if ((sg_dma_address(&scat[i]) + sg_dma_len(&scat[i])) &
+   if ((ib_sg_dma_address(ibdev, &scat[i]) + dma_len) &
~dev->fmr_page_mask) {
if (i < sg_cnt - 1)
return -EINVAL;
@@ -618,7 +619,7 @@ static int srp_map_fmr(struct srp_target
++page_cnt;
}
 
-   len += sg_dma_len(&scat[i]);
+   len += dma_len;
}
 
page_cnt += len >> dev->fmr_page_shift;
@@ -630,10 +631,14 @@ static int srp_map_fmr(struct srp_target
return -ENOMEM;
 
page_cnt = 0;
-   for (i = 0; i < sg_cnt; ++i)
-   for (j = 0; j < sg_dma_len(&scat[i]); j += dev->fmr_page_size)
+   for (i = 0; i < sg_cnt; ++i) {
+   unsigned int dma_len = ib_sg_dma_len(ibdev, &scat[i]);
+
+   for (j = 0; j < dma_len; j += dev->fmr_page_size)
dma_pages[page_cnt++] =
-   (sg_dma_address(&scat[i]) & dev->fmr_page_mask) + j;
+   (ib_sg_dma_address(ibdev, &scat[i]) &
+dev->fmr_page_mask) + j;
+   }
 
req->fmr = ib_fmr_pool_map_phys(dev->fmr_pool,
dma_pages, page_cnt, io_addr);
@@ -643,7 +648,8 @@ static int srp_map_fmr(struct srp_target
goto out;
}
 
-   buf->va  = cpu_to_be64(sg_dma_address(&scat[0]) & ~dev->fmr_page_mask);
+   buf->va  = cpu_to_be64(ib_sg_dma_address(ibdev, &scat[0]) &
+  ~dev->fmr_page_mask);
buf->key = cpu_to_be32(req->fmr->fmr->rkey);
buf->len = cpu_to_be32(len);
 
@@ -662,6 +668,8 @@ static int srp_map_data(struct scsi_cmnd
struct srp_cmd *cmd = req->cmd->buf;
int len, nents, count;
u8 fmt = SRP_DATA_DESC_DIRECT;
+   struct srp_device *dev;
+   struct ib_device *ibdev;
 
if (!scmnd->request_buffer || scmnd->sc_data_direction == DMA_NONE)
return sizeof (struct srp_cmd);
@@ -686,8 +694,10 @@ static int srp_map_data(struct scsi_cmnd
sg_init_one(scat, scmnd->request_buffer, scmnd->request_bufflen);
}
 
-   count = dma_map_sg(target->srp_host->dev->dev->dma_device,
-   

[openib-general] [PATCH 6/7] IB/sdp - Use the new verbs DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/sdp - Use the new verbs DMA mapping functions

This patch converts SDP to use the new DMA mapping functions
for kernel verbs consumers.

From: Ralph Campbell <[EMAIL PROTECTED]>

Index: src/linux-kernel/infiniband/ulp/sdp/sdp_bcopy.c
===
--- src/linux-kernel/infiniband/ulp/sdp/sdp_bcopy.c (revision 9441)
+++ src/linux-kernel/infiniband/ulp/sdp/sdp_bcopy.c (working copy)
@@ -67,7 +67,7 @@ void sdp_post_send(struct sdp_sock *ssk,
unsigned mseq = ssk->tx_head;
int i, rc, frags;
dma_addr_t addr;
-   struct device *hwdev;
+   struct ib_device *dev;
struct ib_sge *sge;
struct ib_send_wr *bad_wr;
 
@@ -80,15 +80,14 @@ void sdp_post_send(struct sdp_sock *ssk,
 
tx_req = &ssk->tx_ring[mseq & (SDP_TX_SIZE - 1)];
tx_req->skb = skb;
-   hwdev = ssk->dma_device;
+   dev = ssk->mr->device;
sge = ssk->ibsge;
-   addr = dma_map_single(hwdev,
- skb->data, skb->len - skb->data_len,
- DMA_TO_DEVICE);
+   addr = ib_dma_map_single(dev, skb->data, skb->len - skb->data_len,
+DMA_TO_DEVICE);
tx_req->mapping[0] = addr;

/* TODO: proper error handling */
-   BUG_ON(dma_mapping_error(addr));
+   BUG_ON(ib_dma_mapping_error(dev, addr));
 
sge->addr = (u64)addr;
sge->length = skb->len - skb->data_len;
@@ -96,11 +95,11 @@ void sdp_post_send(struct sdp_sock *ssk,
frags = skb_shinfo(skb)->nr_frags;
for (i = 0; i < frags; ++i) {
++sge;
-   addr = dma_map_page(hwdev, skb_shinfo(skb)->frags[i].page,
-   skb_shinfo(skb)->frags[i].page_offset,
-   skb_shinfo(skb)->frags[i].size,
-   DMA_TO_DEVICE);
-   BUG_ON(dma_mapping_error(addr));
+   addr = ib_dma_map_page(dev, skb_shinfo(skb)->frags[i].page,
+  skb_shinfo(skb)->frags[i].page_offset,
+  skb_shinfo(skb)->frags[i].size,
+  DMA_TO_DEVICE);
+   BUG_ON(ib_dma_mapping_error(dev, addr));
tx_req->mapping[i + 1] = addr;
sge->addr = addr;
sge->length = skb_shinfo(skb)->frags[i].size;
@@ -124,7 +123,7 @@ void sdp_post_send(struct sdp_sock *ssk,
 
 struct sk_buff *sdp_send_completion(struct sdp_sock *ssk, int mseq)
 {
-   struct device *hwdev;
+   struct ib_device *dev;
struct sdp_buf *tx_req;
struct sk_buff *skb;
int i, frags;
@@ -135,16 +134,16 @@ struct sk_buff *sdp_send_completion(stru
return NULL;
}
 
-   hwdev = ssk->dma_device;
+   dev = ssk->mr->device;
 tx_req = &ssk->tx_ring[mseq & (SDP_TX_SIZE - 1)];
skb = tx_req->skb;
-   dma_unmap_single(hwdev, tx_req->mapping[0], skb->len - skb->data_len,
-DMA_TO_DEVICE);
+   ib_dma_unmap_single(dev, tx_req->mapping[0], skb->len - skb->data_len,
+   DMA_TO_DEVICE);
frags = skb_shinfo(skb)->nr_frags;
for (i = 0; i < frags; ++i) {
-   dma_unmap_page(hwdev, tx_req->mapping[i + 1],
-  skb_shinfo(skb)->frags[i].size,
-  DMA_TO_DEVICE);
+   ib_dma_unmap_page(dev, tx_req->mapping[i + 1],
+ skb_shinfo(skb)->frags[i].size,
+ DMA_TO_DEVICE);
}
 
++ssk->tx_tail;
@@ -157,7 +156,7 @@ static void sdp_post_recv(struct sdp_soc
struct sdp_buf *rx_req;
int i, rc, frags;
dma_addr_t addr;
-   struct device *hwdev;
+   struct ib_device *dev;
struct ib_sge *sge;
struct ib_recv_wr *bad_wr;
struct sk_buff *skb;
@@ -188,11 +187,10 @@ static void sdp_post_recv(struct sdp_soc
 
 rx_req = ssk->rx_ring + (id & (SDP_RX_SIZE - 1));
rx_req->skb = skb;
-   hwdev = ssk->dma_device;
+   dev = ssk->mr->device;
sge = ssk->ibsge;
-   addr = dma_map_single(hwdev, h, skb_headlen(skb),
- DMA_FROM_DEVICE);
-   BUG_ON(dma_mapping_error(addr));
+   addr = ib_dma_map_single(dev, h, skb_headlen(skb), DMA_FROM_DEVICE);
+   BUG_ON(ib_dma_mapping_error(dev, addr));
 
rx_req->mapping[0] = addr;

@@ -203,11 +201,11 @@ static void sdp_post_recv(struct sdp_soc
frags = skb_shinfo(skb)->nr_frags;
for (i = 0; i < frags; ++i) {
++sge;
-   addr = dma_map_page(hwdev, skb_shinfo(skb)->frags[i].page,
-   skb_shinfo(skb)->frags[i].page_offset,
-   skb_shinfo(skb)->frags[i].size,
-   DMA_FROM_DEVICE);
-

[openib-general] [PATCH 5/7] IB/rds - Use the new verbs DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/rds - Use the new verbs DMA mapping functions

This patch converts RDS to use the new DMA mapping functions
for kernel verbs consumers.

From: Ralph Campbell <[EMAIL PROTECTED]>

Index: src/linux-kernel/infiniband/ulp/rds/rds_buf.c
===
--- src/linux-kernel/infiniband/ulp/rds/rds_buf.c   (revision 9441)
+++ src/linux-kernel/infiniband/ulp/rds/rds_buf.c   (working copy)
@@ -67,10 +67,10 @@ struct rds_buf* rds_alloc_send_buffer(st
buf->loopback = FALSE;
buf->optype = OP_SEND;
buf->sge.length = ep->buffer_size;
-   buf->sge.addr = dma_map_single(ep->cma_id->device->dma_device,
-   buf->data,
-   buf->sge.length,
-   DMA_TO_DEVICE);
+   buf->sge.addr = ib_dma_map_single(ep->cma_id->device,
+ buf->data,
+ buf->sge.length,
+ DMA_TO_DEVICE);
 
pci_unmap_addr_set(buf, mapping, buf->sge.addr);
 
@@ -101,7 +101,7 @@ struct rds_buf* rds_alloc_recv_buffer(st
buf->loopback = FALSE;
buf->optype = OP_RECV;
buf->sge.length = ep->buffer_size;
-   buf->sge.addr = dma_map_single(ep->cma_id->device->dma_device,
+   buf->sge.addr = ib_dma_map_single(ep->cma_id->device,
buf->data,
buf->sge.length,
DMA_FROM_DEVICE);
@@ -126,8 +126,8 @@ void rds_free_buffer(struct rds_buf *buf
printk("rds: free buffer, bad ep or ep->kmem_cache!!\n");
return;
}
-   dma_unmap_single(
-   ((struct rds_ep*)buf->parent_ep)->cma_id->device->dma_device,
+   ib_dma_unmap_single(
+   ((struct rds_ep*)buf->parent_ep)->cma_id->device,
pci_unmap_addr(buf,mapping),
buf->sge.length,
DMA_TO_DEVICE);






[openib-general] [PATCH 4/7] IB/iser - Use the new verbs DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/iser - Use the new verbs DMA mapping functions

This patch converts iser to use the new verbs DMA mapping functions
for kernel verbs consumers.

From: Ralph Campbell <[EMAIL PROTECTED]>

diff -r f37bd0e41fec drivers/infiniband/ulp/iser/iser_memory.c
--- a/drivers/infiniband/ulp/iser/iser_memory.c Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/ulp/iser/iser_memory.c Thu Oct 26 13:16:33 2006 -0800
@@ -51,7 +51,7 @@
  */
 int iser_regd_buff_release(struct iser_regd_buf *regd_buf)
 {
-   struct device *dma_device;
+   struct ib_device *dev;
 
if ((atomic_read(&regd_buf->ref_count) == 0) ||
atomic_dec_and_test(&regd_buf->ref_count)) {
@@ -60,8 +60,8 @@ int iser_regd_buff_release(struct iser_r
iser_unreg_mem(&regd_buf->reg);
 
if (regd_buf->dma_addr) {
-   dma_device = regd_buf->device->ib_device->dma_device;
-   dma_unmap_single(dma_device,
+   dev = regd_buf->device->ib_device;
+   ib_dma_unmap_single(dev,
 regd_buf->dma_addr,
 regd_buf->data_size,
 regd_buf->direction);
@@ -85,10 +85,10 @@ void iser_reg_single(struct iser_device 
 {
dma_addr_t dma_addr;
 
-   dma_addr  = dma_map_single(device->ib_device->dma_device,
-  regd_buf->virt_addr,
-  regd_buf->data_size, direction);
-   BUG_ON(dma_mapping_error(dma_addr));
+   dma_addr = ib_dma_map_single(device->ib_device,
+regd_buf->virt_addr,
+regd_buf->data_size, direction);
+   BUG_ON(ib_dma_mapping_error(device->ib_device, dma_addr));
 
regd_buf->reg.lkey = device->mr->lkey;
regd_buf->reg.len  = regd_buf->data_size;
@@ -106,7 +106,7 @@ int iser_start_rdma_unaligned_sg(struct 
 enum iser_data_dir cmd_dir)
 {
int dma_nents;
-   struct device *dma_device;
+   struct ib_device *dev;
char *mem = NULL;
struct iser_data_buf *data = &iser_ctask->data[cmd_dir];
unsigned long  cmd_data_len = data->data_len;
@@ -146,17 +146,12 @@ int iser_start_rdma_unaligned_sg(struct 
 
iser_ctask->data_copy[cmd_dir].copy_buf  = mem;
 
-   dma_device = iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-
-   if (cmd_dir == ISER_DIR_OUT)
-   dma_nents = dma_map_sg(dma_device,
-  &iser_ctask->data_copy[cmd_dir].sg_single,
-  1, DMA_TO_DEVICE);
-   else
-   dma_nents = dma_map_sg(dma_device,
-  &iser_ctask->data_copy[cmd_dir].sg_single,
-  1, DMA_FROM_DEVICE);
-
+   dev = iser_ctask->iser_conn->ib_conn->device->ib_device;
+   dma_nents = ib_dma_map_sg(dev,
+ &iser_ctask->data_copy[cmd_dir].sg_single,
+ 1,
+ (cmd_dir == ISER_DIR_OUT) ?
+ DMA_TO_DEVICE : DMA_FROM_DEVICE);
BUG_ON(dma_nents == 0);
 
iser_ctask->data_copy[cmd_dir].dma_nents = dma_nents;
@@ -169,19 +164,16 @@ void iser_finalize_rdma_unaligned_sg(str
 void iser_finalize_rdma_unaligned_sg(struct iscsi_iser_cmd_task *iser_ctask,
 enum iser_data_dir cmd_dir)
 {
-   struct device *dma_device;
+   struct ib_device *dev;
struct iser_data_buf *mem_copy;
unsigned long  cmd_data_len;
 
-   dma_device = iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-   mem_copy   = &iser_ctask->data_copy[cmd_dir];
-
-   if (cmd_dir == ISER_DIR_OUT)
-   dma_unmap_sg(dma_device, &mem_copy->sg_single, 1,
-DMA_TO_DEVICE);
-   else
-   dma_unmap_sg(dma_device, &mem_copy->sg_single, 1,
-DMA_FROM_DEVICE);
+   dev = iser_ctask->iser_conn->ib_conn->device->ib_device;
+   mem_copy = &iser_ctask->data_copy[cmd_dir];
+
+   ib_dma_unmap_sg(dev, &mem_copy->sg_single, 1,
+   (cmd_dir == ISER_DIR_OUT) ?
+   DMA_TO_DEVICE : DMA_FROM_DEVICE);
 
if (cmd_dir == ISER_DIR_IN) {
char *mem;
@@ -230,7 +222,8 @@ void iser_finalize_rdma_unaligned_sg(str
  * consecutive elements. Also, it handles one entry SG.
  */
 static int iser_sg_to_page_vec(struct iser_data_buf *data,
-  struct iser_page_vec *page_vec)
+  struct iser_page_vec *page_vec,
+  struct ib_device *ibdev)
 {
struct scatterlist *sg = (struct scatterlist *)data->buf;
dma_addr_t first_addr, last_addr, page;
@@ -24

[openib-general] [PATCH 3/7] IB/ipoib - Use the new verbs DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/ipoib - Use the new verbs DMA mapping functions

This patch converts IPoIB to use the new DMA mapping functions
for kernel verbs consumers.

From: Ralph Campbell <[EMAIL PROTECTED]>

diff -r f37bd0e41fec drivers/infiniband/ulp/ipoib/ipoib_ib.c
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c   Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c   Thu Oct 26 12:37:09 2006 -0800
@@ -109,9 +109,8 @@ static int ipoib_ib_post_receive(struct 
ret = ib_post_recv(priv->qp, &param, &bad_wr);
if (unlikely(ret)) {
ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret);
-   dma_unmap_single(priv->ca->dma_device,
-priv->rx_ring[id].mapping,
-IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping,
+   IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
dev_kfree_skb_any(priv->rx_ring[id].skb);
priv->rx_ring[id].skb = NULL;
}
@@ -136,10 +135,9 @@ static int ipoib_alloc_rx_skb(struct net
 */
skb_reserve(skb, 4);
 
-   addr = dma_map_single(priv->ca->dma_device,
- skb->data, IPOIB_BUF_SIZE,
- DMA_FROM_DEVICE);
-   if (unlikely(dma_mapping_error(addr))) {
+   addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE,
+DMA_FROM_DEVICE);
+   if (unlikely(ib_dma_mapping_error(priv->ca, addr))) {
dev_kfree_skb_any(skb);
return -EIO;
}
@@ -193,8 +191,8 @@ static void ipoib_ib_handle_rx_wc(struct
ipoib_warn(priv, "failed recv event "
   "(status=%d, wrid=%d vend_err %x)\n",
   wc->status, wr_id, wc->vendor_err);
-   dma_unmap_single(priv->ca->dma_device, addr,
-IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, addr,
+   IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
dev_kfree_skb_any(skb);
priv->rx_ring[wr_id].skb = NULL;
return;
@@ -212,8 +210,7 @@ static void ipoib_ib_handle_rx_wc(struct
ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n",
   wc->byte_len, wc->slid);
 
-   dma_unmap_single(priv->ca->dma_device, addr,
-IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, addr, IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
 
skb_put(skb, wc->byte_len);
skb_pull(skb, IB_GRH_BYTES);
@@ -261,10 +258,8 @@ static void ipoib_ib_handle_tx_wc(struct
 
tx_req = &priv->tx_ring[wr_id];
 
-   dma_unmap_single(priv->ca->dma_device,
-pci_unmap_addr(tx_req, mapping),
-tx_req->skb->len,
-DMA_TO_DEVICE);
+   ib_dma_unmap_single(priv->ca, pci_unmap_addr(tx_req, mapping),
+   tx_req->skb->len, DMA_TO_DEVICE);
 
++priv->stats.tx_packets;
priv->stats.tx_bytes += tx_req->skb->len;
@@ -353,9 +348,9 @@ void ipoib_send(struct net_device *dev, 
 */
tx_req = &priv->tx_ring[priv->tx_head & (ipoib_sendq_size - 1)];
tx_req->skb = skb;
-   addr = dma_map_single(priv->ca->dma_device, skb->data, skb->len,
- DMA_TO_DEVICE);
-   if (unlikely(dma_mapping_error(addr))) {
+   addr = ib_dma_map_single(priv->ca, skb->data, skb->len,
+DMA_TO_DEVICE);
+   if (unlikely(ib_dma_mapping_error(priv->ca, addr))) {
++priv->stats.tx_errors;
dev_kfree_skb_any(skb);
return;
@@ -366,8 +361,7 @@ void ipoib_send(struct net_device *dev, 
   address->ah, qpn, addr, skb->len))) {
ipoib_warn(priv, "post_send failed\n");
++priv->stats.tx_errors;
-   dma_unmap_single(priv->ca->dma_device, addr, skb->len,
-DMA_TO_DEVICE);
+   ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE);
dev_kfree_skb_any(skb);
} else {
dev->trans_start = jiffies;
@@ -537,24 +531,28 @@ int ipoib_ib_dev_stop(struct net_device 
while ((int) priv->tx_tail - (int) priv->tx_head < 0) {
tx_req = &priv->tx_ring[priv->tx_tail &
(ipoib_sendq_size - 1)];
-   dma_unmap_single(priv->ca->dma_device,
-pci_unmap_addr(tx_req, mapping),
-tx_req->skb->len,
-DMA_TO_DEVICE);
+   ib_dma_unmap_sin

[openib-general] [GIT PULL] please pull infiniband.git

2006-11-02 Thread Roland Dreier
Linus, please pull from

master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This includes various fixes for 2.6.19-rc5:

Erez Zilber (1):
  IB/iser: Start connection after enabling iSER

Jack Morgenstein (1):
  IB/uverbs: Return sq_draining value in query_qp response

Krishna Kumar (1):
  RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count

Michael S. Tsirkin (1):
  IB/mthca: Fix MAD extended header format for MAD_IFC firmware command

Paul Mackerras (1):
  IB/ehca: Fix eHCA driver compilation for uniprocessor

Sean Hefty (1):
  RDMA/addr: Use client registration to fix module unload race

Steve Wise (2):
  IB/amso1100: Use dma_alloc_coherent() instead of kmalloc/dma_map_single
  IB/amso1100: Fix incorrect pr_debug()

 drivers/infiniband/core/addr.c|   28 ++-
 drivers/infiniband/core/cma.c |   31 +++-
 drivers/infiniband/core/uverbs_cmd.c  |2 +-
 drivers/infiniband/hw/amso1100/c2_alloc.c |   13 +++
 drivers/infiniband/hw/amso1100/c2_cq.c|   18 +++--
 drivers/infiniband/hw/amso1100/c2_rnic.c  |   56 -
 drivers/infiniband/hw/ehca/ehca_tools.h   |1 +
 drivers/infiniband/hw/mthca/mthca_cmd.c   |   14 
 drivers/infiniband/ulp/iser/iscsi_iser.c  |4 +-
 include/rdma/ib_addr.h|   20 ++-
 include/rdma/ib_user_verbs.h  |2 +-
 11 files changed, 114 insertions(+), 75 deletions(-)
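
The client registration used by the RDMA/addr fix can be modeled in user space as a reference count plus a completion flag. What follows is a simplified, single-threaded sketch and not the kernel code: the model_* names are hypothetical, a plain int stands in for atomic_t, and a bool stands in for struct completion. It shows why unregistering cannot finish while a resolution request still holds a reference to the client.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct rdma_addr_client. */
struct model_client {
	int  refcount;   /* atomic_t in the kernel */
	bool completed;  /* stands in for struct completion */
};

/* Registration takes the initial reference, owned by the module. */
static void model_register(struct model_client *c)
{
	c->refcount = 1;
	c->completed = false;
}

/* Dropping the last reference signals the completion. */
static void model_put(struct model_client *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		c->completed = true;
}

/* A queued request pins its client until its callback has run. */
static void model_request_queued(struct model_client *c)
{
	c->refcount++;
}

static void model_request_done(struct model_client *c)
{
	model_put(c);
}

/*
 * Unregistration drops the module's reference. The kernel would then
 * sleep in wait_for_completion(); this model only reports whether that
 * wait could return immediately.
 */
static bool model_unregister_can_finish(struct model_client *c)
{
	model_put(c);
	return c->completed;
}

/* Walk through the unload scenario; returns 0 if the model behaves. */
static int model_demo(void)
{
	struct model_client c;

	model_register(&c);
	model_request_queued(&c);              /* a resolution is in flight */
	if (model_unregister_can_finish(&c))   /* client must still be pinned */
		return -1;
	model_request_done(&c);                /* callback has returned */
	return c.completed ? 0 : -1;
}
```

In this model the unload path is released only after the outstanding callback has returned, which is the ordering the patch is after.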


diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 60d3fbd..e11187e 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -47,6 +47,7 @@ struct addr_req {
struct sockaddr src_addr;
struct sockaddr dst_addr;
struct rdma_dev_addr *addr;
+   struct rdma_addr_client *client;
void *context;
void (*callback)(int status, struct sockaddr *src_addr,
 struct rdma_dev_addr *addr, void *context);
@@ -61,6 +62,26 @@ static LIST_HEAD(req_list);
 static DECLARE_WORK(work, process_req, NULL);
 static struct workqueue_struct *addr_wq;
 
+void rdma_addr_register_client(struct rdma_addr_client *client)
+{
+   atomic_set(&client->refcount, 1);
+   init_completion(&client->comp);
+}
+EXPORT_SYMBOL(rdma_addr_register_client);
+
+static inline void put_client(struct rdma_addr_client *client)
+{
+   if (atomic_dec_and_test(&client->refcount))
+   complete(&client->comp);
+}
+
+void rdma_addr_unregister_client(struct rdma_addr_client *client)
+{
+   put_client(client);
+   wait_for_completion(&client->comp);
+}
+EXPORT_SYMBOL(rdma_addr_unregister_client);
+
 int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev,
 const unsigned char *dst_dev_addr)
 {
@@ -229,6 +250,7 @@ static void process_req(void *data)
list_del(&req->list);
req->callback(req->status, &req->src_addr, req->addr,
  req->context);
+   put_client(req->client);
kfree(req);
}
 }
@@ -264,7 +286,8 @@ static int addr_resolve_local(struct soc
return ret;
 }
 
-int rdma_resolve_ip(struct sockaddr *src_addr, struct sockaddr *dst_addr,
+int rdma_resolve_ip(struct rdma_addr_client *client,
+   struct sockaddr *src_addr, struct sockaddr *dst_addr,
struct rdma_dev_addr *addr, int timeout_ms,
void (*callback)(int status, struct sockaddr *src_addr,
 struct rdma_dev_addr *addr, void *context),
@@ -285,6 +308,8 @@ int rdma_resolve_ip(struct sockaddr *src
req->addr = addr;
req->callback = callback;
req->context = context;
+   req->client = client;
+   atomic_inc(&client->refcount);
 
src_in = (struct sockaddr_in *) &req->src_addr;
dst_in = (struct sockaddr_in *) &req->dst_addr;
@@ -305,6 +330,7 @@ int rdma_resolve_ip(struct sockaddr *src
break;
default:
ret = req->status;
+   atomic_dec(&client->refcount);
kfree(req);
break;
}
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9ae4f3a..845090b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -63,6 +63,7 @@ static struct ib_client cma_client = {
 };
 
 static struct ib_sa_client sa_client;
+static struct rdma_addr_client addr_client;
 static LIST_HEAD(dev_list);
 static LIST_HEAD(listen_any_list);
 static DEFINE_MUTEX(lock);
@@ -1625,8 +1626,8 @@ int rdma_resolve_addr(struct rdma_cm_id
if (cma_any_addr(dst_addr))
ret = cma_resolve_loopback(id_priv);
else
-   ret = rdma_resolve

[openib-general] IB/ipath - Implement new verbs DMA mapping functions

2006-11-02 Thread Ralph Campbell
IB/ipath - Implement new verbs DMA mapping functions

This patch implements the interposing DMA mapping functions to allow
support for IOMMUs and remove the dependence on phys_to_virt().
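
The idea can be illustrated with a small user-space model. This is a sketch under stated assumptions and not the contents of ipath_dma.c: the model_* names are hypothetical, and the "DMA address" handed to the consumer is assumed to be the CPU virtual address itself, which matches the casts this patch introduces in ipath_keys.c and ipath_mr.c. Because the driver copies data with the CPU, it can recover the pointer with a cast instead of calling phys_to_virt().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t model_dma_addr_t;

/* Interposed map: hand back the CPU virtual address as the handle. */
static model_dma_addr_t model_map_single(void *cpu_addr, size_t size)
{
	assert(cpu_addr != NULL);
	(void)size;
	return (model_dma_addr_t)(uintptr_t)cpu_addr;
}

/* Driver side: the handle converts back to a pointer with a cast. */
static void *model_sge_vaddr(model_dma_addr_t addr)
{
	return (void *)(uintptr_t)addr;
}

/* Programmed I/O: the CPU copies the payload, no device DMA involved. */
static void model_pio_copy(void *dst, model_dma_addr_t addr, size_t len)
{
	memcpy(dst, model_sge_vaddr(addr), len);
}

/* Returns 1 if a buffer survives the map / cast / copy round trip. */
static int model_roundtrip_ok(void)
{
	char payload[] = "verbs";
	char wire[sizeof(payload)];
	model_dma_addr_t addr = model_map_single(payload, sizeof(payload));

	model_pio_copy(wire, addr, sizeof(payload));
	return model_sge_vaddr(addr) == (void *)payload &&
	       memcmp(wire, payload, sizeof(payload)) == 0;
}
```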

From: Ralph Campbell <[EMAIL PROTECTED]>

diff -r f37bd0e41fec drivers/infiniband/hw/ipath/Makefile
--- a/drivers/infiniband/hw/ipath/Makefile  Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/hw/ipath/Makefile  Thu Oct 26 11:16:16 2006 -0800
@@ -6,6 +6,7 @@ ib_ipath-y := \
 ib_ipath-y := \
ipath_cq.o \
ipath_diag.o \
+   ipath_dma.o \
ipath_driver.o \
ipath_eeprom.o \
ipath_file_ops.o \
diff -r f37bd0e41fec drivers/infiniband/hw/ipath/ipath_keys.c
--- a/drivers/infiniband/hw/ipath/ipath_keys.c  Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/hw/ipath/ipath_keys.c  Fri Oct 27 16:22:43 2006 -0800
@@ -134,7 +134,7 @@ int ipath_lkey_ok(struct ipath_qp *qp, s
 */
if (sge->lkey == 0) {
isge->mr = NULL;
-   isge->vaddr = bus_to_virt(sge->addr);
+   isge->vaddr = (void *) sge->addr;
isge->length = sge->length;
isge->sge_length = sge->length;
ret = 1;
@@ -202,12 +202,12 @@ int ipath_rkey_ok(struct ipath_qp *qp, s
int ret;
 
/*
-* We use RKEY == zero for physical addresses
-* (see ipath_get_dma_mr).
+* We use RKEY == zero for kernel virtual addresses
+* (see ipath_get_dma_mr and ipath_dma.c).
 */
if (rkey == 0) {
sge->mr = NULL;
-   sge->vaddr = phys_to_virt(vaddr);
+   sge->vaddr = (void *) vaddr;
sge->length = len;
sge->sge_length = len;
ss->sg_list = NULL;
diff -r f37bd0e41fec drivers/infiniband/hw/ipath/ipath_mr.c
--- a/drivers/infiniband/hw/ipath/ipath_mr.c  Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/hw/ipath/ipath_mr.cThu Oct 26 13:35:12 2006 -0800
@@ -54,6 +54,8 @@ static inline struct ipath_fmr *to_ifmr(
  * @acc: access flags
  *
  * Returns the memory region on success, otherwise returns an errno.
+ * Note that all DMA addresses should be created via the
+ * struct ib_dma_mapping_ops functions (see ipath_dma.c).
  */
 struct ib_mr *ipath_get_dma_mr(struct ib_pd *pd, int acc)
 {
@@ -149,8 +151,7 @@ struct ib_mr *ipath_reg_phys_mr(struct i
m = 0;
n = 0;
for (i = 0; i < num_phys_buf; i++) {
-   mr->mr.map[m]->segs[n].vaddr =
-   phys_to_virt(buffer_list[i].addr);
+   mr->mr.map[m]->segs[n].vaddr = (void *) buffer_list[i].addr;
mr->mr.map[m]->segs[n].length = buffer_list[i].size;
mr->mr.length += buffer_list[i].size;
n++;
@@ -347,7 +348,7 @@ int ipath_map_phys_fmr(struct ib_fmr *ib
n = 0;
ps = 1 << fmr->page_shift;
for (i = 0; i < list_len; i++) {
-   fmr->mr.map[m]->segs[n].vaddr = phys_to_virt(page_list[i]);
+   fmr->mr.map[m]->segs[n].vaddr = (void *) page_list[i];
fmr->mr.map[m]->segs[n].length = ps;
if (++n == IPATH_SEGSZ) {
m++;
diff -r f37bd0e41fec drivers/infiniband/hw/ipath/ipath_verbs.c
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c Thu Oct 26 11:17:23 2006 -0800
@@ -1599,6 +1599,7 @@ int ipath_register_ib_device(struct ipat
dev->detach_mcast = ipath_multicast_detach;
dev->process_mad = ipath_process_mad;
dev->mmap = ipath_mmap;
+   dev->dma_ops = &ipath_dma_mapping_ops;
 
snprintf(dev->node_desc, sizeof(dev->node_desc),
 IPATH_IDSTR " %s", init_utsname()->nodename);
diff -r f37bd0e41fec drivers/infiniband/hw/ipath/ipath_verbs.h
--- a/drivers/infiniband/hw/ipath/ipath_verbs.h Thu Oct 26 21:44:41 2006 +0700
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.h Thu Oct 26 11:17:38 2006 -0800
@@ -812,4 +812,6 @@ extern unsigned int ib_ipath_max_srq_wrs
 
 extern const u32 ib_ipath_rnr_table[];
 
+extern struct ib_dma_mapping_ops ipath_dma_mapping_ops;
+
 #endif /* IPATH_VERBS_H */
diff -r f37bd0e41fec drivers/infiniband/hw/ipath/ipath_dma.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_dma.c   Fri Oct 27 10:40:03 2006 -0800
@@ -0,0 +1,229 @@
+/*
+ * Copyright (c) 2006 QLogic, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  

[openib-general] [PATCH 1/7] IB/core - Add DMA mapping functions to allow device drivers to interpose

2006-11-02 Thread Ralph Campbell
IB/core - Add DMA mapping functions to allow device drivers to interpose

The QLogic InfiniPath HCAs use programmed I/O instead of HW DMA.
This patch allows a verbs device driver to interpose on DMA mapping
function calls in order to avoid relying on bus_to_virt() and
phys_to_virt() to undo the mappings created by dma_map_single(),
dma_map_sg(), etc.
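
The interposition amounts to an optional function table with a fallback to the generic path. The following user-space sketch mirrors the shape of the ib_dma_map_single() wrapper in this patch. It is only a model: the model_* and pio_* names are hypothetical, and bus_offset is an invented stand-in for whatever translation the platform DMA layer would apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;
struct model_device;

/* Optional per-device overrides, analogous to struct ib_dma_mapping_ops. */
struct model_dma_ops {
	model_dma_addr_t (*map_single)(struct model_device *dev,
				       void *ptr, size_t size);
};

struct model_device {
	const struct model_dma_ops *dma_ops; /* NULL selects the generic path */
	uint64_t bus_offset;                 /* hypothetical platform translation */
};

/* Stand-in for dma_map_single(): applies the platform translation. */
static model_dma_addr_t generic_map_single(struct model_device *dev,
					   void *ptr, size_t size)
{
	(void)size;
	return (model_dma_addr_t)(uintptr_t)ptr + dev->bus_offset;
}

/* Wrapper used by consumers: driver ops if present, else generic. */
static model_dma_addr_t model_map_single(struct model_device *dev,
					 void *ptr, size_t size)
{
	assert(dev != NULL);
	return dev->dma_ops ?
		dev->dma_ops->map_single(dev, ptr, size) :
		generic_map_single(dev, ptr, size);
}

/* A programmed-I/O driver can return the virtual address unchanged. */
static model_dma_addr_t pio_map_single(struct model_device *dev,
				       void *ptr, size_t size)
{
	(void)dev;
	(void)size;
	return (model_dma_addr_t)(uintptr_t)ptr;
}

static const struct model_dma_ops pio_ops = {
	.map_single = pio_map_single,
};
```

A device that leaves dma_ops NULL keeps the existing behavior, so only drivers that need to interpose have to supply the table.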

From: Ralph Campbell <[EMAIL PROTECTED]>

diff -r f37bd0e41fec include/rdma/ib_verbs.h
--- a/include/rdma/ib_verbs.h   Thu Oct 26 21:44:41 2006 +0700
+++ b/include/rdma/ib_verbs.h   Thu Oct 26 16:10:04 2006 -0800
@@ -43,6 +43,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -846,6 +848,42 @@ struct ib_cache {
struct ib_pkey_cache  **pkey_cache;
struct ib_gid_cache   **gid_cache;
u8 *lmc_cache;
+};
+
+struct ib_dma_mapping_ops {
+   int (*mapping_error)(struct ib_device *dev,
+dma_addr_t dma_addr);
+   dma_addr_t  (*map_single)(struct ib_device *dev,
+ void *ptr, size_t size,
+ enum dma_data_direction direction);
+   void(*unmap_single)(struct ib_device *dev,
+   dma_addr_t addr, size_t size,
+   enum dma_data_direction direction);
+   dma_addr_t  (*map_page)(struct ib_device *dev,
+   struct page *page, unsigned long offset,
+   size_t size,
+   enum dma_data_direction direction);
+   void(*unmap_page)(struct ib_device *dev,
+ dma_addr_t addr, size_t size,
+ enum dma_data_direction direction);
+   int (*map_sg)(struct ib_device *dev,
+ struct scatterlist *sg, int nents,
+ enum dma_data_direction direction);
+   void(*unmap_sg)(struct ib_device *dev,
+   struct scatterlist *sg, int nents,
+   enum dma_data_direction direction);
+   dma_addr_t  (*dma_address)(struct ib_device *dev,
+  struct scatterlist *sg);
+   unsigned int(*dma_len)(struct ib_device *dev,
+  struct scatterlist *sg);
+   void(*sync_single_for_cpu)(struct ib_device *dev,
+  dma_addr_t dma_handle,
+  size_t size,
+  enum dma_data_direction dir);
+   void(*sync_single_for_device)(struct ib_device *dev,
+ dma_addr_t dma_handle,
+ size_t size,
+ enum dma_data_direction dir);
 };
 
 struct iw_cm_verbs;
@@ -992,6 +1030,8 @@ struct ib_device {
  struct ib_mad *in_mad,
  struct ib_mad *out_mad);
 
+   struct ib_dma_mapping_ops   *dma_ops;
+
struct module   *owner;
struct class_device  class_dev;
struct kobject   ports_parent;
@@ -1395,8 +1435,182 @@ static inline int ib_req_ncomp_notif(str
  *   usable for DMA.
  * @pd: The protection domain associated with the memory region.
  * @mr_access_flags: Specifies the memory access rights.
+ *
+ * Note that the ib_dma_*() functions defined below must be used 
+ * to create/destroy addresses used with the Lkey or Rkey returned
+ * by ib_get_dma_mr().
  */
 struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
+
+/**
+ * ib_dma_mapping_error - check a dma_addr_t for error
+ * @device: The device for which the dma_addr was created
+ * @dma_addr: The DMA address to check
+ */
+static inline int ib_dma_mapping_error(struct ib_device *dev,
+  dma_addr_t dma_addr)
+{
+   return dev->dma_ops ?
+   dev->dma_ops->mapping_error(dev, dma_addr) :
+   dma_mapping_error(dma_addr);
+}
+
+/**
+ * ib_dma_map_single - Map a kernel virtual address to DMA address
+ * @device: The device for which the dma_addr is to be created
+ * @cpu_addr: The kernel virtual address
+ * @size: The size of the region in bytes
+ * @direction: The direction of the DMA
+ */
+static inline dma_addr_t ib_dma_map_single(struct ib_device *dev,
+  void *cpu_addr, size_t size,
+  enum dma_data_direction direction)
+{
+   return dev->dma_ops ?
+   dev->dma_ops->map_single(dev, cpu_addr, size, direction) :
+   dma_map_single(dev->dma_device, cpu_addr

Re: [openib-general] [PATCH v2] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Roland Dreier
Thanks, applied.




Re: [openib-general] [PATCH TRIVIAL] management/libib*: strip trailing whitespaces

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 08:01, Sasha Khapyorsky wrote:
> Strip trailing whitespaces for libibcommon, libibumad, libibmad.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal





[openib-general] local QP error notification

2006-11-02 Thread somenath

0. After disconnecting a cable for a connected QP (we actually used the
switch interface to disable the port), we get errors when we try to send
a packet on that port. This is expected. But:

1. We don't get errors when we post a receive. Is this expected?
2. We don't get any of the IB_EVENT_CQ_ERR or IB_EVENT_QP_FATAL
(ib_event_type) errors.

a. When is a good time for ULPs to issue a disconnect and destroy the QPs
in such cases? Should they do it at step 0, or should they expect the
errors in steps 1 and 2 to occur?

b. What is expected when we reconnect the cable? I assume that for an RC
QP we must destroy the connection/QP after such an error, because it will
never return to a usable state. Is that right?

thanks, som.





Re: [openib-general] [PATCH TRIVIAL] diags: strip trailing whitespaces

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 08:04, Sasha Khapyorsky wrote:
> Strip trailing whitespaces in diags.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal





Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Venkatesh Babu
I made the change you suggested.
On the Active node I got the IB_EVENT_PATH_MIG event, and then the send
failed with IB_WC_RETRY_EXC_ERR.
On the Passive node I got 100 IB_EVENT_PATH_MIG_ERR events.

 VBabu

Sean Hefty wrote:

> Venkatesh Babu wrote:
>
>> I have made the changes to steps 6, 9.2 and 11. In step 9.2 
>> ib_cm_init_qp_attr() failed with -22 and then the RC QP failed with 
>> IB_WC_RETRY_EXC_ERR.
>
>
> Did you set qp_attr.qp_state = IB_QPS_RTS before calling 
> ib_cm_init_qp_attr()? If not, can you try this?
>
> - Sean





Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Sean Hefty
Venkatesh Babu wrote:
> I have made the changes to steps 6, 9.2 and 11. In step 9.2 
> ib_cm_init_qp_attr() failed with -22 and then the RC QP failed with 
> IB_WC_RETRY_EXC_ERR.

Did you set qp_attr.qp_state = IB_QPS_RTS before calling ib_cm_init_qp_attr()? 
If not, can you try this?

- Sean




Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Venkatesh Babu
I have made the changes to steps 6, 9.2 and 11. In step 9.2 
ib_cm_init_qp_attr() failed with -22 and then the RC QP failed with 
IB_WC_RETRY_EXC_ERR.

 VBabu

Sean Hefty wrote:

>>Let me make the steps clear -
>
>This helps - thanks.
>
>> 1. On Passive node register for remote port UP/DOWN event by
>>registering with ib_sa_serv_notice_hdlr()
>
>FYI - patches for this are being worked separately.
>
>> 2. On Passive node start the listener by calling ib_cm_listen().
>> 3. On Active node create the RC QP and establish the connection by
>>calling ib_send_cm_req(). In struct ib_cm_req_param specify both primary
>>path (say, through Port1) and alternate path (say, through Port2).
>>NOTE:-Assume Port1 of Active node is connected to Port1 of Passive node;
>>and Port2 of Active node is connected to Port2 of  Passive node.
>>NOTE:- After this step QP's path_mig_state will be IB_MIG_ARMED.
>> 4. Let us say, Port1 on Active node fails
>> 5. IB_EVENT_PORT_ERR event is generated on  Active node; and remote
>>port error event is generated on Passive node.
>> 6. In those event handlers call ib_modify_qp() to set the
>>path_mig_state to IB_MIG_MIGRATED. This will let the HCA's firmware know
>>to switch to the alternate path.
>
>At least the active side in your scenario should call ib_cm_notify() after this
>step.  Otherwise, the LAP will go out the primary path, which is down.  This
>isn't a big deal in your test case, since you wait for the primary path to
>return (step 7) before calling ib_send_cm_lap().
>
>> 7. After a while, Port1 comes back again.
>> 8. IB_EVENT_PORT_ACTIVE event is generated on Active node; and remote
>>port active event is generated on Passive node.
>> 9. On the Active node from  IB_EVENT_PORT_ACTIVE event handler call
>>the ib_send_cm_lap() to send the alternate path (through Port1) to the
>>Passive node.
>>   9.1 Passive node receives the LAP message
>
>The proposed patch will record the alternate path when the LAP is sent or
>received.  (Again, these patches are untested, so there can be some bugs here.
>I'm still working on writing a test program to use these interfaces.)
>
>>   9.2 Calls ib_cm_init_rearm_attr() initialize the alternate path info
>
>This should now call ib_cm_init_qp_attr().
>
>>   9.3 Calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM
>>   9.4 Send APR message back to the Active node.
>>10. Active node receives the APR message
>>11. Calls ib_cm_init_rearm_attr() initialize the alternate path info
>
>This should now call ib_cm_init_qp_attr().
>
>>12. Calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM
>>13. Now when a first packet is passed between the Active and Passive
>>node the ib_core changes the path_mig_state to the IB_MIG_ARMED.
>> 14. Now it is all set for another failover.
>
>Using the proposed patches, where did you see a failure?
>
>- Sean




Re: [openib-general] opensm crash with topspin HCA

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 13:33, Viswanath Krishnamurthy wrote:
> 
> When we run opensm (OFED) release and if a Topspin HCA is in the IB
> network, opensm crashes in umad_receiver with NULL pointer exception. 
> The transaction ID is zero is the MAD'S from topspin HCA on windows.
> The crashes seems to random in umad_receiver. 

What OpenSM version ? 

There was a problem like this fixed back at the end of August:

r8920 | halr | 2006-08-14 09:09:28 -0400 (Mon, 14 Aug 2006) | 11 lines

OpenSM/osm_vendor_ibumad.c: In get_madw, check for TID 0 (resolves
NULL ptr crash with Cisco stack)

This change fixes an OSM crash when working with Cisco's stack.
Cisco's stack doesn't follow the same TID convention when generating
transaction id which in some bad flow revealed this bug in the get_madw lookup.

The bug was in get_madw which does not detect lookup of its reserved "free"
entry of key==0.

Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>
Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>
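
The kind of bug described in that log entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenSM code: `struct demo_slot` and `demo_get_madw()` are hypothetical names, and the table layout (a slot whose key is 0 counts as free) is inferred only from the commit message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 4

/* Hypothetical match table: a slot with tid == 0 is considered free. */
struct demo_slot {
    uint64_t tid;    /* 0 marks a free slot */
    void    *madw;   /* outstanding request, NULL when the slot is free */
};

/* Find and release the outstanding request that matches `tid`.
 * Returns 1 and stores the request in *out on a match, 0 otherwise.
 * Without the tid == 0 check, a response carrying TID 0 would "match"
 * the first free slot and report success with a NULL request. */
int demo_get_madw(struct demo_slot *tbl, size_t n, uint64_t tid, void **out)
{
    size_t i;

    assert(tbl != NULL && out != NULL);
    *out = NULL;
    if (tid == 0)            /* reserved "free" key: never a valid match */
        return 0;
    for (i = 0; i < n; i++) {
        if (tbl[i].tid == tid) {
            *out = tbl[i].madw;
            tbl[i].tid = 0;  /* release the slot */
            tbl[i].madw = NULL;
            return 1;
        }
    }
    return 0;
}
```

A caller that only dereferences the result when the function returns 1 is then safe against peers that send TID 0.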

-- Hal

> HCA found:
> hca_id=InfiniHost0
> vendor_id=0x02C9
> vendor_part_id=0x5A44
> hw_ver=0xA0
> fw_ver=0x40006





Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Arlin Davis
Sean Hefty wrote:

>One option is having the SA (or ib_umad?) return a busy status in response to
>a MAD, but we'd still have to be able to send this response as quickly as
>requests are being received.  We could then limit the number of requests that
>would be queued in the kernel for a user.

Another great option would be to have path record caching. Unfortunately 
OFED 1.1 did not include ib_local_sa in the release.




Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Sean Hefty
> Let me make the steps clear -

This helps - thanks.

>  1. On Passive node register for remote port UP/DOWN event by
>registering with ib_sa_serv_notice_hdlr()

FYI - patches for this are being worked separately.

>  2. On Passive node start the listener by calling ib_cm_listen().
>  3. On Active node create the RC QP and establish the connection by
>calling ib_send_cm_req(). In struct ib_cm_req_param specify both primary
>path (say, through Port1) and alternate path (say, through Port2).
>NOTE:-Assume Port1 of Active node is connected to Port1 of Passive node;
>and Port2 of Active node is connected to Port2 of  Passive node.
>NOTE:- After this step QP's path_mig_state will be IB_MIG_ARMED.
>  4. Let us say, Port1 on Active node fails
>  5. IB_EVENT_PORT_ERR event is generated on  Active node; and remote
>port error event is generated on Passive node.
>  6. In those event handlers call ib_modify_qp() to set the
>path_mig_state to IB_MIG_MIGRATED. This will let the HCA's firmware know
>to switch to the alternate path.

At least the active side in your scenario should call ib_cm_notify() after this
step.  Otherwise, the LAP will go out the primary path, which is down.  This
isn't a big deal in your test case, since you wait for the primary path to
return (step 7) before calling ib_send_cm_lap().

>  7. After a while, Port1 comes back again.
>  8. IB_EVENT_PORT_ACTIVE event is generated on Active node; and remote
>port active event is generated on Passive node.
>  9. On the Active node from  IB_EVENT_PORT_ACTIVE event handler call
>the ib_send_cm_lap() to send the alternate path (through Port1) to the
>Passive node.
>9.1 Passive node receives the LAP message

The proposed patch will record the alternate path when the LAP is sent or
received.  (Again, these patches are untested, so there can be some bugs here.
I'm still working on writing a test program to use these interfaces.)

>9.2 Calls ib_cm_init_rearm_attr() initialize the alternate path info

This should now call ib_cm_init_qp_attr().

>9.3 Calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM
>9.4 Send APR message back to the Active node.
> 10. Active node receives the APR message
> 11. Calls ib_cm_init_rearm_attr() initialize the alternate path info

This should now call ib_cm_init_qp_attr().

> 12. Calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM
> 13. Now when a first packet is passed between the Active and Passive
>node the ib_core changes the path_mig_state to the IB_MIG_ARMED.
>  14. Now it is all set for another failover.

Using the proposed patches, where did you see a failure?

- Sean




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Sean Hefty
Michael S. Tsirkin wrote:
> Adding registration at module start/stop seems simple enough and overhead is
> minimal. We already have this for other modules (e.g. ib_sa).  I don't really
> understand why there is such resistance to this simple fix for unload race?

There's no resistance if someone is using it.  There's just an easier solution 
if no one is or was going to use it...  The change just isn't quite as trivial 
with the ib_cm or rdma_cm as it was with the ib_sa or ib_addr.

- Sean




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-11-02 Thread Roland Dreier
 > By the way, what's up with this project?
 > It's still planned for libibverbs 1.1, isn't it?

I'm working on it along with other things.

Where are your patches for using multiple EQs for CQ events? :)

 - R.




[openib-general] [PATCH v2] opensm: strict osm_log arguments/format check

2006-11-02 Thread Sasha Khapyorsky

This adds gcc attribute to osm_log() which causes the compiler to check
argument types against a format string. And also there are related fixes
in osm_log() usage.
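
For readers unfamiliar with the attribute, here is a minimal standalone sketch of the same mechanism. `demo_log()` and `DEMO_PRINTF_CHECK` are hypothetical names used only for this illustration; like osm_log(), the function takes its format string as parameter 3 and its variadic arguments from parameter 4 onward, which is what the `(3, 4)` pair encodes.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNUC__
#define DEMO_PRINTF_CHECK(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define DEMO_PRINTF_CHECK(fmt_idx, first_arg)
#endif

/* Hypothetical stand-in for osm_log(): the format string is parameter 3
 * and the variadic arguments start at parameter 4, hence (3, 4). */
int demo_log(char *buf, size_t len, const char *fmt, ...)
    DEMO_PRINTF_CHECK(3, 4);

/* Format the message into buf; returns what vsnprintf() returns. */
int demo_log(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place and format warnings enabled (for example via -Wall), a call such as `demo_log(buf, sizeof buf, "file (%s)\n")` with no matching argument should draw a compile-time warning, which is the class of mistake the hunks below correct.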

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 osm/include/opensm/osm_log.h |8 +++-
 osm/libvendor/osm_vendor_ibumad_sa.c |2 +-
 osm/opensm/main.c|3 ++-
 osm/opensm/osm_pkey_mgr.c|1 +
 osm/opensm/osm_port_info_rcv.c   |5 +++--
 osm/opensm/osm_sa_informinfo.c   |4 ++--
 osm/opensm/osm_sa_link_record.c  |8 
 osm/opensm/osm_sa_mad_ctrl.c |3 ++-
 osm/opensm/osm_sa_response.c |2 +-
 osm/opensm/osm_sm_state_mgr.c|3 ++-
 osm/opensm/osm_sminfo_rcv.c  |9 +
 osm/opensm/osm_state_mgr.c   |8 
 osm/osmtest/osmt_multicast.c |   12 +++-
 osm/osmtest/osmt_service.c   |6 +++---
 osm/osmtest/osmtest.c|8 
 15 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
index 6a1a93f..f51a1c8 100644
--- a/osm/include/opensm/osm_log.h
+++ b/osm/include/opensm/osm_log.h
@@ -60,6 +60,12 @@ #include 
 #include 
 #include 
 
+#ifdef __GNUC__
+#define STRICT_OSM_LOG_FORMAT __attribute__((format(printf, 3, 4)))
+#else
+#define STRICT_OSM_LOG_FORMAT
+#endif
+
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
 #  define END_C_DECLS   }
@@ -377,7 +383,7 @@ void
 osm_log(
IN osm_log_t* const p_log,
IN const osm_log_level_t verbosity,
-   IN const char *p_str, ... );
+   IN const char *p_str, ... ) STRICT_OSM_LOG_FORMAT;
 
 void
 osm_log_raw(
diff --git a/osm/libvendor/osm_vendor_ibumad_sa.c b/osm/libvendor/osm_vendor_ibumad_sa.c
index 7fd0655..7c4a2f7 100644
--- a/osm/libvendor/osm_vendor_ibumad_sa.c
+++ b/osm/libvendor/osm_vendor_ibumad_sa.c
@@ -853,7 +853,7 @@ #ifdef DUAL_SIDED_RMPP
 if ( p_mpr_req->sgid_count + p_mpr_req->dgid_count > IB_MULTIPATH_MAX_GIDS )
 {
   osm_log( p_log, OSM_LOG_ERROR,
-   "osmv_query_sa DBG:001 MULTIPATH_REC ",
+   "osmv_query_sa DBG:001 MULTIPATH_REC "
"SGID count %d DGID count %d max count %d\n",
 p_mpr_req->sgid_count, p_mpr_req->dgid_count,
 IB_MULTIPATH_MAX_GIDS );
diff --git a/osm/opensm/main.c b/osm/opensm/main.c
index 729702a..752b546 100644
--- a/osm/opensm/main.c
+++ b/osm/opensm/main.c
@@ -460,7 +460,8 @@ parse_ignore_guids_file(IN char *guids_f
   {
 osm_log( &p_osm->log, OSM_LOG_ERROR,
  "parse_ignore_guids_file: ERR 0601: "
- "Unable to open ignore guids file (%s)\n" );
+ "Unable to open ignore guids file (%s)\n",
+ guids_file_name );
 status = IB_ERROR;
 goto Exit;
   }
diff --git a/osm/opensm/osm_pkey_mgr.c b/osm/opensm/osm_pkey_mgr.c
index f2cb221..735dc14 100644
--- a/osm/opensm/osm_pkey_mgr.c
+++ b/osm/opensm/osm_pkey_mgr.c
@@ -139,6 +139,7 @@ pkey_mgr_process_physical_port(
   "pkey_mgr_process_physical_port: ERR 0503: "
   "Failed to obtain P_Key 0x%04x block and index for node "
   "0x%016" PRIx64 " port %u\n",
+  ib_pkey_get_base( pkey ),
   cl_ntoh64( osm_node_get_node_guid( p_node ) ),
   osm_physp_get_port_num( p_physp ) );
   return;
diff --git a/osm/opensm/osm_port_info_rcv.c b/osm/opensm/osm_port_info_rcv.c
index 95112dc..f6d3595 100644
--- a/osm/opensm/osm_port_info_rcv.c
+++ b/osm/opensm/osm_port_info_rcv.c
@@ -724,8 +724,9 @@ osm_pi_rcv_process(
   {
 osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
  "osm_pi_rcv_process: "
- "Got light sweep response from remote port of parent node GUID = 0x%" PRIx64
- " port = %u, Commencing heavy sweep\n",
+ "Got light sweep response from remote port of parent node "
+ "GUID = 0x%" PRIx64 " port = 0x%016" PRIx64
+ ", Commencing heavy sweep\n",
  cl_ntoh64( node_guid ),
  cl_ntoh64( port_guid ) );
 osm_state_mgr_process( p_rcv->p_state_mgr,
diff --git a/osm/opensm/osm_sa_informinfo.c b/osm/opensm/osm_sa_informinfo.c
index 69dca1d..0cec307 100644
--- a/osm/opensm/osm_sa_informinfo.c
+++ b/osm/opensm/osm_sa_informinfo.c
@@ -163,8 +163,8 @@ __validate_ports_access_rights(
 {
   osm_log( p_rcv->p_log, OSM_LOG_ERROR,
"__validate_ports_access_rights: ERR 4301: "
-   "Invalid port guid: 0x%016\n",
-   portguid );
+   "Invalid port guid: 0x%016" PRIx64 "\n",
+   cl_ntoh64(portguid) );
   valid = FALSE;
   goto Exit;
 }
diff --git a/osm/opensm/osm_sa_link_record.c b/osm/opensm/osm_sa_link_record.c
index 751023f..0ca9092 100644
--- a/osm/opensm/osm_sa_link_record.c
+++ b/osm/opensm/osm_sa_link_record.c
@@ -145,10 +145,10 @@ __osm_lr_rcv_build_physp_link(
 osm_log( p_

Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Sean Hefty
> We had an option to increase the RQ size for QP1 and QP0.
> This might help you too: try increasing IB_MAD_QP_RECV_SIZE.

Actually, dropping the requests helps scalability.

If nothing gets dropped, the backlog of queued requests grows to hundreds of 
thousands, most of which will have timed out before the SA can get around to 
processing them.

One option is having the SA (or ib_umad?) return a busy status in response to a
MAD, but we'd still have to be able to send this response as quickly as
requests are being received.  We could then limit the number of requests that
would be queued in the kernel for a user.

Unfortunately, when we are able to run on the cluster, modifying the kernel
modules isn't an option available to us...

- Sean




Re: [openib-general] opensm crash with topspin HCA

2006-11-02 Thread Sasha Khapyorsky
On 10:33 Thu 02 Nov , Viswanath Krishnamurthy wrote:
> When we run opensm (OFED) release and if a Topspin HCA is in the IB network,
> opensm crashes in umad_receiver with NULL pointer exception. 

Do you have any logs, gdb backtrace or any other details?

Sasha

> The
> transaction ID is zero is the MAD'S from topspin HCA on windows. The crashes
> seems to random in umad_receiver.
> 
> HCA found:
>hca_id=InfiniHost0
>vendor_id=0x02C9
>vendor_part_id=0x5A44
>hw_ver=0xA0
>fw_ver=0x40006





Re: [openib-general] [RFC] [PATCH] rdma/ib_cm: fix APM support

2006-11-02 Thread Venkatesh Babu
Sean Hefty wrote:

>>Are these changes to replace ib_cm_init_rearm_attr() interface ?
>
>Yes - you use ib_cm_init_qp_attr() to get the qp_attr after loading a new
>alternate path.  The new path is loaded using ib_send_cm_lap().  So, after a
>path fails:

  After a path fails, I just call ib_modify_qp() on both the active and
passive sides to switch to the alternate path by changing path_mig_state
to IB_MIG_MIGRATED.

 Let me make the steps clear -
  1. On the Passive node, register for remote port UP/DOWN events by
     registering with ib_sa_serv_notice_hdlr().
  2. On the Passive node, start the listener by calling ib_cm_listen().
  3. On the Active node, create the RC QP and establish the connection by
     calling ib_send_cm_req(). In struct ib_cm_req_param specify both the
     primary path (say, through Port1) and the alternate path (say, through
     Port2).
     NOTE: Assume Port1 of the Active node is connected to Port1 of the
     Passive node, and Port2 of the Active node is connected to Port2 of
     the Passive node.
     NOTE: After this step the QP's path_mig_state will be IB_MIG_ARMED.
  4. Let us say Port1 on the Active node fails.
  5. An IB_EVENT_PORT_ERR event is generated on the Active node, and a
     remote port error event is generated on the Passive node.
  6. In those event handlers, call ib_modify_qp() to set the
     path_mig_state to IB_MIG_MIGRATED. This will let the HCA's firmware
     know to switch to the alternate path.
  7. After a while, Port1 comes back again.
  8. An IB_EVENT_PORT_ACTIVE event is generated on the Active node, and a
     remote port active event is generated on the Passive node.
  9. On the Active node, from the IB_EVENT_PORT_ACTIVE event handler, call
     ib_send_cm_lap() to send the alternate path (through Port1) to the
     Passive node.
     9.1 The Passive node receives the LAP message.
     9.2 It calls ib_cm_init_rearm_attr() to initialize the alternate path
         info.
     9.3 It calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM.
     9.4 It sends the APR message back to the Active node.
 10. The Active node receives the APR message.
 11. It calls ib_cm_init_rearm_attr() to initialize the alternate path info.
 12. It calls ib_modify_qp() to update path_mig_state to IB_MIG_REARM.
 13. When the first packet is passed between the Active and Passive nodes,
     the ib_core changes the path_mig_state to IB_MIG_ARMED.
 14. Now it is all set for another failover.
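
The path_mig_state changes in the steps above can be sketched as a small state machine. This is a toy model in plain C, not kernel code: the enums, `apm_next()` and `apm_run()` are hypothetical names, and only the three states mirror the IB_MIG_* values named in the steps. It models just the sequence listed here and ignores other legal transitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the QP path migration states described in the steps above.
 * These names are illustrative only; they are not the kernel's enums. */
enum mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

enum apm_event {
    EV_FAILOVER,       /* steps 4-6: port error, switch to alternate path */
    EV_LOAD_ALT_PATH,  /* steps 9-12: new alternate path loaded, rearm */
    EV_FIRST_PACKET    /* step 13: traffic flows, state becomes armed */
};

/* Return the next state, or the current state unchanged if the event is
 * not part of the sequence for that state (e.g. failover while unarmed). */
enum mig_state apm_next(enum mig_state cur, enum apm_event ev)
{
    switch (ev) {
    case EV_FAILOVER:
        return cur == MIG_ARMED ? MIG_MIGRATED : cur;
    case EV_LOAD_ALT_PATH:
        return cur == MIG_MIGRATED ? MIG_REARM : cur;
    case EV_FIRST_PACKET:
        return cur == MIG_REARM ? MIG_ARMED : cur;
    }
    return cur;
}

/* Apply a sequence of n events, starting from `start`. */
enum mig_state apm_run(enum mig_state start, const enum apm_event *evs,
                       size_t n)
{
    enum mig_state s = start;
    size_t i;

    assert(evs != NULL || n == 0);
    for (i = 0; i < n; i++)
        s = apm_next(s, evs[i]);
    return s;
}
```

Running failover, alternate path load, and first packet in order from the armed state ends in the armed state again, which corresponds to step 14.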

>One side calls ib_send_cm_lap() to propose a new alternate path.
>Second side responds by calling ib_send_cm_apr().
>Both sides call ib_cm_init_qp_attr(), then ib_modify_qp() to load the new path.
>
>This is intended to work if failover has occurred, or if the user detects that
>the alternate path is down and wants to replace it.
>
>There is an additional call, ib_cm_notify() which is used to let the CM know
>that the primary path has failed, and the alternate path should be used when
>sending future CM messages.  In case of failover, this needs to be called
>before calling ib_send_cm_lap() to ensure that the LAP message reaches the
>remote user.
>
>>The path migration from Primary to Alternate succeeded, then reloaded
>>the alternate path.
>
>How did you reload the alternate path?

  Steps 9 through 12.

>>failed with the IB_WC_RETRY_EXC_ERR. But I got the event IB_EVENT_PATH_MIG.
>>
>>With the ib_cm_init_rearm_attr() being called, failover/failback worked
>>fine.
>
>Were you calling ib_send_cm_lap() to load a new alternate path,

   Step 9

>or just assuming
>that the old path would work after failover occurred?

   Before failover occurs, the QP's path_mig_state must be IB_MIG_ARMED;
otherwise failover doesn't work.
If it is IB_MIG_ARMED, then the alternate path is already loaded, and just
calling ib_modify_qp() to update path_mig_state to IB_MIG_MIGRATED will
discard the primary path and make the alternate path the primary path.

>- Sean




[openib-general] opensm crash with topspin HCA

2006-11-02 Thread Viswanath Krishnamurthy
When we run the opensm (OFED) release and a Topspin HCA is in the IB network,
opensm crashes in umad_receiver with a NULL pointer exception.  The
transaction ID is zero in the MADs from the Topspin HCA on Windows. The
crashes seem to be random in umad_receiver.

HCA found:
    hca_id=InfiniHost0
    vendor_id=0x02C9
    vendor_part_id=0x5A44
    hw_ver=0xA0
    fw_ver=0x40006


Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: scaling issues, was: uDAPL cma: add support for address and route 
> retries, call disconnect when recving dreq
> 
> Or Gerlitz wrote:
> > Can be very nice if you share with the community the IB stack issues 
> > revealed under scale-out testing... basically what was the testbed?
> 
> We have a 256 node (512 processors) cluster that we can test with on the
> second Tuesday following the first Monday of any month with two full moons.
> We're only now getting some time on the cluster, and our test capabilities
> are limited.
> 
> The main issue that we saw was that the SA simply doesn't scale.


We had an option to increase the RQ size for QP1 and QP0.
This might help you too: try increasing IB_MAD_QP_RECV_SIZE.


-- 
MST




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Does SDP use this feature for events other than for connection requests?

Yes, it does.  But as I said, since SDP is out of tree, for now we
can just ignore the module unloading race.

-- 
MST




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> I use the callback method of destruction for new cm_id's in the ucm and ucma
> modules, so I want to keep this feature myself.  However, this method is
> unused, and likely unneeded, for events other than connection requests.  If
> this is the case, we can update the documentation, and remove this support
> except for new connections.
> 

I rethought the issue, and I don't think it's a good assumption to make.
Let's stick to the old API.

SDP, for example, uses the callback destruction capability for all IDs.  If
on the active side I get a reject, it is much nicer to get the id
cleaned up immediately since I have no reason to keep it around, and because I
want to put the socket back in the same state it was in before connect (that
is, without a connection id), so a new connect request will restart everything.
Otherwise it is quite awkward, I'm just wasting memory, and applications
actually *do* keep a huge number of inactive sockets around.

I expect we'll want something like this for IPoIB connected mode too -
keeping idle IDs and queueing work requests would be quite awkward I think.

Adding registration at module start/stop seems simple enough and overhead is
minimal. We already have this for other modules (e.g. ib_sa).  I don't really
understand why there is such resistance to this simple fix for unload race?

-- 
MST




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Sean Hefty
>Another case is a request and then a reject.

Yes - I considered reject, disconnect, and device removal as good candidates to
make use of this.

It's just that in these cases, the user has had the option of allocating
resources with the cm_id that it can use to queue for destruction.  With a new
cm_id, the user may not be able to allocate the necessary resources in order to
destroy it from another thread.

Does SDP use this feature for events other than for connection requests?

- Sean




[openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Sean Hefty
Or Gerlitz wrote:
> Can be very nice if you share with the community the IB stack issues 
> revealed under scale-out testing... basically what was the testbed?

We have a 256 node (512 processors) cluster that we can test with on the second
Tuesday following the first Monday of any month with two full moons.  We're
only now getting some time on the cluster, and our test capabilities are
limited.

The main issue that we saw was that the SA simply doesn't scale.

>  From what the patch does I understand you attempt to handle timeout on 
> address and route resolution and long disconnect delay.

correct

> Was the issue with address resolution being ARP request or reply 
> messages getting lost?

This appears to be the case.  During test startup, we try to form all to all
connections.  As we scaled, the number of address resolutions that timed out
also increased.  We suspect that this is a result of the ipoib broadcast
channel getting hit with 100,000+ requests.

> Was the issue with route resolution being timeout on SA Path queries?

Yes - but the issues are more complex than that.

The SA was able to respond to 4000-6000 queries per second.  With an all to all
connection model, it gets about 130,000 requests.  Assuming that none of these
are lost and a 4 second timeout, it will be able to respond to only a fraction
of the original requests in time.  The next 100,000+ requests that it responds
to have already timed out before it can send the response.

At 5000 queries per second, it will take the SA nearly 30 seconds to respond to
the first set of requests, most of which will have timed out.  By the time it
reached the end of the first 130,000 requests, it had hundreds of thousands of
queued retries, most of which had also already timed out.  (E.g. even with an
exponential backoff, you'd have retries at 4 seconds, 12 seconds, and 28
seconds before the SA can finish processing the first set of requests.)
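
The arithmetic behind these figures can be reproduced with a few lines of C. This is a back-of-the-envelope sketch, assuming all requests arrive at once, none are lost, and the SA answers at a constant rate; the function names are hypothetical. At exactly 5000 queries per second the drain time works out to 26 seconds, in line with the "nearly 30 seconds" above.

```c
#include <assert.h>

/* Seconds needed to answer `requests` queries at `rate` queries per
 * second, rounded up to a whole second. */
unsigned drain_seconds(unsigned requests, unsigned rate)
{
    assert(rate > 0);
    return (requests + rate - 1) / rate;
}

/* How many requests get an answer before their first timeout expires. */
unsigned answered_in_time(unsigned requests, unsigned rate, unsigned timeout)
{
    unsigned capacity;

    assert(rate > 0);
    capacity = rate * timeout;
    return requests < capacity ? requests : capacity;
}

/* Time, in seconds after the original send, at which retry number `n`
 * (1-based) goes out when each wait is double the previous one. */
unsigned retry_time(unsigned first_timeout, unsigned n)
{
    unsigned t = 0, wait = first_timeout, i;

    for (i = 0; i < n; i++) {
        t += wait;
        wait *= 2;
    }
    return t;
}
```

For example, answered_in_time(130000, 5000, 4) gives 20000, so under these assumptions roughly 110,000 of the original requests are answered only after their first timeout, and retry_time(4, 1..3) gives the 4, 12 and 28 second retry points quoted above.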

To further complicate the issue, retried requests are given new transaction IDs
by the ib_sa module, which makes it impossible for the SA to distinguish
retries from original requests.  It sees all requests as new.  On our largest
run, we were never able to complete route resolution.

We're still exploring possibilities in this area.

> Was the issue with disconnect delay that peer A called 
> dat_ep_disconnect() (ie sending DREQ) and the DREP was sent only when 
> peer B got the disconnect event and called dat_ep_disconnect()? so now 
> the DREP is sent from within the provider code when it gets the DREQ?

The disconnect delay occurred because of remote nodes being slow to respond to 
disconnect requests.  We're still investigating this issue.

- Sean




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] for 2-6-19 rdma/addr: use client registration to fix 
> module unload race
> 
> > All active side users are fine I think.  But any client on the passive side
> > currently might destroy the new ID by returning error from the callback,
> > and I like this interface since it frees the resources immediately.
> 
> As long as only *newly* created (i.e. associated with a connection request) 
> cm_id's are destroyed this way, we're fine.  Newly created cm_id's are 
> associated with a listening cm_id.  Destruction of the listening cm_id is 
> blocked while a callback for a connection request is in progress.
> 
> > Since all such passive side users currently are out of tree, I don't think
> > it's urgent for us to do anything about the passive side race - but please
> > do not at least break code that uses passive side in major ways just yet.
> 
> I use the callback method of destruction for new cm_id's in the ucm and ucma
> modules, so I want to keep this feature myself.  However, this method is
> unused, and likely unneeded, for events other than connection requests.  If
> this is the case, we can update the documentation, and remove this support
> except for new connections.

Another case is a request and then a reject.

> I looked at the existing users and didn't find any module unload races with 
> either the ib_cm or rdma_cm, so I don't think that any immediate fixes are 
> necessary.
> 
> - Sean
> 

-- 
MST




Re: [openib-general] [PATCH] librdmacm: updated librdmacm to work with proposed 2.6.20 kernel CMA

2006-11-02 Thread Sean Hefty
> Have you looked on that? from the compilation failure against 
> libibverbs-1.0 the gap seem pretty small. If indeed this is the case, 
> since libibverbs-1.1 is in development lets check with Roland if it 
> makes sense for him to support these small-gap-features in 
> libibverbs-1.0.X, i guess what matters here is ABI versions...

I have not had time to look into this yet.

> I think we do want it. The rdma cm provide the means to offload ip 
> multicast to ib multicast though registration (join/leave etc) with the 
> ib_sa module. IP Multicast does use the send-only feature and hence IP 
> Multicast offloading apps need it as well. The rdma cm framework fits 
> very well for such apps and the ib_usa (which does not exist now, and i 
> am not sure needs to exist... it was a project of a summer student with 
> open-mpi that required that...) not.

Are you wanting the rdma cm to join the same multicast groups that ipoib does? 
(This is simple to change, but it does not join the same groups today.)

I will likely need to spin these patches again to incorporate the changes for 
path failover, so adding in join options wouldn't be difficult.  Are you just 
wanting to see them added to rdma_join_multicast directly?

- Sean




Re: [openib-general] Mellanox SRP target implementation

2006-11-02 Thread Vu Pham

>>*srp target* is still on gen1 code base - IBGD
>>
>>*nfs-rdma server* is on gen2 code base
> 
> 
> Any chance the MTD2000 runs openfiler?
> 


We have never installed openfiler. You can try

-vu





Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-11-02 Thread Doug Ledford
On Thu, 2006-10-19 at 17:02 +0200, Or Gerlitz wrote:
> Doug Ledford wrote:
> > ... and reviewing arpingib
> > (which I'm going to remove from the ipoibtools and fix the native arping
> > in RHEL5 to work properly over IB without needing a new flag, the -A or
> > -U flags should be sufficient assuming those modes worked at all over IB
> > which they don't in either the native arping or the patched arpingib in
> > ipoibtools).  I should get to it today though.
> 
> Would you mind sending the patch to arping for review?

OK, this patch to arping actually makes it work for me in all modes
(duplicate address detection, arp response, and unsolicited arp
response).  You shouldn't need any new flags to arping with this patch;
you should be able to just use the existing modes of operation as they
were intended to make the ipoibha.pl script work.  There are still some
debugging printfs in the patch, so don't consider this a final version.

How does it work?  The getsockname() function will return the full hw
address if you give it a buffer large enough to do so.  So, instead of
allocating a single struct sockaddr_ll each for me and he, which caps the
address size at 8 bytes, allocate two and let the extra 12 bytes of the
20-byte IPoIB address run over into the second struct element.  Adjust
the sendto and recvfrom calls to accommodate this intentional size
overrun.  Finally, don't assume the broadcast address is all 1's; use
sysfs to get the actual device broadcast address and convert it from text
to binary (which will accommodate any possible future interface types
that similarly don't have all 1's for broadcast address without requiring
any recoding).  That's all I had to do in order to get it to work for me.
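
To make the text-to-binary step concrete, here is a minimal sketch of converting a colon-separated hardware address, such as the contents of /sys/class/net/<ifname>/broadcast, into raw bytes.  This is not the code from the patch: the function name parse_hwaddr and its error convention are assumptions made for illustration only.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not taken from the arping patch): convert a
 * colon-separated hex string such as "ff:ff:ff:ff:ff:ff" into bytes.
 * Parsing stops at the end of the string or at a newline.
 * Returns the number of bytes written, or -1 if the text is malformed
 * or does not fit into 'max' bytes.
 */
static int parse_hwaddr(const char *text, unsigned char *out, size_t max)
{
	size_t n = 0;
	const char *p = text;

	while (*p && *p != '\n') {
		char *end;
		unsigned long byte;

		if (n == max || !isxdigit((unsigned char)*p))
			return -1;
		byte = strtoul(p, &end, 16);
		/* each field must be one or two hex digits */
		if (end - p > 2 || byte > 0xff)
			return -1;
		out[n++] = (unsigned char)byte;
		p = end;
		if (*p == ':') {
			p++;
			/* a separator must be followed by another field */
			if (!isxdigit((unsigned char)*p))
				return -1;
		}
	}
	return (int)n;
}
```

Because the length comes from the text itself, the same routine handles a 6-byte Ethernet broadcast address and a 20-byte IPoIB one, provided the caller passes a buffer of at least 20 bytes.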

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband
--- arping.c.infiniband	2006-10-18 13:59:13.0 -0400
+++ arping.c	2006-11-02 12:11:15.0 -0500
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "SNAPSHOT.h"
 
@@ -48,8 +49,13 @@ int unicasting;
 int s;
 int broadcast_only;
 
-struct sockaddr_ll me;
-struct sockaddr_ll he;
+/*
+ * Make these two structs have padding at the end so the overly long Infiniband
+ * hardware addresses can have the remainder of their address tacked onto
+ * the end of the struct without overlapping anything.
+ */
+struct sockaddr_ll me[2];
+struct sockaddr_ll he[2];
 
 struct timeval start, last;
 
@@ -124,7 +130,8 @@ int send_pack(int s, struct in_addr src,
 	p+=4;
 
 	gettimeofday(&now, NULL);
-	err = sendto(s, buf, p-buf, 0, (struct sockaddr*)HE, sizeof(*HE));
+	err = sendto(s, buf, p-buf, 0, (struct sockaddr*)HE, (ah->ar_hln > 8) ?
+		 sizeof(*HE) + ah->ar_hln - 8 : sizeof(*HE));
 	if (err == p-buf) {
 		last = now;
 		sent++;
@@ -174,7 +181,7 @@ void catcher(void)
 
 	if (last.tv_sec==0 || MS_TDIFF(tv,last) > 500) {
 		count--;
-		send_pack(s, src, dst, &me, &he);
+		send_pack(s, src, dst, &me[0], &he[0]);
 		if (count == 0 && unsolicited)
 			finish();
 	}
@@ -221,7 +228,7 @@ int recv_pack(unsigned char *buf, int le
 		return 0;
 	if (ah->ar_pln != 4)
 		return 0;
-	if (ah->ar_hln != me.sll_halen)
+	if (ah->ar_hln != me[0].sll_halen)
 		return 0;
 	if (len < sizeof(*ah) + 2*(4 + ah->ar_hln))
 		return 0;
@@ -232,7 +239,7 @@ int recv_pack(unsigned char *buf, int le
 			return 0;
 		if (src.s_addr != dst_ip.s_addr)
 			return 0;
-		if (memcmp(p+ah->ar_hln+4, &me.sll_addr, ah->ar_hln))
+		if (memcmp(p+ah->ar_hln+4, &me[0].sll_addr, ah->ar_hln))
 			return 0;
 	} else {
 		/* DAD packet was:
@@ -250,7 +257,7 @@ int recv_pack(unsigned char *buf, int le
 		 */
 		if (src_ip.s_addr != dst.s_addr)
 			return 0;
-		if (memcmp(p, &me.sll_addr, me.sll_halen) == 0)
+		if (memcmp(p, &me[0].sll_addr, me[0].sll_halen) == 0)
 			return 0;
 		if (src.s_addr && src.s_addr != dst_ip.s_addr)
 			return 0;
@@ -266,7 +273,7 @@ int recv_pack(unsigned char *buf, int le
 			printf("for %s ", inet_ntoa(dst_ip));
 			s_printed = 1;
 		}
-		if (memcmp(p+ah->ar_hln+4, me.sll_addr, ah->ar_hln)) {
+		if (memcmp(p+ah->ar_hln+4, me[0].sll_addr, ah->ar_hln)) {
 			if (!s_printed)
 				printf("for ");
 			printf("[");
@@ -292,7 +299,7 @@ int recv_pack(unsigned char *buf, int le
 	if (quit_on_reply)
 		finish();
 	if(!broadcast_only) {
-		memcpy(he.sll_addr, p, me.sll_halen);
+		memcpy(he[0].sll_addr, p, me[0].sll_halen);
 		unicasting=1;
 	}
 	return 1;
@@ -458,9 +465,9 @@ main(int argc, char **argv)
 		close(probe_fd);
 	};
 
-	me.sll_family = AF_PACKET;
-	me.sll_ifindex = ifindex;
-	me.sll_protocol = htons(ETH_P_ARP);
+	me[0].sll_family = AF_PACKET;
+	me[0].sll_ifindex = ifindex;
+	me[0].sll_protocol = htons(ETH_P_ARP);
 	if (bind(s, (struct sockaddr*)&me, sizeof(me)) == -1) {
 		perror("bind");
 		exit(2);
@@ -473,14 +480,44 @@ main(int argc, char **argv)
 			exit(2);
 		}
 	}
-	if (me.sll_halen == 0) {
+	if (me[0].sll_halen == 0) {
 		if (!qui

Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Sean Hefty
> All active side users are fine I think.  But any client on the passive side
> currently might destroy the new ID by returning error from the callback, and I
> like this interface since it frees the resources immediately.

As long as only *newly* created (i.e. associated with a connection request) 
cm_id's are destroyed this way, we're fine.  Newly created cm_id's are 
associated with a listening cm_id.  Destruction of the listening cm_id is 
blocked while a callback for a connection request is in progress.
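
A toy userspace model of the "return an error from the callback to destroy the new ID" convention may help.  It is only a sketch: every name in it (toy_cm_id, toy_deliver_request and so on) is hypothetical, and it does not reproduce the ib_cm or rdma_cm code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a cm_id; not the kernel structure. */
struct toy_cm_id {
	struct toy_cm_id *listener;	/* listening id this one came from */
};

typedef int (*toy_handler)(struct toy_cm_id *new_id);

static int toy_live_ids;	/* number of new ids currently allocated */

/*
 * Model of delivering a connection request: a new id is created on
 * behalf of the listener and handed to the client's callback.  If the
 * callback returns non-zero, the framework frees the new id at once,
 * so the client never has to destroy it explicitly.
 */
static struct toy_cm_id *toy_deliver_request(struct toy_cm_id *listener,
					     toy_handler handler)
{
	struct toy_cm_id *id = calloc(1, sizeof(*id));

	if (!id)
		return NULL;
	id->listener = listener;
	toy_live_ids++;

	if (handler(id) != 0) {
		free(id);
		toy_live_ids--;
		return NULL;
	}
	return id;
}

/* Two example callbacks: one keeps the new id, one rejects it. */
static int toy_accept(struct toy_cm_id *id) { (void)id; return 0; }
static int toy_reject(struct toy_cm_id *id) { (void)id; return -1; }
```

In this model only the freshly created id can be destroyed through the return value; the listening id is untouched, which mirrors the distinction drawn above.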

> Since all such passive side users currently are out of tree, I don't think
> it's urgent for us to do anything about the passive side race - but please do
> not at least break code that uses passive side in major ways just yet.

I use the callback method of destruction for new cm_id's in the ucm and ucma 
modules, so I want to keep this feature myself.  However, this method is unused, 
and likely unneeded, for events other than connection requests.  If this is the 
case, we can update the documentation, and remove this support except for new 
connections.

I looked at the existing users and didn't find any module unload races with 
either the ib_cm or rdma_cm, so I don't think that any immediate fixes are 
necessary.

- Sean




Re: [openib-general] fixing sparse warnings ?

2006-11-02 Thread Bryan O'Sullivan
Ramachandra K wrote:
> I've been searching for a while but can't seem to find any pointers on
> how to fix sparse warnings (like cast to restricted type etc) or in general
> making code sparse check safe.

Add annotations to the data types that you're using, and make them 
consistent.  For example, if you have a function that takes a u16, and 
you pass in a __le16, you need to decide whether it's the function or 
the caller that needs fixing.  And you then need to propagate out those 
annotations until all of your sources of problems have gone away.

It's a very simple process.

For more information, do a Google search for "sparse site:lwn.net".
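
As a rough illustration of what such annotations look like, here is a userspace sketch.  The demo_ names are made up for this example; kernel code would use the existing __le16 type and the cpu_to_le16()/le16_to_cpu() helpers instead.  The bitwise and force attributes are only understood by sparse, which defines __CHECKER__ when it runs, so they compile away under a normal compiler.

```c
#include <stdint.h>
#include <string.h>

/* Annotations are visible to sparse only. */
#ifdef __CHECKER__
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

/* A 16-bit value stored in little-endian byte order. */
typedef uint16_t DEMO_BITWISE demo_le16;

/* Convert from CPU order to little-endian wire order. */
static demo_le16 demo_cpu_to_le16(uint16_t v)
{
	unsigned char b[2] = { (unsigned char)(v & 0xff),
			       (unsigned char)(v >> 8) };
	uint16_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (DEMO_FORCE demo_le16)raw;
}

/* Convert from little-endian wire order back to CPU order. */
static uint16_t demo_le16_to_cpu(demo_le16 v)
{
	uint16_t raw = (DEMO_FORCE uint16_t)v;
	unsigned char b[2];

	memcpy(b, &raw, sizeof(raw));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* This function is declared to take a CPU-order value ... */
static unsigned demo_payload_words(uint16_t len_bytes)
{
	return (len_bytes + 3u) / 4u;
}

/*
 * ... so a caller holding a demo_le16 must convert first:
 *
 *	demo_payload_words(demo_le16_to_cpu(wire_len));
 *
 * Passing wire_len directly is the kind of mismatch sparse reports;
 * the fix is either the conversion above or changing the parameter type.
 */
```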




Re: [openib-general] Mellanox SRP target implementation

2006-11-02 Thread Cain, Brian (GE Healthcare)
> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Vu Pham
> Sent: Thursday, November 02, 2006 2:29 AM
> To: Tomoaki Sato
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] Mellanox SRP target implementation
> 
> Tomoaki,
> 
> > 
> > Can anybody tell me about the Mellanox "SRP target" implementation
> > code which is included in MTD2000 with NFS-RDMA server?
> > Is this gen2 based?
> > 
> 
> *srp target* is still on gen1 code base - IBGD
> 
> *nfs-rdma server* is on gen2 code base

Any chance the MTD2000 runs openfiler?

-Brian




Re: [openib-general] question on QoS support

2006-11-02 Thread Hal Rosenstock
Hi Oliver,

On Thu, 2006-11-02 at 10:20, Oliver wrote:
> Hi, Hal -
> 
> > How is this being observed/measured ?
> 
> Host A, B, with 4x DDR both connected to Flextronic switch.
> A single process of ibv_read_bw gives about 1415 MB/s average
> bandwidth. Two concurrent processes report 714.45 MB/s each, dead even.
> Now if I bump up one process with a different SL, then I expect to see
> shaping to take place. Please let me know if the scenario makes sense.

It makes sense. However, if the higher priority traffic does not use up
its share of the schedule, the low priority traffic can take up the slack,
so I'm not sure if this is what you are seeing or something else.
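
To illustrate the "take up the slack" point, here is a deliberately simplified model of weighted arbitration.  It is a sketch under stated assumptions, not the arbitration algorithm from the IB specification: weights and backlog are in arbitrary units, the low priority table is only visited when a full pass over the high priority table sent nothing, and all demo_ names are hypothetical.

```c
#include <stddef.h>

#define DEMO_NUM_VLS 8

/* One arbitration table entry: a VL and how much it may send per visit. */
struct demo_vlarb_entry {
	int vl;
	long weight;
};

/* Walk one table once; returns the number of units sent in this pass. */
static long demo_serve_table(const struct demo_vlarb_entry *tbl, size_t n,
			     long backlog[DEMO_NUM_VLS],
			     long served[DEMO_NUM_VLS], long *budget)
{
	long sent = 0;
	size_t i;

	for (i = 0; i < n && *budget > 0; i++) {
		long take = tbl[i].weight;

		if (take > backlog[tbl[i].vl])
			take = backlog[tbl[i].vl];	/* VL has little queued */
		if (take > *budget)
			take = *budget;			/* link time is used up */
		backlog[tbl[i].vl] -= take;
		served[tbl[i].vl] += take;
		*budget -= take;
		sent += take;
	}
	return sent;
}

/*
 * Spend 'budget' units of link time.  The low table gets a turn only
 * when the high table had nothing to send in a pass.
 */
static void demo_arbitrate(const struct demo_vlarb_entry *high, size_t nhigh,
			   const struct demo_vlarb_entry *low, size_t nlow,
			   long backlog[DEMO_NUM_VLS],
			   long served[DEMO_NUM_VLS], long budget)
{
	while (budget > 0) {
		long sent = demo_serve_table(high, nhigh, backlog, served,
					     &budget);

		if (sent == 0)
			sent = demo_serve_table(low, nlow, backlog, served,
						&budget);
		if (sent == 0)
			break;	/* nothing queued anywhere */
	}
}
```

With the high table from the configuration in this thread (0:4, 1:4, 2:8) and only VL 0 and VL 2 backlogged, this model hands out link time in a 4:8 ratio.  If the VL 2 flow cannot fill its share, VL 0 simply gets the remainder, which would look like "no shaping" in a bandwidth test.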

It might be interesting to try the same thing at SDR speeds.

-- Hal

> > Yes, 8 VLs should be supported in your subnet. You can verify this with
> > smpquery portinfo on the HCA port and examine OperVLs assuming the port
> > is ACTIVE.
> 
> yes, I verified the data VL support, it is 8. I will poke for more
> info with the commands suggested by Sasha.
> 
> > > A related question is, if I modify qos setting in SM, do I need to
> > > restart SA on each hosts for it to see the changes? (I am hoping not,
> > > as I tried in the test, it doesn't seem to make a difference)
> >
> > Not sure what you mean. SA is tightly coupled with the OpenSM. Do you
> > mean SA client ? The client hosts don't need restarting but did you
> > restart OpenSM with your QoS configuration ?
> 
> I mean client SA. yes, I understand OpenSM needs to be restarted.
> 
> > BTW, which OpenSM are you running ?
> 
> OFED 1.1 based.
> 
> thanks
> 
> - Oliver





[openib-general] fixing sparse warnings ?

2006-11-02 Thread Ramachandra K
I've been searching for a while but can't seem to find any pointers on
how to fix sparse warnings (like cast to restricted type etc) or in general
making code sparse check safe.

I would appreciate it if someone could point me in the right direction.

Regards,
Ram




Re: [openib-general] [PATCH] opensm: strict osm_log arguments/format check

2006-11-02 Thread Sasha Khapyorsky
On 15:38 Thu 02 Nov , Yevgeny Kliteynik wrote:
> Hi Sasha.
> 
> Good catch with those missing arguments. 

It was the compiler...
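
For readers who have not used it, the gcc format attribute works roughly as in the sketch below.  demo_log() is a hypothetical stand-in that formats into a caller-supplied buffer; it is not osm_log(), but it has the format string as its third argument and the variable arguments starting at the fourth, which is what the (printf, 3, 4) pair describes.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNUC__
#define STRICT_DEMO_LOG_FORMAT __attribute__((format(printf, 3, 4)))
#else
#define STRICT_DEMO_LOG_FORMAT
#endif

/*
 * Hypothetical logger: argument 3 is the format string and the
 * arguments to check against it start at position 4.
 */
int demo_log(char *buf, size_t len, const char *fmt, ...)
	STRICT_DEMO_LOG_FORMAT;

int demo_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/*
 * With the attribute in place, gcc -Wall flags calls such as
 *
 *	demo_log(buf, len, "Unable to open file (%s)\n");
 *
 * because the %s has no matching argument.  Without the attribute the
 * call compiles silently and reads garbage at run time.
 */
```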

> One question: in several places you used cl_hton64() to print guid.
> Shouldn't there be cl_ntoh64() instead?

Right, it is a mistake. Thanks for catching it. Will resend.

Sasha

> 
> ...and yes, I know that these two functions are actually the same macro :)
> 
> Thanks
> 
> -- Yevgeny
> 
> 
> Sasha Khapyorsky wrote:
> > This adds gcc attribute to osm_log() which causes the compiler to check
> > argument types against a format string. And also there are related fixes
> > in osm_log() usage in opensm and osmtest.
> > 
> > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> > ---
> >  osm/include/opensm/osm_log.h |8 +++-
> >  osm/libvendor/osm_vendor_ibumad_sa.c |2 +-
> >  osm/opensm/main.c|3 ++-
> >  osm/opensm/osm_pkey_mgr.c|1 +
> >  osm/opensm/osm_port_info_rcv.c   |5 +++--
> >  osm/opensm/osm_sa_informinfo.c   |4 ++--
> >  osm/opensm/osm_sa_link_record.c  |8 
> >  osm/opensm/osm_sa_mad_ctrl.c |3 ++-
> >  osm/opensm/osm_sa_response.c |2 +-
> >  osm/opensm/osm_sm_state_mgr.c|3 ++-
> >  osm/opensm/osm_sminfo_rcv.c  |9 +
> >  osm/opensm/osm_state_mgr.c   |8 
> >  osm/osmtest/osmt_multicast.c |   12 +++-
> >  osm/osmtest/osmt_service.c   |6 +++---
> >  osm/osmtest/osmtest.c|8 
> >  15 files changed, 48 insertions(+), 34 deletions(-)
> > 
> > diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
> > index 62f3a0c..2b24886 100644
> > --- a/osm/include/opensm/osm_log.h
> > +++ b/osm/include/opensm/osm_log.h
> > @@ -60,6 +60,12 @@
> >  #include 
> >  #include 
> >  
> > +#ifdef __GNUC__
> > +#define STRICT_OSM_LOG_FORMAT __attribute__((format(printf, 3, 4)))
> > +#else
> > +#define STRICT_OSM_LOG_FORMAT
> > +#endif
> > +
> >  #ifdef __cplusplus
> >  #  define BEGIN_C_DECLS extern "C" {
> >  #  define END_C_DECLS   }
> > @@ -374,7 +380,7 @@ void
> >  osm_log(
> > IN osm_log_t* const p_log,
> > IN const osm_log_level_t verbosity,
> > -   IN const char *p_str, ... );
> > +   IN const char *p_str, ... ) STRICT_OSM_LOG_FORMAT;
> >  
> >  void
> >  osm_log_raw(
> > diff --git a/osm/libvendor/osm_vendor_ibumad_sa.c b/osm/libvendor/osm_vendor_ibumad_sa.c
> > index 7fd0655..7c4a2f7 100644
> > --- a/osm/libvendor/osm_vendor_ibumad_sa.c
> > +++ b/osm/libvendor/osm_vendor_ibumad_sa.c
> > @@ -853,7 +853,7 @@ osmv_query_sa(
> >  if ( p_mpr_req->sgid_count + p_mpr_req->dgid_count > IB_MULTIPATH_MAX_GIDS )
> >  {
> >osm_log( p_log, OSM_LOG_ERROR,
> > -   "osmv_query_sa DBG:001 MULTIPATH_REC ",
> > +   "osmv_query_sa DBG:001 MULTIPATH_REC "
> > "SGID count %d DGID count %d max count %d\n",
> >  p_mpr_req->sgid_count, p_mpr_req->dgid_count,
> >  IB_MULTIPATH_MAX_GIDS );
> > diff --git a/osm/opensm/main.c b/osm/opensm/main.c
> > index 729702a..752b546 100644
> > --- a/osm/opensm/main.c
> > +++ b/osm/opensm/main.c
> > @@ -460,7 +460,8 @@ parse_ignore_guids_file(IN char *guids_f
> >{
> >  osm_log( &p_osm->log, OSM_LOG_ERROR,
> >   "parse_ignore_guids_file: ERR 0601: "
> > - "Unable to open ignore guids file (%s)\n" );
> > + "Unable to open ignore guids file (%s)\n",
> > + guids_file_name );
> >  status = IB_ERROR;
> >  goto Exit;
> >}
> > diff --git a/osm/opensm/osm_pkey_mgr.c b/osm/opensm/osm_pkey_mgr.c
> > index f2cb221..735dc14 100644
> > --- a/osm/opensm/osm_pkey_mgr.c
> > +++ b/osm/opensm/osm_pkey_mgr.c
> > @@ -139,6 +139,7 @@ pkey_mgr_process_physical_port(
> >"pkey_mgr_process_physical_port: ERR 0503: "
> >"Failed to obtain P_Key 0x%04x block and index for node "
> >"0x%016" PRIx64 " port %u\n",
> > +  ib_pkey_get_base( pkey ),
> >cl_ntoh64( osm_node_get_node_guid( p_node ) ),
> >osm_physp_get_port_num( p_physp ) );
> >return;
> > diff --git a/osm/opensm/osm_port_info_rcv.c b/osm/opensm/osm_port_info_rcv.c
> > index 95112dc..f6d3595 100644
> > --- a/osm/opensm/osm_port_info_rcv.c
> > +++ b/osm/opensm/osm_port_info_rcv.c
> > @@ -724,8 +724,9 @@ osm_pi_rcv_process(
> >{
> >  osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
> >   "osm_pi_rcv_process: "
> > - "Got light sweep response from remote port of parent node GUID = 0x%" PRIx64
> > - " port = %u, Commencing heavy sweep\n",
> > + "Got light sweep response from remote port of parent node "
> > + "GUID = 0x%" PRIx64 " port = 0x%016" PRIx64
> > + ", Commencing heavy sweep\n",
> >   cl_ntoh64( node_guid ),
> >   cl_ntoh64( port_guid ) );
> >  osm_state_mgr_process( 

Re: [openib-general] [PATCH] librdmacm: updated librdmacm to work with proposed 2.6.20 kernel CMA

2006-11-02 Thread Or Gerlitz
Sean Hefty wrote:
>> 1) librdmacm does not get built against libibverbs-1.0 (see below) so 
>> i am using libibverbs (ie the non released yet libibverbs1.1)

> I need to think about what we can do here.  The librdmacm uses 
> functionality not found in libibverbs-1.0.

Have you looked into that? From the compilation failure against 
libibverbs-1.0, the gap seems pretty small. If this is indeed the case, 
then since libibverbs-1.1 is still in development, let's check with Roland 
whether it makes sense for him to support these small-gap features in 
libibverbs-1.0.X. I guess what matters here is the ABI versions...

If that is not possible, maybe we can somehow instrument the code of 
librdmacm to work with libibverbs-1.0.Y.

If this is not possible either, I guess the way to use librdmacm for 
the time being is against a devel drop of libibverbs-1.1, as I am doing now.

>> 2) the cma rdma multicast does not let a consumer join as send-only

> This would require some sort of change to the API and ABI, so if this is 
> needed, I'd like to incorporate this now.  (Adding it could be done by 
> specifying join parameters.)  Do we need/want this level of control in 
> the librdmacm, or should users go to a direct IB interface for this?

I think we do want it. The rdma cm provides the means to offload IP 
multicast to IB multicast through registration (join/leave etc) with the 
ib_sa module. IP multicast does use the send-only feature, and hence IP 
multicast offloading apps need it as well. The rdma cm framework fits 
such apps very well, and the ib_usa (which does not exist now, and I 
am not sure needs to exist... it was a project of a summer student with 
open-mpi that required it...) does not.

Currently, librdmacm does not have the means to distinguish between 
sender and receiver, so it joins the sender as a full member and attaches 
its QP to the group MGID. First, this hurts performance; second, it 
might cause the sender's CQ to receive the posts as well (I am not sure 
here), which can make it go crazy...

Or.





Re: [openib-general] question on QoS support

2006-11-02 Thread Oliver
Hi, Hal -

> How is this being observed/measured ?

Host A, B, with 4x DDR both connected to Flextronic switch.
A single process of ibv_read_bw gives about 1415 MB/s average
bandwidth. Two concurrent processes report 714.45 MB/s each, dead even.
Now if I bump up one process with a different SL, then I expect to see
shaping to take place. Please let me know if the scenario makes sense.



> Yes, 8 VLs should be supported in your subnet. You can verify this with
> smpquery portinfo on the HCA port and examine OperVLs assuming the port
> is ACTIVE.

yes, I verified the data VL support, it is 8. I will poke for more
info with the commands suggested by Sasha.

> > A related question is, if I modify qos setting in SM, do I need to
> > restart SA on each hosts for it to see the changes? (I am hoping not,
> > as I tried in the test, it doesn't seem to make a difference)
>
> Not sure what you mean. SA is tightly coupled with the OpenSM. Do you
> mean SA client ? The client hosts don't need restarting but did you
> restart OpenSM with your QoS configuration ?

I mean client SA. yes, I understand OpenSM needs to be restarted.

> BTW, which OpenSM are you running ?

OFED 1.1 based.

thanks

- Oliver




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Or Gerlitz
Roland Dreier wrote:
> Unfortunately I don't think this solves the module unloading race at
> all: there is still a window where code in the client module callback
> is running, but the callback has dropped all references etc. so the
> client module will happily proceed to unload.
> 
>  > Bottom line, users must call xxx_destroy_id() explicitly, and the
>  > xxx module would be able to handle in_callback situations.
> 
> I think this is actually a good point for the CM case at least.
> Clients already have something registered with the CM (namely the CM
> ID itself), so if we required all consumers to destroy their IDs
> explicitly, then there's no reason to add additional client
> registration.

I agree. This applies also to the rdma cm. I think that, as others 
pointed out, the case of new IDs generated by the cm / rdma cm for incoming 
connection requests might be an exception, but let's first decide that this 
is the only case we need to solve and, when the time comes, discuss how to 
do that.

As for client registration with the ib_mad, ib_sa and ib_addr modules, I 
understand the first two were already implemented... and now Sean wants 
to add it for the ib_addr module as well.

Now, this module does not have IDs, so we can either add them or 
implement the registration... let it be whatever Sean prefers, I just 
think we should not take it to the cm and rdma cm level.

Or.





Re: [openib-general] question on QoS support

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 09:15, Makia Minich wrote:
> Hal Rosenstock wrote:
> > Makia,
> > 
> > On Wed, 2006-11-01 at 17:42, Makia Minich wrote:
> >> It just so happens that we've started looking at this here at ORNL as
> >> well.  I had a question about the options.  The manpage makes it seem
> >> that you can set these qos options (e.g. qos_high_limit) from the
> >> command line, but I haven't been overly successful.
> > 
> > What are you referring to in the man page ?
> 
> OK, re-reading the man page section on qos, I now realize that I didn't
> understand the statement "cached options file" on my initial read
> through.  So, now I've got it.
> 
> > Which OpenSM are you using (trunk or 1.1 based) ?
> 
> 1.1 based
> 
> >> Is there an example of this being done?
> > 
> > Yes in both the man page under QOS CONFIGURATION or under
> > osm/doc/qos-config.txt in the repository.
> 
> I see that that file doesn't install in the doc directory with OFED,
> perhaps that should be added (so that I can find it in the ${OFED}/doc
> directory).

I used that doc and put it pretty much verbatim into the man page so IMO
this is somewhat redundant but it could be added to the next release if
you think this adds value (having the separate docs).

-- Hal

> >>   Or is changing the /var/cache/osm/opensm.opts file
> >> the preferred method of changing the options?
> > 
> > I think it's the only way but it is imperative QoS is enabled for this
> > to have any effect.
> > 
> > -- Hal
> 
> That part I've got set in the opensm.opts file:
> 
> no_qos FALSE
> 
> >> Sasha Khapyorsky wrote:
> >>> On 16:52 Wed 01 Nov , Oliver wrote:
> >>>> Hi, folks -
> >>>>
> >>>> I am trying to verify and evaluate IB QoS support, running openSM as
> >>>> subnet manager. The perftest program is extended to set SL as command
> >>>> line options instead of default 0, and by modifying VL arbitration
> >>>> tables, I am expecting to see the traffic shaping can actually take
> >>>> place, but it did not.  More details on configuration:
> >>>>
> >>>> in opensm.opts:
> >>>> # QoS default options
> >>>> qos_high_limit 255 # disable low priority table
> >>>> qos_vlarb_high: 0:4,1:4,2:8,3:0, 4:0  # this is to give VL 2
> >>>> (corresponding to SL 2) a higher weight 8
> >>>> qos_sl2vl 0,1,2,3,4, ... # no changes here
> >>>>
> >>>> I think (though not verified) the Voltaire HCA we are using can
> >>>> support 8 data VLs. I don't have much more information to go on why
> >>>> qos shaping is not taking place, any suggestions?
> >>> You can verify actual port's parameters with smpquery (from diags), you
> >>> will need to run to get QoS related parameters:
> >>>
> >>>   smpquery portinfo ...
> >>>   smpquery vlarb ...
> >>>   smpquery sl2vl ...
> >>>
> >>> Sasha
> >>>
> >>>> A related question is, if I modify qos setting in SM, do I need to
> >>>> restart SA on each hosts for it to see the changes? (I am hoping not,
> >>>> as I tried in the test, it doesn't seem to make a difference)
> >>>>
> >>>> Thanks for help.
> >>>> -- 
> >>>> Oliver
> 
> > 
> > 





Re: [openib-general] question on QoS support

2006-11-02 Thread Makia Minich
Hal Rosenstock wrote:
> Makia,
> 
> On Wed, 2006-11-01 at 17:42, Makia Minich wrote:
>> It just so happens that we've started looking at this here at ORNL as
>> well.  I had a question about the options.  The manpage makes it seem
>> that you can set these qos options (e.g. qos_high_limit) from the
>> command line, but I haven't been overly successful.
> 
> What are you referring to in the man page ?

OK, re-reading the man page section on qos, I now realize that I didn't
understand the statement "cached options file" on my initial read
through.  So, now I've got it.

> Which OpenSM are you using (trunk or 1.1 based) ?

1.1 based

>> Is there an example of this being done?
> 
> Yes in both the man page under QOS CONFIGURATION or under
> osm/doc/qos-config.txt in the repository.

I see that that file doesn't install in the doc directory with OFED,
perhaps that should be added (so that I can find it in the ${OFED}/doc
directory).

>>   Or is changing the /var/cache/osm/opensm.opts file
>> the preferred method of changing the options?
> 
> I think it's the only way but it is imperative QoS is enabled for this
> to have any effect.
> 
> -- Hal

That part I've got set in the opensm.opts file:

no_qos FALSE

>> Sasha Khapyorsky wrote:
>>> On 16:52 Wed 01 Nov , Oliver wrote:
>>>> Hi, folks -
>>>>
>>>> I am trying to verify and evaluate IB QoS support, running openSM as
>>>> subnet manager. The perftest program is extended to set SL as command
>>>> line options instead of default 0, and by modifying VL arbitration
>>>> tables, I am expecting to see the traffic shaping can actually take
>>>> place, but it did not.  More details on configuration:
>>>>
>>>> in opensm.opts:
>>>> # QoS default options
>>>> qos_high_limit 255 # disable low priority table
>>>> qos_vlarb_high: 0:4,1:4,2:8,3:0, 4:0  # this is to give VL 2
>>>> (corresponding to SL 2) a higher weight 8
>>>> qos_sl2vl 0,1,2,3,4, ... # no changes here
>>>>
>>>> I think (though not verified) the Voltaire HCA we are using can
>>>> support 8 data VLs. I don't have much more information to go on why
>>>> qos shaping is not taking place, any suggestions?
>>> You can verify actual port's parameters with smpquery (from diags), you
>>> will need to run to get QoS related parameters:
>>>
>>>   smpquery portinfo ...
>>>   smpquery vlarb ...
>>>   smpquery sl2vl ...
>>>
>>> Sasha
>>>
>>>> A related question is, if I modify qos setting in SM, do I need to
>>>> restart SA on each hosts for it to see the changes? (I am hoping not,
>>>> as I tried in the test, it doesn't seem to make a difference)
>>>>
>>>> Thanks for help.
>>>> -- 
>>>> Oliver

> 
> 

-- 
Makia Minich <[EMAIL PROTECTED]>
National Center for Computation Science
Oak Ridge National Laboratory
Phone: 865.574.7460




Re: [openib-general] question on QoS support

2006-11-02 Thread Hal Rosenstock
Makia,

On Wed, 2006-11-01 at 17:42, Makia Minich wrote:
> It just so happens that we've started looking at this here at ORNL as
> well.  I had a question about the options.  The manpage makes it seem
> that you can set these qos options (e.g. qos_high_limit) from the
> command line, but I haven't been overly successful.

What are you referring to in the man page ?

Which OpenSM are you using (trunk or 1.1 based) ?

> Is there an example of this being done?

Yes, both in the man page under QOS CONFIGURATION and in
osm/doc/qos-config.txt in the repository.

>   Or is changing the /var/cache/osm/opensm.opts file
> the preferred method of changing the options?

I think it's the only way but it is imperative QoS is enabled for this
to have any effect.

-- Hal

> Sasha Khapyorsky wrote:
> > On 16:52 Wed 01 Nov , Oliver wrote:
> >> Hi, folks -
> >>
> >> I am trying to verify and evaluate IB QoS support, running openSM as
> >> subnet manager. The perftest program is extended to set SL as command
> >> line options instead of default 0, and by modifying VL arbitration
> >> tables, I am expecting to see the traffic shaping can actually take
> >> place, but it did not.  More details on configuration:
> >>
> >> in opensm.opts:
> >> # QoS default options
> >> qos_high_limit 255 # disable low priority table
> >> qos_vlarb_high: 0:4,1:4,2:8,3:0, 4:0  # this is to give VL 2
> >> (corresponding to SL 2) a higher weight 8
> >> qos_sl2vl 0,1,2,3,4, ... # no changes here
> >>
> >> I think (though not verified) the Voltaire HCA we are using can
> >> support 8 data VLs. I don't have much more information to go on why
> >> qos shaping is not taking place, any suggestions?
> > 
> > You can verify actual port's parameters with smpquery (from diags), you
> > will need to run to get QoS related parameters:
> > 
> >   smpquery portinfo ...
> >   smpquery vlarb ...
> >   smpquery sl2vl ...
> > 
> > Sasha
> > 
> >> A related question is, if I modify qos setting in SM, do I need to
> >> restart SA on each hosts for it to see the changes? (I am hoping not,
> >> as I tried in the test, it doesn't seem to make a difference)
> >>
> >> Thanks for help.
> >> -- 
> >> Oliver
> >>





Re: [openib-general] question on QoS support

2006-11-02 Thread Hal Rosenstock
Hi Oliver,

On Wed, 2006-11-01 at 16:52, Oliver wrote:
> Hi, folks -
> 
> I am trying to verify and evaluate IB QoS support, running openSM as
> subnet manager. The perftest program is extended to set SL as command
> line options instead of default 0, and by modifying VL arbitration
> tables, I am expecting to see the traffic shaping can actually take
> place,

How is this being observed/measured ?

>  but it did not.  More details on configuration:
> 
> in opensm.opts:
> # QoS default options
> qos_high_limit 255 # disable low priority table

This doesn't disable it but it won't be scheduled unless there are no
high priority packets to send.

> qos_vlarb_high: 0:4,1:4,2:8,3:0, 4:0  # this is to give VL 2
> (corresponding to SL 2) a higher weight 8
> qos_sl2vl 0,1,2,3,4, ... # no changes here
> 
> I think (though not verified) the Voltaire HCA we are using can
> support 8 data VLs.

Yes, 8 VLs should be supported in your subnet. You can verify this with
smpquery portinfo on the HCA port and examine OperVLs assuming the port
is ACTIVE.

>  I don't have much more information to go on why
> qos shaping is not taking place, any suggestions?

Sasha's email is a good start. We can go from there.

> A related question is, if I modify qos setting in SM, do I need to
> restart SA on each hosts for it to see the changes? (I am hoping not,
> as I tried in the test, it doesn't seem to make a difference)

Not sure what you mean. SA is tightly coupled with the OpenSM. Do you
mean SA client ? The client hosts don't need restarting but did you
restart OpenSM with your QoS configuration ?

BTW, which OpenSM are you running ?

-- Hal

> Thanks for help.





Re: [openib-general] [PATCH] librdmacm: updated librdmacm to work with proposed 2.6.20 kernel CMA

2006-11-02 Thread Steve Wise
On Wed, 2006-11-01 at 15:29 -0800, Sean Hefty wrote:
> > This patch removes rdma_get/set_option().  Is that what you intended?
> 
> Yes.  I wanted to reconsider the approach here.
> 
> I believe that there's a cleaner implementation for getting path records that 
> involves a userspace SA library/daemon than going through the rdma cm.  And 
> no 
> one was using the option to set a specific path.
> 
> For the CM timeout options, those were added to support uDAPL, but I believe 
> that a better approach which would accomplish the higher level goal is to 
> have 
> the kernel rdma cm issue MRA (message received acknowledged) messages for 
> clients which are slow to respond to requests.
> 

Ok thanks.  For my testing, I'll remove this code from uDAPL...

Steve.







Re: [openib-general] Static linking with libibverbs

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> Subject: Re: Static linking with libibverbs
> 
> On Nov 2, 2006, at 8:13 AM, Michael S. Tsirkin wrote:
> 
> > Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> >> Yes.  See the FAQ items on the OMPI web site from my first mail.
> >
> > OK, I see.
> > So what it boils down to, is linking with
> > -Wl,--whole-archive -libverbs /mthca.a -Wl,--no-whole-archive
> > Is that right?
> 
> There's a few other details, but this is the Main Point, yes.
> 
> > But -u openib_driver_init will work as well, won't it?
> 
> I'm not entirely sure -- it might (I didn't try it).  It *should*  
> force creation of a valid code path into mthca.a and therefore use it  
> for all the resolution that is required (i.e., link in all the parts  
> of mthca.a that are actually required).

Since it worked for linking with static mthca.a and libibverbs.a,
I think it will link with -static as well.

> What I'm not sure about is whether the symbols that mthca needs from  
> libibverbs will be linked in properly (since the linker order is left  
> to right "-libverbs /mthca.a").  I *think* they'll be available  
> from when mthca.a was originally created (i.e., libibverbs.a was  
> statically linked into mthca.a),

Surely not. mthca.a does not include objects from libibverbs.a

> but I don't know if the linker will  
> be smart enough to realize that there are two copies of some symbols  
> in libibverbs and further to realize that they are actually  
> duplicates of the same underlying symbol, and one can be safely  
> eliminated.  It's worth trying (but I don't really care too much :-) ).
> 
> Every time I think I understand linkers, we get weirdo cases like  
> this that make me remember I have no clue how they work.  :-)

Note that .a files are not actually created by the linker.

-- 
MST




Re: [openib-general] [PATCH] OpenSM: Add option for force SDR link speed

2006-11-02 Thread Yevgeny Kliteynik
Looks good, thanks.

-- Yevgeny

Hal Rosenstock wrote:
> OpenSM: Add option for force SDR link speed
> 
> Add option to opensm.opts to force link speed. Currently, only forcing
> to SDR link speed is supported.
> 
> Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>
> 
> Index: include/opensm/osm_subnet.h
> ===
> --- include/opensm/osm_subnet.h   (revision 10010)
> +++ include/opensm/osm_subnet.h   (working copy)
> @@ -34,7 +34,6 @@
>   * $Id$
>   */
>  
> -
>  /*
>   * Abstract:
>   *   Declaration of osm_subn_t.
> @@ -238,9 +237,10 @@ typedef struct _osm_subn_opt
>uint8_t   sm_priority;
>uint8_t   lmc;
>boolean_t lmc_esp0;
> -  uint8_t  max_op_vls;
> +  uint8_t   max_op_vls;
> +  uint8_t   force_link_speed;
>boolean_t reassign_lids;
> -  boolean_treassign_lfts;
> +  boolean_t reassign_lfts;
>boolean_t ignore_other_sm;
>boolean_t single_thread;
>boolean_t no_multicast_option;
> Index: opensm/osm_subnet.c
> ===
> --- opensm/osm_subnet.c   (revision 10018)
> +++ opensm/osm_subnet.c   (working copy)
> @@ -452,6 +452,7 @@ osm_subn_set_default_opt(
>p_opt->lmc = OSM_DEFAULT_LMC;
>p_opt->lmc_esp0 = FALSE;
>p_opt->max_op_vls = OSM_DEFAULT_MAX_OP_VLS;
> +  p_opt->force_link_speed = 0;
>p_opt->reassign_lids = FALSE;
>p_opt->reassign_lfts = TRUE;
>p_opt->ignore_other_sm = FALSE;
> @@ -840,6 +841,10 @@ osm_subn_parse_conf_file(
>  "max_op_vls",
>  p_key, p_val, &p_opts->max_op_vls);
>  
> +  __osm_subn_opts_unpack_uint8(
> +"force_link_speed",
> +p_key, p_val, &p_opts->force_link_speed);
> +
>__osm_subn_opts_unpack_boolean(
>  "reassign_lids",
>  p_key, p_val, &p_opts->reassign_lids);
> @@ -1061,6 +1066,9 @@ osm_subn_write_conf_file(
>  "leaf_head_of_queue_lifetime 0x%02x\n\n"
>  "# Limit the maximal operational VLs\n"
>  "max_op_vls %u\n\n"
> +"# Force switch links which are more than SDR capable to \n"
> +"# operate at SDR speed\n\n"
> +"force_link_speed %u\n\n"
>  "# The subnet_timeout code that will be set for all the ports\n"
>  "# The actual timeout is 4.096usec * 2^\n"
>  "subnet_timeout %u\n\n"
> @@ -1081,6 +1089,7 @@ osm_subn_write_conf_file(
>  p_opts->head_of_queue_lifetime,
>  p_opts->leaf_head_of_queue_lifetime,
>  p_opts->max_op_vls,
> +p_opts->force_link_speed,
>  p_opts->subnet_timeout,
>  p_opts->local_phy_errors_threshold,
>  p_opts->overrun_errors_threshold
> Index: opensm/osm_lid_mgr.c
> ===
> --- opensm/osm_lid_mgr.c  (revision 10010)
> +++ opensm/osm_lid_mgr.c  (working copy)
> @@ -1152,6 +1152,14 @@ __osm_lid_mgr_set_physp_pi(
>  sizeof(p_pi->link_width_enabled) ))
>send_set = TRUE;
>  
> +if ( p_mgr->p_subn->opt.force_link_speed )
> +  ib_port_info_set_link_speed_enabled( p_pi, IB_LINK_SPEED_ACTIVE_2_5 );
> +else
> +  ib_port_info_set_link_speed_enabled( p_pi, ib_port_info_get_link_speed_enabled(p_old_pi) );
> +if (memcmp( &p_pi->link_speed, &p_old_pi->link_speed,
> +sizeof(p_pi->link_speed) ))
> +  send_set = TRUE;
> +
>  /* M_KeyProtectBits are always zero */
>  p_pi->mkey_lmc = p_mgr->p_subn->opt.lmc;
>  /* Check to see if the value we are setting is different than
> Index: opensm/osm_link_mgr.c
> ===
> --- opensm/osm_link_mgr.c (revision 10010)
> +++ opensm/osm_link_mgr.c (working copy)
> @@ -310,6 +310,14 @@ __osm_link_mgr_set_physp_pi(
>  sizeof(p_pi->link_width_enabled) ))
>send_set = TRUE;
>  
> +if ( p_mgr->p_subn->opt.force_link_speed )
> +  ib_port_info_set_link_speed_enabled( p_pi, IB_LINK_SPEED_ACTIVE_2_5 );
> +else
> +  ib_port_info_set_link_speed_enabled( p_pi, ib_port_info_get_link_speed_enabled(p_old_pi) );
> +if (memcmp( &p_pi->link_speed, &p_old_pi->link_speed,
> +sizeof(p_pi->link_speed) ))
> +  send_set = TRUE;
> +
>  /* calc new op_vls and mtu */
>  op_vls =
>osm_physp_calc_link_op_vls( p_mgr->p_log, p_mgr->p_subn, p_physp );
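
The change-detection step in the patch above can be sketched in isolation. This is a minimal illustration under stated assumptions, not OpenSM code: the struct, macros and function names are hypothetical, and the sketch assumes that LinkSpeedEnabled occupies the low four bits of a single link_speed byte, with LinkSpeedActive in the high four bits, and that the new PortInfo starts as a copy of the old one.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the one PortInfo byte the patch touches. */
struct demo_port_info {
    uint8_t link_speed; /* assumed: high nibble active, low nibble enabled */
};

#define DEMO_SPEED_ENABLED_MASK 0x0F
#define DEMO_LINK_SPEED_2_5     1    /* SDR, assumed encoding */

/* Overwrite only the "enabled" nibble, leaving the "active" nibble alone. */
void demo_set_speed_enabled(struct demo_port_info *pi, uint8_t enabled)
{
    pi->link_speed = (uint8_t)((pi->link_speed & ~DEMO_SPEED_ENABLED_MASK) |
                               (enabled & DEMO_SPEED_ENABLED_MASK));
}

uint8_t demo_get_speed_enabled(const struct demo_port_info *pi)
{
    return (uint8_t)(pi->link_speed & DEMO_SPEED_ENABLED_MASK);
}

/*
 * Mirror of the patch logic: force SDR when the option is nonzero,
 * otherwise keep what the port already had. Returns 1 when the byte
 * changed, meaning a Set(PortInfo) would have to be sent.
 */
int demo_apply_force_link_speed(struct demo_port_info *pi,
                                const struct demo_port_info *old_pi,
                                uint8_t force_link_speed)
{
    if (force_link_speed)
        demo_set_speed_enabled(pi, DEMO_LINK_SPEED_2_5);
    else
        demo_set_speed_enabled(pi, demo_get_speed_enabled(old_pi));

    return memcmp(&pi->link_speed, &old_pi->link_speed,
                  sizeof(pi->link_speed)) != 0;
}
```

Comparing the whole byte, as the patch does, means the Set is only triggered when the enabled speeds actually differ from what the port reported.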
> 
> 
> 
> 
> 


Re: [openib-general] [PATCH] opensm: strict osm_log arguments/format check

2006-11-02 Thread Yevgeny Kliteynik
Hi Sasha.

Good catch with those missing arguments. 
One question: in several places you used cl_hton64() to print guid.
Shouldn't there be cl_ntoh64() instead?

...and yes, I know that these two functions are actually the same macro :)
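
As a small illustration of why the two conversions can share one macro: on a little-endian host both directions are the same 64-bit byte swap, and a byte swap is its own inverse (on a big-endian host both are the identity). The function below is a generic sketch with a made-up name, not the complib implementation.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value in three mask-and-shift steps. */
uint64_t demo_swap64(uint64_t x)
{
    /* swap the two 32-bit halves */
    x = (x << 32) | (x >> 32);
    /* swap 16-bit words inside each 32-bit half */
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) |
        ((x >> 16) & 0x0000FFFF0000FFFFULL);
    /* swap bytes inside each 16-bit word */
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) |
        ((x >> 8) & 0x00FF00FF00FF00FFULL);
    return x;
}
```

Applying demo_swap64() twice returns the original value, so the choice between the hton and ntoh spelling only documents intent and does not change the printed GUID.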

Thanks

-- Yevgeny


Sasha Khapyorsky wrote:
> This adds gcc attribute to osm_log() which causes the compiler to check
> argument types against a format string. And also there are related fixes
> in osm_log() usage in opensm and osmtest.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> ---
>  osm/include/opensm/osm_log.h |8 +++-
>  osm/libvendor/osm_vendor_ibumad_sa.c |2 +-
>  osm/opensm/main.c|3 ++-
>  osm/opensm/osm_pkey_mgr.c|1 +
>  osm/opensm/osm_port_info_rcv.c   |5 +++--
>  osm/opensm/osm_sa_informinfo.c   |4 ++--
>  osm/opensm/osm_sa_link_record.c  |8 
>  osm/opensm/osm_sa_mad_ctrl.c |3 ++-
>  osm/opensm/osm_sa_response.c |2 +-
>  osm/opensm/osm_sm_state_mgr.c|3 ++-
>  osm/opensm/osm_sminfo_rcv.c  |9 +
>  osm/opensm/osm_state_mgr.c   |8 
>  osm/osmtest/osmt_multicast.c |   12 +++-
>  osm/osmtest/osmt_service.c   |6 +++---
>  osm/osmtest/osmtest.c|8 
>  15 files changed, 48 insertions(+), 34 deletions(-)
> 
> diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
> index 62f3a0c..2b24886 100644
> --- a/osm/include/opensm/osm_log.h
> +++ b/osm/include/opensm/osm_log.h
> @@ -60,6 +60,12 @@
>  #include 
>  #include 
>  
> +#ifdef __GNUC__
> +#define STRICT_OSM_LOG_FORMAT __attribute__((format(printf, 3, 4)))
> +#else
> +#define STRICT_OSM_LOG_FORMAT
> +#endif
> +
>  #ifdef __cplusplus
>  #  define BEGIN_C_DECLS extern "C" {
>  #  define END_C_DECLS   }
> @@ -374,7 +380,7 @@ void
>  osm_log(
>   IN osm_log_t* const p_log,
>   IN const osm_log_level_t verbosity,
> - IN const char *p_str, ... );
> + IN const char *p_str, ... ) STRICT_OSM_LOG_FORMAT;
>  
>  void
>  osm_log_raw(
> diff --git a/osm/libvendor/osm_vendor_ibumad_sa.c b/osm/libvendor/osm_vendor_ibumad_sa.c
> index 7fd0655..7c4a2f7 100644
> --- a/osm/libvendor/osm_vendor_ibumad_sa.c
> +++ b/osm/libvendor/osm_vendor_ibumad_sa.c
> @@ -853,7 +853,7 @@ osmv_query_sa(
>  if ( p_mpr_req->sgid_count + p_mpr_req->dgid_count > IB_MULTIPATH_MAX_GIDS )
>  {
>osm_log( p_log, OSM_LOG_ERROR,
> -   "osmv_query_sa DBG:001 MULTIPATH_REC ",
> +   "osmv_query_sa DBG:001 MULTIPATH_REC "
> "SGID count %d DGID count %d max count %d\n",
>  p_mpr_req->sgid_count, p_mpr_req->dgid_count,
>  IB_MULTIPATH_MAX_GIDS );
> diff --git a/osm/opensm/main.c b/osm/opensm/main.c
> index 729702a..752b546 100644
> --- a/osm/opensm/main.c
> +++ b/osm/opensm/main.c
> @@ -460,7 +460,8 @@ parse_ignore_guids_file(IN char *guids_f
>{
>  osm_log( &p_osm->log, OSM_LOG_ERROR,
>   "parse_ignore_guids_file: ERR 0601: "
> - "Unable to open ignore guids file (%s)\n" );
> + "Unable to open ignore guids file (%s)\n",
> + guids_file_name );
>  status = IB_ERROR;
>  goto Exit;
>}
> diff --git a/osm/opensm/osm_pkey_mgr.c b/osm/opensm/osm_pkey_mgr.c
> index f2cb221..735dc14 100644
> --- a/osm/opensm/osm_pkey_mgr.c
> +++ b/osm/opensm/osm_pkey_mgr.c
> @@ -139,6 +139,7 @@ pkey_mgr_process_physical_port(
>  "pkey_mgr_process_physical_port: ERR 0503: "
>  "Failed to obtain P_Key 0x%04x block and index for node "
>  "0x%016" PRIx64 " port %u\n",
> +ib_pkey_get_base( pkey ),
>  cl_ntoh64( osm_node_get_node_guid( p_node ) ),
>  osm_physp_get_port_num( p_physp ) );
>return;
> diff --git a/osm/opensm/osm_port_info_rcv.c b/osm/opensm/osm_port_info_rcv.c
> index 95112dc..f6d3595 100644
> --- a/osm/opensm/osm_port_info_rcv.c
> +++ b/osm/opensm/osm_port_info_rcv.c
> @@ -724,8 +724,9 @@ osm_pi_rcv_process(
>{
>  osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
>   "osm_pi_rcv_process: "
> - "Got light sweep response from remote port of parent node GUID = 0x%" PRIx64
> - " port = %u, Commencing heavy sweep\n",
> + "Got light sweep response from remote port of parent node "
> + "GUID = 0x%" PRIx64 " port = 0x%016" PRIx64
> + ", Commencing heavy sweep\n",
>   cl_ntoh64( node_guid ),
>   cl_ntoh64( port_guid ) );
>  osm_state_mgr_process( p_rcv->p_state_mgr,
> diff --git a/osm/opensm/osm_sa_informinfo.c b/osm/opensm/osm_sa_informinfo.c
> index 69dca1d..da96d35 100644
> --- a/osm/opensm/osm_sa_informinfo.c
> +++ b/osm/opensm/osm_sa_informinfo.c
> @@ -163,8 +163,8 @@ __validate_ports_access_rights(
>  {
>osm_log( p_rcv->p_log, OSM_LOG_ERROR,
> "__validate_ports_acc
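
The format-check technique used in the patch above can be tried on a small standalone logger. This is a sketch with hypothetical names (demo_log, DEMO_LOG_FORMAT), assuming GCC or a compatible compiler; the attribute arguments (3, 4) say that the third parameter is the printf-style format string and that the variadic arguments start at the fourth, which matches the osm_log() prototype shown in the diff.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNUC__
#define DEMO_LOG_FORMAT __attribute__((format(printf, 3, 4)))
#else
#define DEMO_LOG_FORMAT
#endif

/* The attribute goes on the declaration, as in the patch. */
int demo_log(char *buf, size_t len, const char *fmt, ...) DEMO_LOG_FORMAT;

/* Format into a caller-supplied buffer; returns what vsnprintf returns. */
int demo_log(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as demo_log(buf, sizeof buf, "file (%s)\n") with the argument left out should draw a -Wformat warning at compile time, which is the kind of mistake the patch fixes in parse_ignore_guids_file().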

Re: [openib-general] [PATCH v2] opensm: remove obsolete p_report_buf

2006-11-02 Thread Yevgeny Kliteynik
Hi Sasha.

Looks good, thanks.
--
Yevgeny

Sasha Khapyorsky wrote:
> This removes obsolete now shared sm->p_report_buf buffer and cleans
> up related code. And also introduces new log function osm_log_printf()
> which currently trivially sends formatted output to stdout.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> ---
>  osm/include/opensm/osm_base.h  |5 --
>  osm/include/opensm/osm_log.h   |3 +
>  osm/include/opensm/osm_sm.h|2 -
>  osm/include/opensm/osm_state_mgr.h |8 --
>  osm/include/opensm/osm_ucast_mgr.h |5 --
>  osm/opensm/libopensm.map   |3 +-
>  osm/opensm/osm_log.c   |   19 +
>  osm/opensm/osm_mcast_mgr.c |   11 ++--
>  osm/opensm/osm_sm.c|   15 +
>  osm/opensm/osm_state_mgr.c |  138 ---
>  osm/opensm/osm_ucast_mgr.c |   80 +++--
>  11 files changed, 104 insertions(+), 185 deletions(-)
> 
> diff --git a/osm/include/opensm/osm_base.h b/osm/include/opensm/osm_base.h
> index 57dd4fd..20e2cc3 100644
> --- a/osm/include/opensm/osm_base.h
> +++ b/osm/include/opensm/osm_base.h
> @@ -714,11 +714,6 @@ typedef enum _osm_state_mgr_mode
>  *
>  **/
>  
> -#define OSM_REPORT_BUF_SIZE  0x1
> -#define OSM_REPORT_LINE_SIZE 0x256
> -#define OSM_REPORT_BUF_THRESHOLD (OSM_REPORT_BUF_SIZE / OSM_REPORT_LINE_SIZE)
> -
> -
>  /d* OpenSM: Base/osm_sm_signal_t
>  * NAME
>  *osm_sm_signal_t
> diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
> index 62f3a0c..6a1a93f 100644
> --- a/osm/include/opensm/osm_log.h
> +++ b/osm/include/opensm/osm_log.h
> @@ -370,6 +370,9 @@ osm_log_is_active(
>  *osm_log_destroy
>  */
>  
> +extern int osm_log_printf(osm_log_t *p_log, osm_log_level_t level,
> +   const char *fmt, ...);
> +
>  void
>  osm_log(
>   IN osm_log_t* const p_log,
> diff --git a/osm/include/opensm/osm_sm.h b/osm/include/opensm/osm_sm.h
> index bc812f3..05b87ac 100644
> --- a/osm/include/opensm/osm_sm.h
> +++ b/osm/include/opensm/osm_sm.h
> @@ -178,8 +178,6 @@ typedef struct _osm_sm
>osm_vla_rcv_ctrl_t   vla_rcv_ctrl;
>osm_pkey_rcv_t   pkey_rcv;
>osm_pkey_rcv_ctrl_t  pkey_rcv_ctrl;
> -  char*p_report_buf;
> -
>  } osm_sm_t;
>  /*
>  * FIELDS
> diff --git a/osm/include/opensm/osm_state_mgr.h b/osm/include/opensm/osm_state_mgr.h
> index ad4afa0..7aaab58 100644
> --- a/osm/include/opensm/osm_state_mgr.h
> +++ b/osm/include/opensm/osm_state_mgr.h
> @@ -121,7 +121,6 @@ typedef struct _osm_state_mgr
>cl_qlist_t idle_time_list;
>cl_plock_t *p_lock;
>cl_event_t *p_subnet_up_event;
> -  char  *p_report_buf;
>osm_sm_state_t state;
>osm_state_mgr_mode_t state_step_mode;
>osm_signal_t next_stage_signal;
> @@ -170,9 +169,6 @@ typedef struct _osm_state_mgr
>  *p_subnet_up_event
>  *Pointer to the event to set if/when the subnet comes up.
>  *
> -*p_report_buf
> -*Pointer to the large log buffer used for user reports.
> -*
>  *state
>  *State of the SM.
>  *
> @@ -380,7 +376,6 @@ osm_state_mgr_init(
>   IN const osm_sm_mad_ctrl_t* const p_mad_ctrl,
>   IN cl_plock_t*  const p_lock,
>   IN cl_event_t*  const p_subnet_up_event,
> - IN char*const p_report_buf,
>   IN osm_log_t*   const p_log );
>  /*
>  * PARAMETERS
> @@ -420,9 +415,6 @@ osm_state_mgr_init(
>  *p_subnet_up_event
>  *[in] Pointer to the event to set if/when the subnet comes up.
>  *
> -*p_report_buf
> -*[in] Pointer to the large log buffer used for user reports.
> -*
>  *p_log
>  *[in] Pointer to the log object.
>  *
> diff --git a/osm/include/opensm/osm_ucast_mgr.h b/osm/include/opensm/osm_ucast_mgr.h
> index 0fbfc66..1c10abb 100644
> --- a/osm/include/opensm/osm_ucast_mgr.h
> +++ b/osm/include/opensm/osm_ucast_mgr.h
> @@ -105,7 +105,6 @@ typedef struct _osm_ucast_mgr
>   osm_req_t   *p_req;
>   osm_log_t   *p_log;
>   cl_plock_t  *p_lock;
> - char*p_report_buf;
>  } osm_ucast_mgr_t;
>  /*
>  * FIELDS
> @@ -204,7 +203,6 @@ osm_ucast_mgr_init(
>   IN osm_ucast_mgr_t* const p_mgr,
>   IN osm_req_t* const p_req,
>   IN osm_subn_t* const p_subn,
> - IN char* const p_report_buf,
>   IN osm_log_t* const p_log,
>   IN cl_plock_t* const p_lock );
>  /*
> @@ -218,9 +216,6 @@ osm_ucast_mgr_init(
>  *p_subn
>  *  

Re: [openib-general] Static linking with libibverbs

2006-11-02 Thread Jeff Squyres
On Nov 2, 2006, at 8:13 AM, Michael S. Tsirkin wrote:

> Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
>> Yes.  See the FAQ items on the OMPI web site from my first mail.
>
> OK, I see.
> So what it boils down to, is linking with
> -Wl,--whole-archive -libverbs /mthca.a -Wl,--no-whole-archive
> Is that right?

There's a few other details, but this is the Main Point, yes.

> But -u openib_driver_init will work as well, won't it?

I'm not entirely sure -- it might (I didn't try it).  It *should*  
force creation of a valid code path into mthca.a and therefore use it  
for all the resolution that is required (i.e., link in all the parts  
of mthca.a that are actually required).
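
A minimal sketch of the underlying linker behaviour, assuming GNU ld on ELF and the GCC/Clang weak attribute: an archive member is only extracted when it resolves an undefined symbol, and a weak undefined reference does not count. The symbol demo_driver_init is hypothetical and merely stands in for an entry point such as openib_driver_init; its prototype is a placeholder because only the address is examined.

```c
/*
 * Hypothetical driver entry point. Declared weak so that this file links
 * whether or not an object defining the symbol is present.
 */
extern void demo_driver_init(void) __attribute__((weak));

/*
 * Returns 1 if some linked object defined demo_driver_init, 0 otherwise.
 * A weak reference like this one does not pull the defining member out
 * of a .a archive by itself; -u demo_driver_init or --whole-archive on
 * the link line would be needed for that.
 */
int demo_driver_is_linked(void)
{
    return demo_driver_init != 0;
}
```

Linked on its own, demo_driver_is_linked() is expected to return 0. Adding -u with the symbol name creates an ordinary undefined reference, so the linker extracts the archive member that defines it, while --whole-archive extracts every member regardless of references.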

What I'm not sure about is whether the symbols that mthca needs from  
libibverbs will be linked in properly (since the linker order is left  
to right "-libverbs /mthca.a").  I *think* they'll be available  
from when mthca.a was originally created (i.e., libibverbs.a was  
statically linked into mthca.a), but I don't know if the linker will  
be smart enough to realize that there are two copies of some symbols  
in libibverbs and further to realize that they are actually  
duplicates of the same underlying symbol, and one can be safely  
eliminated.  It's worth trying (but I don't really care too much :-) ).

Every time I think I understand linkers, we get weirdo cases like  
this that make me remember I have no clue how they work.  :-)

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
>  > I just took a quick look at the directory setup and if you are
>  > changing things I'd say you should also arrange to have the libibverbs
>  > soname stamped into the plugin path and soname. Something like
>  > libmthca-libibverbs.2.so.0. Once you do that it is pretty safe
>  > to put it in /usr/lib* 
> 
> That makes sense (although I guess it would be
> libmthca-libibverbs.2.so without the .0, since libmthca is just a
> plugin that doesn't have an independent soname of its own).  Then we
> could have each plugin drop a file in /etc/libibverbs.conf.d/ with the
> name -- something like
> 
> driver mthca
> 
> (and possibly also read $HOME/.libibverbs.conf if desired)
> 
> The only two things I need to figure out, I hope with help from
> smarter people:
>  - What is the autoconf/automake chicanery needed to make the
>libmthca figure out the right libibverbs soname to stick in the
>name of the .so it installs?
>  - And what is the autoconf/automake chicanery needed to fall back to
>having libmthca install plain mthca.so under /usr/lib/infiniband
>when it detects that it is being built against libibverbs 1.0?

By the way, what's up with this project?
It's still planned for libibverbs 1.1, isn't it?
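
A minimal sketch of the config-file idea in the quoted text, assuming each file under /etc/libibverbs.conf.d/ holds lines of the form "driver <name>" and that the plugin file name embeds a libibverbs soname tag. The function names and the tag string are illustrative only and are not part of libibverbs.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one config line of the assumed form "driver <name>".
 * Copies <name> into `name` and returns 1 on success, 0 otherwise.
 */
int demo_parse_driver_line(const char *line, char *name, size_t name_len)
{
    char key[16];
    char value[64];

    if (sscanf(line, "%15s %63s", key, value) != 2)
        return 0;
    if (strcmp(key, "driver") != 0)
        return 0;
    if (strlen(value) >= name_len)
        return 0;
    strcpy(name, value);
    return 1;
}

/*
 * Build a plugin file name of the form lib<driver>-<tag>.so, where <tag>
 * identifies the libibverbs soname the plugin was built against.
 * Returns 1 if the result fit into `out`, 0 otherwise.
 */
int demo_plugin_filename(const char *driver, const char *verbs_tag,
                         char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "lib%s-%s.so", driver, verbs_tag);

    return n > 0 && (size_t)n < out_len;
}
```

For the line "driver mthca" and the tag "libibverbs.2", the two helpers would produce libmthca-libibverbs.2.so, the name suggested in the quoted message.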

-- 
MST




Re: [openib-general] Static linking with libibverbs

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> Yes.  See the FAQ items on the OMPI web site from my first mail.

OK, I see.
So what it boils down to, is linking with
-Wl,--whole-archive -libverbs /mthca.a -Wl,--no-whole-archive
Is that right?

But -u openib_driver_init will work as well, won't it?



-- 
MST




[openib-general] [PATCH TRIVIAL] diags: strip trailing whitespaces

2006-11-02 Thread Sasha Khapyorsky
Strip trailing whitespaces in diags.

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 diags/src/grouping.c  |   60 ++--
 diags/src/ibaddr.c|2 +-
 diags/src/ibnetdiscover.c |   12 
 diags/src/ibping.c|2 +-
 diags/src/ibportstate.c   |2 +-
 diags/src/ibroute.c   |4 +-
 diags/src/ibsysstat.c |4 +-
 diags/src/ibtracert.c |   24 +-
 diags/src/saquery.c   |2 +-
 diags/src/smpdump.c   |6 ++--
 diags/src/smpquery.c  |4 +-
 11 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/diags/src/grouping.c b/diags/src/grouping.c
index fbca4e0..09ac10a 100644
--- a/diags/src/grouping.c
+++ b/diags/src/grouping.c
@@ -77,7 +77,7 @@ char *get_chassis_slot(unsigned char cha
return ChassisSlotStr[chassisslot];
 }
 
-static struct ChassisList *find_chassisnum(unsigned char chassisnum) 
+static struct ChassisList *find_chassisnum(unsigned char chassisnum)
 {
ChassisList *current;
 
@@ -192,7 +192,7 @@ int anafa_spine4_slot_2_slb[25] = { 0, 1
 static void get_sfb_slot(Node *node, Port *lineport)
 {
ChassisRecord *ch = node->chrecord;
-   
+
ch->chassisslot = SPINE_CS;
if (is_spine_9096(node)) {
ch->chassistype = ISR9096_CT;
@@ -210,7 +210,7 @@ static void get_router_slot(Node *node,
ChassisRecord *ch = node->chrecord;
int guessnum = 0;
 
-   if (!ch) { 
+   if (!ch) {
if (!(node->chrecord = calloc(1, sizeof(ChassisRecord
IBPANIC("out of mem");
ch = node->chrecord;
@@ -229,7 +229,7 @@ static void get_router_slot(Node *node,
/* module 1 <--> remote anafa 3 */
/* module 2 <--> remote anafa 2 */
/* module 3 <--> remote anafa 1 */
-   ch->anafanum = (guessnum == 3 ? 1 : (guessnum == 1 ? 3 : 2)); 
+   ch->anafanum = (guessnum == 3 ? 1 : (guessnum == 1 ? 3 : 2));
}
 }
 
@@ -260,7 +260,7 @@ static void fill_chassis_record(Node *no
 
if (node->chrecord) /* somehow this node has already been passed */
return;
-   
+
if (!(node->chrecord = calloc(1, sizeof(ChassisRecord
IBPANIC("out of mem");
 
@@ -285,7 +285,7 @@ static void fill_chassis_record(Node *no
/* we assume here that remoteport belongs to 
line */
get_sfb_slot(node, port->remoteport);
 
-   /* we could break here, but need to find if 
more routers connected */ 
+   /* we could break here, but need to find if 
more routers connected */
}
 
} else if (is_line(node)) {
@@ -307,7 +307,7 @@ static int get_line_index(Node *node)
 {
int retval = 3 * (node->chrecord->slotnum - 1) + 
node->chrecord->anafanum;
 
-   if (retval > LINES_MAX_NUM || retval < 1) 
+   if (retval > LINES_MAX_NUM || retval < 1)
IBPANIC("Grouping: Internal error");
return retval;
 }
@@ -319,9 +319,9 @@ static int get_spine_index(Node *node)
if (is_spine_9288(node))
retval = 3 * (node->chrecord->slotnum - 1) + 
node->chrecord->anafanum;
else
-   retval = node->chrecord->slotnum; 
+   retval = node->chrecord->slotnum;
 
-   if (retval > SPINES_MAX_NUM || retval < 1) 
+   if (retval > SPINES_MAX_NUM || retval < 1)
IBPANIC("Grouping: Internal error");
return retval;
 }
@@ -330,7 +330,7 @@ static void insert_line_router(Node *nod
 {
int i = get_line_index(node);
 
-   if (chassislist->linenode[i]) 
+   if (chassislist->linenode[i])
return; /* already filled slot */
 
chassislist->linenode[i] = node;
@@ -357,7 +357,7 @@ static void pass_on_lines_catch_spines(C
for (i = 1; i <= LINES_MAX_NUM; i++) {
node = chassislist->linenode[i];
 
-   if (!(node && is_line(node))) 
+   if (!(node && is_line(node)))
continue;   /* empty slot or router */
 
for (port = node->ports; port; port = port->next) {
@@ -383,7 +383,7 @@ static void pass_on_spines_catch_lines(C
 
for (i = 1; i <= SPINES_MAX_NUM; i++) {
node = chassislist->spinenode[i];
-   if (!node) 
+   if (!node)
continue;   /* empty slot */
for (port = node->ports; port; port = port->next) {
if (!port->remoteport)
@@ -399,34 +399,34 @@ static void pass_on_spines_catch_lines(C
 
 /*
Stupid interpolation algorithm...
-   But nothing to do - have to be compliant with VoltaireSM/NMS 
+   But nothing to do - have to be compliant with VoltaireSM/NMS
 */
 static void pass_on_spines_interpolate_chguid(ChassisList *chassisl

[openib-general] [PATCH TRIVIAL] management/libib*: strip trailing whitespaces

2006-11-02 Thread Sasha Khapyorsky

Strip trailing whitespaces for libibcommon, libibumad, libibmad.

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 libibcommon/include/infiniband/common.h |2 +-
 libibcommon/src/hash.c  |6 +++---
 libibmad/include/infiniband/mad.h   |4 ++--
 libibmad/src/dump.c |   12 ++--
 libibmad/src/fields.c   |4 ++--
 libibmad/src/gs.c   |6 +++---
 libibmad/src/register.c |2 +-
 libibmad/src/resolve.c  |4 ++--
 libibmad/src/rpc.c  |2 +-
 libibmad/src/sa.c   |2 +-
 libibmad/src/serv.c |2 +-
 libibmad/src/smp.c  |4 ++--
 libibumad/include/infiniband/umad.h |4 ++--
 libibumad/src/umad.c|   26 +-
 14 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
index 3537bdf..83c0679 100644
--- a/libibcommon/include/infiniband/common.h
+++ b/libibcommon/include/infiniband/common.h
@@ -152,7 +152,7 @@ __attribute__((unused)) static char _bui
 #endif
 
 __attribute__((unused)) static inline char*
-get_build_version(void) 
+get_build_version(void)
 {
return _build_version;
 }
diff --git a/libibcommon/src/hash.c b/libibcommon/src/hash.c
index 8f216a1..d05d221 100644
--- a/libibcommon/src/hash.c
+++ b/libibcommon/src/hash.c
@@ -57,16 +57,16 @@ For every delta with one or two bits set
   have at least 1/4 probability of changing.
 * If mix() is run forward, every bit of c will change between 1/3 and
   2/3 of the time.  (Well, 22/100 and 78/100 for some 2-bit deltas.)
-mix() was built out of 36 single-cycle latency instructions in a 
+mix() was built out of 36 single-cycle latency instructions in a
   structure that could supported 2x parallelism, like so:
-  a -= b; 
+  a -= b;
   a -= c; x = (c>>13);
   b -= c; a ^= x;
   b -= a; x = (a<<8);
   c -= a; b ^= x;
   c -= b; x = (b>>13);
   ...
-  Unfortunately, superscalar Pentiums and Sparcs can't take advantage 
+  Unfortunately, superscalar Pentiums and Sparcs can't take advantage
   of that parallelism.  They've also turned some of those single-cycle
   latency instructions into multi-cycle latency instructions.  Still,
   this is the fastest good hash I could find.  There were about 2^^68
diff --git a/libibmad/include/infiniband/mad.h b/libibmad/include/infiniband/mad.h
index 523f630..b6bbcbc 100644
--- a/libibmad/include/infiniband/mad.h
+++ b/libibmad/include/infiniband/mad.h
@@ -257,7 +257,7 @@ enum MAD_FIELDS {
IB_SM_DATA_F,
 
/* bytes 64 - 256 */
-   IB_GS_DATA_F, 
+   IB_GS_DATA_F,
 
/* bytes 128 - 191 */
IB_DRSMP_PATH_F,
@@ -602,7 +602,7 @@ enum {
IB_NODE_ROUTER,
NODE_RNIC,
 
-   IB_NODE_MAX = NODE_RNIC 
+   IB_NODE_MAX = NODE_RNIC
 };
 
 
/**/
diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 1042ab1..eab3a8e 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -247,7 +247,7 @@ mad_dump_linkwidthsup(char *buf, int buf
break;
case 15:
snprintf(buf, bufsz, "1X or 4X or 8X or 12X");
-   break;  
+   break;
default:
IBWARN("bad width %d", width);
buf[0] = 0;
@@ -637,7 +637,7 @@ ib_slvl_get_i(ib_slvl_table_t *tbl, int
 }
 
 typedef struct _ib_vl_arb_element {
-   uint8_t res_vl; 
+   uint8_t res_vl;
uint8_t weight;
 } __attribute__((packed)) ib_vl_arb_element_t;
 
@@ -806,7 +806,7 @@ _mad_dump_field(ib_field_t *f, char *nam
dots[32 - l] = 0;
}
 
-   n = snprintf(buf, bufsz, "%s:%s", name, dots); 
+   n = snprintf(buf, bufsz, "%s:%s", name, dots);
_mad_dump_val(f, buf + n, bufsz - n, val);
buf[bufsz - 1] = 0;
 
@@ -816,13 +816,13 @@ _mad_dump_field(ib_field_t *f, char *nam
 int
 _mad_dump(ib_mad_dump_fn *fn, char *name, void *val, int valsz)
 {
-   ib_field_t f = { .def_dump_fn = fn, .bitlen = valsz * 8}; 
+   ib_field_t f = { .def_dump_fn = fn, .bitlen = valsz * 8};
char buf[512];
 
-   return printf("%s\n", _mad_dump_field(&f, name, buf, sizeof buf, val)); 
+   return printf("%s\n", _mad_dump_field(&f, name, buf, sizeof buf, val));
 }
 
-int 
+int
 _mad_print_field(ib_field_t *f, char *name, void *val, int valsz)
 {
return _mad_dump(f->def_dump_fn, name ? name : f->name, val, valsz ? 
valsz : ALIGN(f->bitlen, 8) / 8);
diff --git a/libibmad/src/fields.c b/libibmad/src/fields.c
index 3f0ed44..d100713 100644
--- a/libibmad/src/fields.c
+++ b/libibmad/src/fields.c
@@ -323,7 +323,7 @@ ib_field_t ib_mad_f [] = {
[IB_ATS_SM_MAGIC_KEY_F] {BITSOFFS(16*8, 16), "ATSMagicKey", 
mad_dump_hex},
  

Re: [openib-general] Static linking with libibverbs

2006-11-02 Thread Jeff Squyres
On Nov 2, 2006, at 3:19 AM, Michael S. Tsirkin wrote:

>>> static linking actually can be made to work even with older  
>>> library versions.
>>> See this HowTo (written on 02 of November, 2005).
>>> https://openib.org/tiki/tiki-index.php?page=HowToFAQ
>>
>> That's not really static linking.
>
> OK, it's a difference of terms then :)

Static linking means making an executable that does not link to  
dynamic libraries at all (e.g., run "ldd a.out" and it says "not a  
dynamic executable").  Linking to static libraries is simply that --  
linking to static libraries.

>> If you try to build a true static
>> executable, which contains static libc and in particular static  
>> libdl,
>> there's no way the old code can work, for multiple reasons.  For one
>> thing, dlopen(NULL, RTLD_NOW) doesn't work on static executables so
>> libibverbs couldn't find a low-level driver that is statically linked
>> in.
>
> Does linking in low level driver work now even with -static?

Yes.  See the FAQ items on the OMPI web site from my first mail.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [PATCH TRIVIAL] opensm: osm_sm_state_mgr.h trivial indentation fixes

2006-11-02 Thread Hal Rosenstock
On Thu, 2006-11-02 at 05:57, Sasha Khapyorsky wrote:
> Trivial indentation fixes in osm_sm_state_mgr.h
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal






Re: [openib-general] [PATCH TRIVIAL] opensm: trivial log message fix

2006-11-02 Thread Hal Rosenstock
On Wed, 2006-11-01 at 16:36, Sasha Khapyorsky wrote:
> Trivial log message fix.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal






Re: [openib-general] [PATCH 1/3] uDAPL cma: add support for new client register event

2006-11-02 Thread Or Gerlitz
Arlin Davis wrote:
> Added support for new ib verbs client register event. No extra 
> processing required at the uDAPL level. Shows up if opensm bounces.
> Index: dapl/openib_cma/dapl_ib_util.c
> ===
> --- dapl/openib_cma/dapl_ib_util.c  (revision 9916)
> +++ dapl/openib_cma/dapl_ib_util.c  (working copy)
> @@ -744,9 +744,16 @@

Arlin,

Can you please generate the patches with the -p flag which adds the 
function/structure context and resend? else it is not really possible to
review your work.

You might want to use this alias

alias svndiff='/usr/bin/svn diff --diff-cmd=/usr/bin/diff -x -up'

Or.





Re: [openib-general] OFED 1.1 Build Issue

2006-11-02 Thread Moni Levy
On 11/2/06, Ramachandra K <[EMAIL PROTECTED]> wrote:
> Michael S. Tsirkin wrote:
> > Quoting r. Moni Levy <[EMAIL PROTECTED]>:
> > > AFAIK Module.symvers is used in compile time only so the same logic
> > > that is used for .h files (the devel package) seems reasonable for it.
> >
> > I agree. It would be nice however for all devel files to go under prefix/.
>
> That raises a basic doubt for me. What is the general convention about
> include files in /lib/modules/... when installing new kernel modules ?
> Should the include files always correspond to the kernel modules that are
> installed ?
>
> I am thinking of a scenario where a user does not install the
> development package, in which case their IB include files and the kernel
> Module.symvers are essentially stale.

I think that the basic assumption should be that there are no .h files
unless kernel-devel or kernel-sources type packages are installed.
User that intends to develop/compile should install the devel/source
package and in that case he will have the appropriate matching
snapshot of the .h files.

-- Moni

> Later on, if they try to compile another kernel module that depends on
> the IB modules, that module will refuse to load due to the difference in
> symbol versions in the old Module.symvers and the currently loaded IB
> kernel modules.
>
> Regards,
> Ram




Re: [openib-general] [PATCH 3/3] uDAPL cma: add support for address and route retries, call disconnect when recving dreq

2006-11-02 Thread Or Gerlitz
Arlin Davis wrote:
> Fix some timeout and long disconnect delay issues discovered during 
> scale-out testing. Added support to retry rdma_cm address and route 
> resolution with configuration options and provide a disconnect call when 
> receiving the disconnect request to force an immediate disconnect reply 
> to the remote side.

It would be very nice if you could share with the community the IB stack 
issues revealed under scale-out testing. Basically, what was the testbed?

From what the patch does, I understand you attempt to handle timeouts on 
address and route resolution and a long disconnect delay.

Was the issue with address resolution being ARP request or reply 
messages getting lost?

Was the issue with route resolution being timeout on SA Path queries?

Please note that for the first two, you want to retry if the event 
status is -ETIMEDOUT, but the patch ignores the status field.
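
A minimal sketch of the status check suggested here. The event types and the struct are hypothetical stand-ins written for this illustration, not the real rdma_cm definitions (the real event carries a status field alongside its event type), and the retry budget parameter is an assumption about how the configuration option would be passed in.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rdma_cm event types; not the real API. */
enum sketch_event_type {
	SKETCH_EVENT_ADDR_ERROR,   /* address resolution failed */
	SKETCH_EVENT_ROUTE_ERROR,  /* route resolution failed */
	SKETCH_EVENT_OTHER
};

struct sketch_event {
	enum sketch_event_type type;
	int status;                /* 0 or a negative errno value */
};

/* Returns true only when a resolution step timed out and retries remain. */
bool sketch_should_retry(const struct sketch_event *ev, int retries_left)
{
	assert(ev != NULL);

	if (retries_left <= 0)
		return false;
	if (ev->type != SKETCH_EVENT_ADDR_ERROR &&
	    ev->type != SKETCH_EVENT_ROUTE_ERROR)
		return false;
	/* Retry only on a timeout; any other status is treated as fatal. */
	return ev->status == -ETIMEDOUT;
}
```

With a check like this, an address or route error whose status is something other than -ETIMEDOUT would be reported to the consumer right away instead of burning through the retry count.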

Was the issue with disconnect delay that peer A called 
dat_ep_disconnect() (i.e. sent a DREQ) and the DREP was sent only when 
peer B got the disconnect event and called dat_ep_disconnect()? So now 
the DREP is sent from within the provider code when it gets the DREQ?

Or.





[openib-general] [PATCH TRIVIAL] opensm: osm_sm_state_mgr.h trivial indentation fixes

2006-11-02 Thread Sasha Khapyorsky
Trivial indentation fixes in osm_sm_state_mgr.h

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 osm/include/opensm/osm_sm_state_mgr.h |   40 +---
 1 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/osm/include/opensm/osm_sm_state_mgr.h b/osm/include/opensm/osm_sm_state_mgr.h
index 87dc5a7..5f60276 100644
--- a/osm/include/opensm/osm_sm_state_mgr.h
+++ b/osm/include/opensm/osm_sm_state_mgr.h
@@ -107,15 +107,15 @@ BEGIN_C_DECLS
 */
 typedef struct _osm_sm_state_mgr
 {
-   cl_spinlock_t state_lock;
-   cl_timer_t polling_timer;
-   uint32_t   retry_number;
-   ib_net64_t master_guid;
-   osm_state_mgr_t*   p_state_mgr;
-   osm_subn_t*   p_subn;
-   osm_req_t*p_req;
-   osm_log_t*p_log;
-  osm_remote_sm_t*p_polling_sm;
+   cl_spinlock_tstate_lock;
+   cl_timer_t   polling_timer;
+   uint32_t retry_number;
+   ib_net64_t   master_guid;
+   osm_state_mgr_t* p_state_mgr;
+   osm_subn_t*  p_subn;
+   osm_req_t*   p_req;
+   osm_log_t*   p_log;
+   osm_remote_sm_t* p_polling_sm;
 } osm_sm_state_mgr_t;
 
 /*
@@ -124,26 +124,28 @@ typedef struct _osm_sm_state_mgr
 *  Spinlock guarding the state and processes.
 *
 *  retry_number
-*  Used on Standby state - to count the number of retries of queries to the master SM.
+*  Used on Standby state - to count the number of retries
+*  of queries to the master SM.
 *
-*  polling_timer
-* Timer for polling
+*  polling_timer
+*  Timer for polling
 *
-*  p_state_mgr
-* Point to the state manager object
+*  p_state_mgr
+*  Point to the state manager object
 *
 *  p_subn
 *  Pointer to the Subnet object for this subnet.
 *
-*  p_req
+*  p_req
 *  Pointer to the generic attribute request object.
 *
 *  p_log
 *  Pointer to the log object.
 *
-*  p_polling_sm
-* Pointer to a osm_remote_sm_t object. When our SM needs to poll on a remote
-* sm, this will be the pointer of the polled SM.
+*  p_polling_sm
+*  Pointer to a osm_remote_sm_t object. When our SM needs
+*  to poll on a remote sm, this will be the pointer of the
+*  polled SM.
 *
 * SEE ALSO
 *  SM State Manager object
@@ -298,7 +300,7 @@ osm_sm_state_mgr_process(
 *
 * DESCRIPTION
 *  Signals that the remote Master SM is alive.
-*  Need to clear the retry_number variable.
+*  Need to clear the retry_number variable.
 *
 * SYNOPSIS
 */
-- 
1.4.3.3.g8387





Re: [openib-general] OFED 1.1 Build Issue

2006-11-02 Thread Ramachandra K




Michael S. Tsirkin wrote:
> Quoting r. Moni Levy <[EMAIL PROTECTED]>:
> > AFAIK Module.symvers is used in compile time only so the same logic
> > that is used for .h files (the devel package) seems reasonable for it.
>
> I agree. It would be nice however for all devel files to go under prefix/.

That raises a basic doubt for me. What is the general convention about include
files in /lib/modules/... when installing new kernel modules ? Should the include
files always correspond to the kernel modules that are installed ?

I am thinking of a scenario where a user does not install the development package,
in which case their IB include files and the kernel Module.symvers are essentially stale.
Later on, if they try to compile another kernel module that depends on the IB modules,
that module will refuse to load due to the difference in symbol versions in the
old Module.symvers and the currently loaded IB kernel modules.


Regards,
Ram




[openib-general] [PATCH] opensm: strict osm_log arguments/format check

2006-11-02 Thread Sasha Khapyorsky
This adds gcc attribute to osm_log() which causes the compiler to check
argument types against a format string. And also there are related fixes
in osm_log() usage in opensm and osmtest.
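
For readers unfamiliar with the attribute, here is a small self-contained sketch of the same technique. The logger sketch_log() and the macro SKETCH_PRINTF_FORMAT are hypothetical names made up for this example; only the (printf, 3, 4) indices mirror the STRICT_OSM_LOG_FORMAT definition in the patch, meaning the format string is argument 3 and the variadic arguments start at argument 4.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNUC__
#define SKETCH_PRINTF_FORMAT(fmt_idx, first_arg) \
	__attribute__((format(printf, fmt_idx, first_arg)))
#else
#define SKETCH_PRINTF_FORMAT(fmt_idx, first_arg)
#endif

/*
 * Hypothetical logger that formats into a caller-supplied buffer.
 * The attribute goes on the declaration, as it does for osm_log().
 */
int sketch_log(char *buf, size_t len, const char *fmt, ...)
	SKETCH_PRINTF_FORMAT(3, 4);

int sketch_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);  /* returns the formatted length */
	va_end(ap);
	return n;
}
```

With the attribute in place and -Wformat enabled (it is part of -Wall), gcc should diagnose a call such as sketch_log(buf, sizeof(buf), "file (%s)\n") that passes no argument for %s, which is the kind of mistake the main.c hunk of this patch fixes.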

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 osm/include/opensm/osm_log.h |8 +++-
 osm/libvendor/osm_vendor_ibumad_sa.c |2 +-
 osm/opensm/main.c|3 ++-
 osm/opensm/osm_pkey_mgr.c|1 +
 osm/opensm/osm_port_info_rcv.c   |5 +++--
 osm/opensm/osm_sa_informinfo.c   |4 ++--
 osm/opensm/osm_sa_link_record.c  |8 
 osm/opensm/osm_sa_mad_ctrl.c |3 ++-
 osm/opensm/osm_sa_response.c |2 +-
 osm/opensm/osm_sm_state_mgr.c|3 ++-
 osm/opensm/osm_sminfo_rcv.c  |9 +
 osm/opensm/osm_state_mgr.c   |8 
 osm/osmtest/osmt_multicast.c |   12 +++-
 osm/osmtest/osmt_service.c   |6 +++---
 osm/osmtest/osmtest.c|8 
 15 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/osm/include/opensm/osm_log.h b/osm/include/opensm/osm_log.h
index 62f3a0c..2b24886 100644
--- a/osm/include/opensm/osm_log.h
+++ b/osm/include/opensm/osm_log.h
@@ -60,6 +60,12 @@
 #include 
 #include 
 
+#ifdef __GNUC__
+#define STRICT_OSM_LOG_FORMAT __attribute__((format(printf, 3, 4)))
+#else
+#define STRICT_OSM_LOG_FORMAT
+#endif
+
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
 #  define END_C_DECLS   }
@@ -374,7 +380,7 @@ void
 osm_log(
IN osm_log_t* const p_log,
IN const osm_log_level_t verbosity,
-   IN const char *p_str, ... );
+   IN const char *p_str, ... ) STRICT_OSM_LOG_FORMAT;
 
 void
 osm_log_raw(
diff --git a/osm/libvendor/osm_vendor_ibumad_sa.c b/osm/libvendor/osm_vendor_ibumad_sa.c
index 7fd0655..7c4a2f7 100644
--- a/osm/libvendor/osm_vendor_ibumad_sa.c
+++ b/osm/libvendor/osm_vendor_ibumad_sa.c
@@ -853,7 +853,7 @@ osmv_query_sa(
 if ( p_mpr_req->sgid_count + p_mpr_req->dgid_count > IB_MULTIPATH_MAX_GIDS )
 {
   osm_log( p_log, OSM_LOG_ERROR,
-   "osmv_query_sa DBG:001 MULTIPATH_REC ",
+   "osmv_query_sa DBG:001 MULTIPATH_REC "
"SGID count %d DGID count %d max count %d\n",
 p_mpr_req->sgid_count, p_mpr_req->dgid_count,
 IB_MULTIPATH_MAX_GIDS );
diff --git a/osm/opensm/main.c b/osm/opensm/main.c
index 729702a..752b546 100644
--- a/osm/opensm/main.c
+++ b/osm/opensm/main.c
@@ -460,7 +460,8 @@ parse_ignore_guids_file(IN char *guids_f
   {
 osm_log( &p_osm->log, OSM_LOG_ERROR,
  "parse_ignore_guids_file: ERR 0601: "
- "Unable to open ignore guids file (%s)\n" );
+ "Unable to open ignore guids file (%s)\n",
+ guids_file_name );
 status = IB_ERROR;
 goto Exit;
   }
diff --git a/osm/opensm/osm_pkey_mgr.c b/osm/opensm/osm_pkey_mgr.c
index f2cb221..735dc14 100644
--- a/osm/opensm/osm_pkey_mgr.c
+++ b/osm/opensm/osm_pkey_mgr.c
@@ -139,6 +139,7 @@ pkey_mgr_process_physical_port(
   "pkey_mgr_process_physical_port: ERR 0503: "
   "Failed to obtain P_Key 0x%04x block and index for node "
   "0x%016" PRIx64 " port %u\n",
+  ib_pkey_get_base( pkey ),
   cl_ntoh64( osm_node_get_node_guid( p_node ) ),
   osm_physp_get_port_num( p_physp ) );
   return;
diff --git a/osm/opensm/osm_port_info_rcv.c b/osm/opensm/osm_port_info_rcv.c
index 95112dc..f6d3595 100644
--- a/osm/opensm/osm_port_info_rcv.c
+++ b/osm/opensm/osm_port_info_rcv.c
@@ -724,8 +724,9 @@ osm_pi_rcv_process(
   {
 osm_log( p_rcv->p_log, OSM_LOG_VERBOSE,
  "osm_pi_rcv_process: "
- "Got light sweep response from remote port of parent node GUID = 0x%" PRIx64
- " port = %u, Commencing heavy sweep\n",
+ "Got light sweep response from remote port of parent node "
+ "GUID = 0x%" PRIx64 " port = 0x%016" PRIx64
+ ", Commencing heavy sweep\n",
  cl_ntoh64( node_guid ),
  cl_ntoh64( port_guid ) );
 osm_state_mgr_process( p_rcv->p_state_mgr,
diff --git a/osm/opensm/osm_sa_informinfo.c b/osm/opensm/osm_sa_informinfo.c
index 69dca1d..da96d35 100644
--- a/osm/opensm/osm_sa_informinfo.c
+++ b/osm/opensm/osm_sa_informinfo.c
@@ -163,8 +163,8 @@ __validate_ports_access_rights(
 {
   osm_log( p_rcv->p_log, OSM_LOG_ERROR,
"__validate_ports_access_rights: ERR 4301: "
-   "Invalid port guid: 0x%016\n",
-   portguid );
+   "Invalid port guid: 0x%016" PRIx64 "\n",
+   cl_hton64(portguid) );
   valid = FALSE;
   goto Exit;
 }
diff --git a/osm/opensm/osm_sa_link_record.c b/osm/opensm/osm_sa_link_record.c
index 751023f..0ca9092 100644
--- a/osm/opensm/osm_sa_link_record.c
+++ b/osm/opensm/osm_sa_link_record.c
@@ -145,10 +145,10 @@ __osm_lr_rcv_build_physp_link(
 osm_log(

Re: [openib-general] OFED 1.1 Build Issue

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Moni Levy <[EMAIL PROTECTED]>:
> AFAIK Module.symvers is used in compile time only so the same logic
> that is used for .h files (the devel package) seems reasonable for it.

I agree. It would be nice however for all devel files to go under prefix/.

-- 
MST




Re: [openib-general] OFED 1.1 Build Issue

2006-11-02 Thread Moni Levy
Vlad,
On 10/31/06, Vladimir Sokolovsky <[EMAIL PROTECTED]> wrote:
>
> Ramachandra K wrote:
> > Moni Shoua wrote:
> >
> >> We already tried to go this way and found that a local Module.symvers
> >> is not always generated (but we might have missed something though).
> >> I suggest that you check that this alternative way works under all
> >> OSs compilation (SuSE and RedHat to be precise)...
> >>
> >>
> > I think Module.symvers generation for external modules was added sometime
> > around 2.6.16, so it's not generated on the older kernels (e.g. 2.6.9
> > kernels on RHEL)
> >
> > In this scenario, when there is no Module.symvers file, I guess the other
> > option is to use a single Kbuild file to build both modules,
> > as explained in section 7.3 of Documentation/kbuild/modules.txt.
> >
> > But this may not be feasible always. Come to think of it, why does the
> > OFED installation procedure not update the kernel Module.symvers file
> > when it replaces the old kernel modules present in /lib/modules/
> > with the new ones ?
> >
> >> BTW, Why not updating the kernel Module.symvers when kernel-ib-devel
> >> is installed? This will free the developer from copying it to
> >> his/hers private directory.
> >>
> >>
> > It might be a good idea to update the Module.symvers file as part of the
> > normal installation and not only kernel-ib-devel. Because if the kernel
> > modules are being replaced (or new modules are being added), shouldn't
> > the Module.symvers file also be updated ?
> > Regards,
> > Ram
> Agree,
> Module.symvers should be updated by kernel-ib RPM.

AFAIK Module.symvers is used in compile time only so the same logic
that is used for .h files (the devel package) seems reasonable for it.

--Moni

> So, need to implement Moni's suggestion with light changes: update
> kernel-ib RPM %post and %preun sections instead of kernel-ib-devel RPM
> %pre and %postun.
>
> Regards,
> Vladimir
>




Re: [openib-general] Mellanox SRP target implementation

2006-11-02 Thread Vu Pham
Tomoaki,

> 
> Can anybody tell me about the mellanox "SRP target" implementation code which 
> is included in MTD2000 with NFS-RDMA server ?
> Is this gen2 base ?
> 

*srp target* is still on gen1 code base - IBGD

*nfs-rdma server* is on gen2 code base




Re: [openib-general] [PATCH repost] IB/srp: destroy/recreate qp/cq at reconnect

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH repost] IB/srp: destroy/recreate qp/cq at reconnect
> 
>  > Roland, what do you think about this patch?
>  > Seems like a good idea, to me.
> 
> Sorry, I haven't made this a high priority.  It seems a little like
> fiddling with the code just for the sake of fiddling -- why pick this
> one place to recreate a CQ?  Why not ipoib, etc?

Mainly because changing the QPN in ipoib will affect hardware address,
so it creates more problems than it solves.
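
To illustrate the point: as I understand the IPoIB encapsulation (RFC 4391), the 20-byte link-layer address is one flags/reserved byte, the 24-bit QPN, and the 16-byte port GID, so a recreated QP with a new QPN yields a new hardware address that neighbours have to relearn. The function and macro names below are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_IPOIB_HWADDR_LEN 20
#define SKETCH_GID_LEN          16

/*
 * Pack a flags byte, a 24-bit QPN (network byte order) and a 16-byte GID
 * into a 20-byte IPoIB-style link-layer address.
 */
void sketch_ipoib_hwaddr(uint8_t out[SKETCH_IPOIB_HWADDR_LEN],
			 uint8_t flags, uint32_t qpn,
			 const uint8_t gid[SKETCH_GID_LEN])
{
	assert(out != NULL && gid != NULL);
	assert(qpn <= 0xFFFFFFu);   /* QPNs are 24 bits wide */

	out[0] = flags;
	out[1] = (uint8_t)(qpn >> 16);
	out[2] = (uint8_t)(qpn >> 8);
	out[3] = (uint8_t)qpn;
	memcpy(&out[4], gid, SKETCH_GID_LEN);
}
```

Calling this twice with the same GID but different QPNs gives two addresses that differ only in bytes 1 to 3, which is exactly the change a destroy/recreate of the QP would expose to the rest of the subnet.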

-- 
MST




Re: [openib-general] Static linking with libibverbs

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] Static linking with libibverbs
> 
>  > static linking actually can be made to work even with older library versions.
>  > See this HowTo (written on 02 of November, 2005).
>  > https://openib.org/tiki/tiki-index.php?page=HowToFAQ
> 
> That's not really static linking.

OK, it's a difference of terms then :)

> If you try to build a true static
> executable, which contains static libc and in particular static libdl,
> there's no way the old code can work, for multiple reasons.  For one
> thing, dlopen(NULL, RTLD_NOW) doesn't work on static executables so
> libibverbs couldn't find a low-level driver that is statically linked
> in.

Does linking in low level driver work now even with -static?

-- 
MST




Re: [openib-general] [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race

2006-11-02 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] for 2-6-19 rdma/addr: use client registration to fix module unload race
> 
> >>I think this is actually a good point for the CM case at least.
> >>Clients already have something registered with the CM (namely the CM
> >>ID itself), so if we required all consumers to destroy their IDs
> >>explicitly, then there's no reason to add additional client
> >>registration.
> > 
> > The issue is more related to cm_id's that are created when a new connection
> > request arrives.  For the user to destroy the new id's, they either need to
> > be able to queue them somewhere for later destruction, call destroy from the
> > callback, or indicate that the id's should be destroyed when the callback
> > returns.
> 
> I should add that the point is taken though.  If we only allow new cm_id's to
> be destroyed this way, then we avoid the issue.
> 
> I _think_ that all users of the ib_cm and rdma_cm behave this way, but I need
> to verify this to be sure.

All active side users are fine I think.  But any client on the passive side
currently might destroy the new ID by returning error from the callback, and I
like this interface since it frees the resources immediately.

Since all such passive side users currently are out of tree, I don't think
it's urgent for us to do anything about the passive side race - but please
at least do not break code that uses the passive side in major ways just yet.

Once there are in-tree passive side users, I think registration at module
load/unload time would be the best approach.
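
A rough sketch of the passive side convention I am referring to, where a non-zero return from the callback makes the CM destroy the newly created ID on the spot. Every name here (sketch_cm_id, sketch_deliver_req and the two handlers) is a hypothetical stand-in for illustration, not the real ib_cm or rdma_cm interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CM identifier; not the real ib_cm type. */
struct sketch_cm_id {
	bool destroyed;
};

typedef int (*sketch_handler_t)(struct sketch_cm_id *id);

static void sketch_destroy_id(struct sketch_cm_id *id)
{
	id->destroyed = true;   /* a real CM would release its resources here */
}

/*
 * Deliver a connection request on a freshly created id.  If the consumer's
 * handler returns non-zero, the id is destroyed before returning, so the
 * consumer never has to queue it for later destruction.
 */
int sketch_deliver_req(struct sketch_cm_id *new_id, sketch_handler_t handler)
{
	int ret;

	assert(new_id != NULL && handler != NULL);

	ret = handler(new_id);
	if (ret)
		sketch_destroy_id(new_id);
	return ret;
}

/* Example consumer handlers: one keeps the id, one rejects it. */
int sketch_accept(struct sketch_cm_id *id) { (void)id; return 0; }
int sketch_reject(struct sketch_cm_id *id) { (void)id; return -1; }
```

The appeal of this shape is that the rejecting consumer holds no reference to the ID after the callback returns, so resources are freed immediately.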


-- 
MST
