[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Michael S. Tsirkin
 Quoting Or Gerlitz [EMAIL PROTECTED]:
 Subject: Re: [PATCH for-2.6.22] IB/cm: improve local id allocation
 
 Michael S. Tsirkin wrote:
 IB/cm uses idr for local id allocations, with a running counter
 as start_id. This fails to generate distinct ids 
 
 static int cm_alloc_id(struct cm_id_private *cm_id_priv)
 {
 unsigned long flags;
 int ret, id;
 static int next_id;
 
 do {
 spin_lock_irqsave(cm.lock, flags);
 ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
 next_id++, id);
 spin_unlock_irqrestore(cm.lock, flags);
 } while( (ret == -EAGAIN)  idr_pre_get(cm.local_id_table, 
 GFP_KERNEL) );
 
 cm_id_priv-id.local_id = (__force __be32) (id ^ 
 cm.random_id_operand);
 
 Doesn't this Xor of the resulted ID with a random value done after the 
 idr allocation causes the cm to --always-- produce distinct ids???
 
 Or.

No - the cm.random_id_operand is initialized at module load time.
That's why we have the static next_id iterator.

-- 
MST
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Roland Dreier
  IB/cm uses idr for local id allocations, with a running counter
  as start_id. This fails to generate distinct ids in the scenario where
  1. An id is constantly created and destroyed
  2. A chunk of ids just beyond the current next_id value is occupied
  
  This in turn leads to an increased chance of connection request being 
  mis-detected
  as a duplicate, sometimes for several retries, until next_id gets past
  the block of allocated ids. This has been observed in practice.
  
  As a fix, remember the last id allocated and start immediately above it.

OK I guess but this needs some explanation about why the impact is so
severe we want to merge it after rc2 is already out.

  +next_id = (unsigned)id + 1;

what happens when this wraps and becomes negative?

in fact the idr stuff all works with plain signed ints -- could
idr_get_new() ever give a negative id?  (too lazy too look at the
source right now)

 - R.
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH for-2.6.22] IB/cm: improve local id allocation
 
   IB/cm uses idr for local id allocations, with a running counter
   as start_id. This fails to generate distinct ids in the scenario where
   1. An id is constantly created and destroyed
   2. A chunk of ids just beyond the current next_id value is occupied
   
   This in turn leads to an increased chance of connection request being 
 mis-detected
   as a duplicate, sometimes for several retries, until next_id gets past
   the block of allocated ids. This has been observed in practice.
   
   As a fix, remember the last id allocated and start immediately above it.
 
 OK I guess but this needs some explanation about why the impact is so
 severe we want to merge it after rc2 is already out.

Well, it's a single-liner, so it seemed safe.
The impact currently is that CM times out, we re-create
a connection, either the applicatin aborts, or this process repeats 
until we get a good id, which can take a couple of minutes.

   +  next_id = (unsigned)id + 1;
 
 what happens when this wraps and becomes negative?
 
 in fact the idr stuff all works with plain signed ints -- could
 idr_get_new() ever give a negative id?  (too lazy too look at the
 source right now)

Good point, I'll check.

-- 
MST
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Michael S. Tsirkin
   +  next_id = (unsigned)id + 1;
 
 what happens when this wraps and becomes negative?
 
 in fact the idr stuff all works with plain signed ints -- could
 idr_get_new() ever give a negative id?  (too lazy too look at the
 source right now)

A quick looks makes it look like idr stuff is *really* not designed to
get a negative input: and note that old code has the wrap-around problem, too.
So, I think the following would be a better fix:

Hmm?

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]


diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index eff591d..5e77b01 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -306,7 +306,9 @@ static int cm_alloc_id(struct cm_id_private *cm_id_priv)
do {
spin_lock_irqsave(cm.lock, flags);
ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
-   next_id++, id);
+   next_id, id);
+   if (!ret)
+   next_id = id == 0x7ff ? 0 : id + 1;
spin_unlock_irqrestore(cm.lock, flags);
} while( (ret == -EAGAIN)  idr_pre_get(cm.local_id_table, 
GFP_KERNEL) );
 

-- 
MST
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH for-2.6.22] IB/cm: improve local id allocation
 
 lib/idr.c says it returns positive IDs always (actually the comments
 say in the range 0 ... 0x7fff).  So I guess we would want
 something like:
 
   if (!ret)
   next_id = id == INT_MAX ? 0 : id + 1;

True, except INT_MAX isn't defined in kernel headers I think,
so I just put 0x7fff there.

 (current code has a similar bug, plus exposes undefined behavior of
 signed overflow).


-- 
MST
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Roland Dreier
  A quick looks makes it look like idr stuff is *really* not designed to
  get a negative input: and note that old code has the wrap-around problem, 
  too.
  So, I think the following would be a better fix:

Yes, that's basically what I just proposed (although see below).  It
all looks pretty safe to me...  Sean, what do you think about this for
2.6.22?

  diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
  index eff591d..5e77b01 100644
  --- a/drivers/infiniband/core/cm.c
  +++ b/drivers/infiniband/core/cm.c
  @@ -306,7 +306,9 @@ static int cm_alloc_id(struct cm_id_private *cm_id_priv)
   do {
   spin_lock_irqsave(cm.lock, flags);
   ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
  -next_id++, id);
  +next_id, id);
  +if (!ret)
  +next_id = id == 0x7ff ? 0 : id + 1;

...except I used MAX_INT here, and indeed your patch only has 6 'f's
in that constant.  Actually digging a little I see that we should use
MAX_ID_MASK to be really correct.
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Roland Dreier
  True, except INT_MAX isn't defined in kernel headers I think,
  so I just put 0x7fff there.

It doesn't really matter (see my other reply) but actually INT_MAX and
others are in linux/kernel.h

 - R.
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH for-2.6.22] IB/cm: improve local id allocation
 
   A quick looks makes it look like idr stuff is *really* not designed to
   get a negative input: and note that old code has the wrap-around problem, 
 too.
   So, I think the following would be a better fix:
 
 Yes, that's basically what I just proposed (although see below).  It
 all looks pretty safe to me...  Sean, what do you think about this for
 2.6.22?
 
   diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
   index eff591d..5e77b01 100644
   --- a/drivers/infiniband/core/cm.c
   +++ b/drivers/infiniband/core/cm.c
   @@ -306,7 +306,9 @@ static int cm_alloc_id(struct cm_id_private 
 *cm_id_priv)
  do {
  spin_lock_irqsave(cm.lock, flags);
  ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
   -  next_id++, id);
   +  next_id, id);
   +  if (!ret)
   +  next_id = id == 0x7ff ? 0 : id + 1;
 
 ...except I used MAX_INT here, and indeed your patch only has 6 'f's
 in that constant.  Actually digging a little I see that we should use
 MAX_ID_MASK to be really correct.

And since it's a *mask*, we can do it this way if you like:

   +  if (!ret)
   +  next_id = ((unsigned)id + 1)  MAX_ID_MASK;

which might generate a bit less code.

-- 
MST
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Sean Hefty

Yes, that's basically what I just proposed (although see below).  It
all looks pretty safe to me...  Sean, what do you think about this for
2.6.22?

  diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
  index eff591d..5e77b01 100644
  --- a/drivers/infiniband/core/cm.c
  +++ b/drivers/infiniband/core/cm.c
  @@ -306,7 +306,9 @@ static int cm_alloc_id(struct cm_id_private *cm_id_priv)
do {
spin_lock_irqsave(cm.lock, flags);
ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
  - next_id++, id);
  + next_id, id);
  + if (!ret)
  + next_id = id == 0x7ff ? 0 : id + 1;

...except I used MAX_INT here, and indeed your patch only has 6 'f's
in that constant.  Actually digging a little I see that we should use
MAX_ID_MASK to be really correct.


Looks good.  Thanks

Acked by: Sean Hefty [EMAIL PROTECTED]
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[ofa-general] Re: [PATCH for-2.6.22] IB/cm: improve local id allocation

2007-05-21 Thread Roland Dreier
  And since it's a *mask*, we can do it this way if you like:
  
 +   if (!ret)
 +   next_id = ((unsigned)id + 1)  MAX_ID_MASK;
  
  which might generate a bit less code.

Good point.  In fact it is 8 bytes smaller for x86-64 at least, so
this is what I just merged:

commit 9f81036c54ed1f860d2807c5a6aa4f2b30c21204
Author: Michael S. Tsirkin [EMAIL PROTECTED]
Date:   Mon May 21 19:06:54 2007 +0300

IB/cm: Improve local id allocation

The IB CM uses an idr for local id allocations, with a running counter
as start_id.  This fails to generate distinct ids if

1. An id is constantly created and destroyed
2. A chunk of ids just beyond the current next_id value is occupied

This in turn leads to an increased chance of connection request being
mis-detected as a duplicate, sometimes for several retries, until
next_id gets past the block of allocated ids. This has been observed
in practice.

As a fix, remember the last id allocated and start immediately above it.
This also fixes a problem with the old code, where next_id might
overflow and become negative.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]
Acked-by: Sean Hefty [EMAIL PROTECTED]
Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index eff591d..e840434 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -306,7 +306,9 @@ static int cm_alloc_id(struct cm_id_private *cm_id_priv)
do {
spin_lock_irqsave(cm.lock, flags);
ret = idr_get_new_above(cm.local_id_table, cm_id_priv,
-   next_id++, id);
+   next_id, id);
+   if (!ret)
+   next_id = ((unsigned) id + 1)  MAX_ID_MASK;
spin_unlock_irqrestore(cm.lock, flags);
} while( (ret == -EAGAIN)  idr_pre_get(cm.local_id_table, 
GFP_KERNEL) );
 
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general