Hi All,
Recently I ran into a performance issue with bmi_ib. The problem is as
follows:
In a client-server setup where the two sides exchange lots of messages
through bmi_ib, multiple client processes started at the same time do not
run concurrently. Instead, they are serialized and run one after another.
This is strange behavior and it hurts performance greatly.
After some investigation of the source code, I found the reason:
New incoming connections are handled in the function
ib_tcp_server_check_new_connections(), which is called inside
ib_block_for_activity(). However, ib_block_for_activity() is only called
from BMI_ib_testcontext() or BMI_ib_testunexpected() when the network is
idle. As a result, while the server is busy serving one client process, the
other processes cannot establish new connections to the server, and thus
cannot transfer data to it concurrently.
I made a pretty simple fix for this problem and it worked for me. The idea is
to check for new connections inside BMI_ib_testunexpected(), so that new
connections are handled in time and client processes are not starved. Here it
is:
diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c
index 0808797..b349938 100644
--- a/src/io/bmi/bmi_ib/ib.c
+++ b/src/io/bmi/bmi_ib/ib.c
@@ -1436,6 +1436,8 @@ restart:
}
}
+ ib_tcp_server_check_new_connections();
+
*outcount = n;
return activity + n;
}
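To illustrate why the extra call helps, here is a small toy model of the polling loop. It is only a sketch and not PVFS2 code: every name in it (toy_server, toy_poll, toy_run, and so on) is made up for illustration, and it assumes that each connection check accepts at most one waiting client and that the server handles one message per connected client per poll.

```c
#include <assert.h>

/* Toy model only; all names here are hypothetical, not part of bmi_ib. */
#define NUM_CLIENTS     3
#define MSGS_PER_CLIENT 4

struct toy_server {
    int pending_connects;        /* clients waiting to be accepted */
    int connected;               /* clients accepted so far */
    int remaining[NUM_CLIENTS];  /* messages left for each accepted client */
};

/* Stand-in for ib_tcp_server_check_new_connections().
 * Assumption: at most one waiting client is accepted per call. */
static void toy_check_new_connections(struct toy_server *s)
{
    if (s->pending_connects == 0)
        return;
    assert(s->connected < NUM_CLIENTS);
    s->remaining[s->connected] = MSGS_PER_CLIENT;
    s->connected++;
    s->pending_connects--;
}

/* One poll: handle one message from every connected client that still has
 * work. If check_every_poll is 0, new connections are checked only when the
 * poll saw no activity (the behavior described above); if it is 1, they are
 * checked on every poll (the behavior with the patch). */
static int toy_poll(struct toy_server *s, int check_every_poll)
{
    int activity = 0;
    for (int i = 0; i < s->connected; i++) {
        if (s->remaining[i] > 0) {
            s->remaining[i]--;
            activity++;
        }
    }
    if (check_every_poll || activity == 0)
        toy_check_new_connections(s);
    return activity;
}

/* Run until every client has sent all its messages. Returns the number of
 * polls needed and stores the largest number of clients served in a single
 * poll in *max_concurrent. */
static int toy_run(int check_every_poll, int *max_concurrent)
{
    struct toy_server s = { NUM_CLIENTS, 0, {0} };
    int polls = 0, done = 0;

    *max_concurrent = 0;
    while (done < NUM_CLIENTS * MSGS_PER_CLIENT) {
        int activity = toy_poll(&s, check_every_poll);
        if (activity > *max_concurrent)
            *max_concurrent = activity;
        done += activity;
        polls++;
    }
    return polls;
}
```

In this model, with 3 clients sending 4 messages each, toy_run(0, &m) needs 15 polls and never serves more than 1 client per poll, while toy_run(1, &m) needs 7 polls and serves up to 3 clients in the same poll. The real code is of course more involved, so take the numbers only as an illustration of the starvation effect.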
Please feel free to share your thoughts and comments. Thank you very much.
Best Regards,
Jingwang.