Sasha Khapyorsky wrote:
On 09:40 Wed 19 Dec , Yevgeny Kliteynik wrote:
Sasha Khapyorsky wrote:
Hi Yevgeny,
On 15:33 Mon 17 Dec , Yevgeny Kliteynik wrote:
If a heavy sweep is requested during idle queue processing, OSM continues
to process the queue to the end and only then notices the heavy sweep
request. In some cases this might leave a topology change unhandled for
several minutes.
Could you provide more details about such cases?
As far as I know the idle queue is used only for multicast re-routing.
If so, it is interesting by itself why it takes minutes and where. Is
there an MCG join/leave storm?
Exactly. The problem was discovered on a big cluster with hundreds of mcast
groups, when there is some massive change in the subnet (like rebooting
hundreds of nodes).
OK, then the proposed patch looks like half a solution to me.
During an mcast join/leave storm the idle queue will be filled with
requests to rebuild mcast routing. OpenSM will process them one by one
(and this will take a lot of time) instead of processing all pending
mcast groups in one run. I think this is the first improvement needed here.
Even with such an improvement we will not be able to control the order of
heavy sweep/mcast join requests, so basically the idea of breaking idle
queue processing looks fine to me, but it is not all that should be
done here. A heavy sweep by itself recalculates mcast routing for all
existing groups, so it should invalidate all pending mcast rerouting
requests instead of continuing idle queue processing after the heavy
sweep. Make sense?
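To make the "process all pending groups in one run" idea concrete, here is a
minimal sketch in C. It is not OpenSM code: pending_mcast_t, pending_add(),
pending_process_all(), pending_purge(), reroute_group() and MAX_PENDING are
hypothetical names made up for illustration. The sketch assumes that repeated
requests for the same group can be coalesced into one entry, and that a heavy
sweep can simply drop whatever is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64		/* hypothetical capacity, illustration only */

/* Hypothetical set of multicast groups (by MLID) awaiting re-routing. */
typedef struct {
	uint16_t mlid[MAX_PENDING];
	size_t count;
} pending_mcast_t;

/* Counts re-routing calls; stands in for real routing work. */
size_t rerouted_groups;

/* Hypothetical stand-in for rebuilding the routing of one group. */
void reroute_group(uint16_t mlid)
{
	(void)mlid;
	rerouted_groups++;
}

/* Mark a group as pending; a group that is already pending is coalesced. */
bool pending_add(pending_mcast_t *q, uint16_t mlid)
{
	assert(q != NULL);
	for (size_t i = 0; i < q->count; i++)
		if (q->mlid[i] == mlid)
			return true;	/* already pending, nothing to add */
	if (q->count == MAX_PENDING)
		return false;		/* no room left in this toy queue */
	q->mlid[q->count++] = mlid;
	return true;
}

/* Drop all pending requests, e.g. because a heavy sweep will redo them. */
void pending_purge(pending_mcast_t *q)
{
	assert(q != NULL);
	q->count = 0;
}

/* Re-route every pending group in one run; returns how many were handled. */
size_t pending_process_all(pending_mcast_t *q)
{
	assert(q != NULL);
	size_t handled = q->count;
	for (size_t i = 0; i < handled; i++)
		reroute_group(q->mlid[i]);
	q->count = 0;
	return handled;
}
```

With this shape, a join/leave storm that touches the same group many times
costs one re-routing pass per group, and a heavy sweep calls pending_purge()
before it recalculates routing for all groups.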
OK, makes sense.
So bottom line, when breaking the idle queue processing because of an
immediate sweep request, the state manager should just purge the whole idle
queue and then start the new heavy sweep.
I'll work on it.
-- Yevgeny
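The agreed behaviour (purge the idle queue, then let the heavy sweep run) can
be sketched as a small state-transition model. This is a simplified
illustration, not the OpenSM state manager: sm_model_t, sm_state_t,
sm_signal_t and idle_item_done() are hypothetical names, and the idle queue is
reduced to a counter of remaining items.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the state manager's states and signals. */
typedef enum { SM_IDLE, SM_PROCESS_REQUEST } sm_state_t;
typedef enum { SIG_NONE, SIG_IDLE_TIME_PROCESS } sm_signal_t;

/* Hypothetical model of the state manager; not the real osm_state_mgr_t. */
typedef struct {
	sm_state_t state;
	size_t idle_items;		/* idle-queue items not yet processed */
	bool force_heavy_sweep;		/* models force_immediate_heavy_sweep */
} sm_model_t;

/*
 * Called after one idle-queue item has been processed.
 * Returns the signal the state machine should handle next.
 */
sm_signal_t idle_item_done(sm_model_t *sm)
{
	assert(sm != NULL);

	if (sm->force_heavy_sweep) {
		/*
		 * The heavy sweep recalculates routing for all groups, so
		 * the remaining idle-queue items are obsolete: purge them.
		 */
		sm->idle_items = 0;
		sm->state = SM_IDLE;
		return SIG_NONE;
	}

	if (sm->idle_items == 0) {
		/* Queue drained; nothing more to do. */
		sm->state = SM_IDLE;
		return SIG_NONE;
	}

	/* Keep going with the next idle-queue item. */
	sm->state = SM_PROCESS_REQUEST;
	return SIG_IDLE_TIME_PROCESS;
}
```

Returning to the idle state with no pending signal is what lets the sweep
request be noticed right away instead of after the whole queue is drained.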
Sasha
-- Yevgeny
Or does a single re-routing cycle take minutes?
Sasha
Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>
---
opensm/opensm/osm_state_mgr.c | 31 ++++++++++++++++++++++++-------
1 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c39f11..6ee5ee6 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
/* CALL the done function */
__process_idle_time_queue_done(p_mgr);
- /*
-  * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
-  * so that the next element in the queue gets processed
-  */
-
- signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
- p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+ if (p_mgr->p_subn->force_immediate_heavy_sweep) {
+	/*
+	 * Do not read next item from the idle queue.
+	 * Immediate heavy sweep is requested, so it's
+	 * more important.
+	 * Besides, there is a chance that after the
+	 * heavy sweep completion, idle queue processing
+	 * that SM would have performed here will be obsolete.
+	 */
+	if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
+		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
+			"osm_state_mgr_process: "
+			"interrupting idle time queue processing - heavy sweep requested\n");
+	signal = OSM_SIGNAL_NONE;
+	p_mgr->state = OSM_SM_STATE_IDLE;
+ }
+ else {
+	/*
+	 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
+	 * so that the next element in the queue gets processed
+	 */
+	signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
+	p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+ }
break;
default:
--
1.5.1.4