Hello all,
The 'simple' IOC sweep-on-demand solution tested out as valid against a Linux RHEL 5.3, OFED-1.5.1 SRP target (vdisks).
Specifically, an IOC sweep is performed only when
1) requested (QUERY_DEVICE_RELATIONS for device 'IB Bus') or
2) a PORT_ACTIVE pnp event occurs.
The registry key 'IocPollInterval' value definitions have been expanded:
0 == no IOC sweeping/rescan.
1 == IOC sweep on demand (QUERY_DEVICE_RELATIONS for device 'IB Bus', e.g. 'devcon rescan') or when a PORT_ACTIVE pnp event occurs.
> 1 == IOC sweep every 'IocPollInterval' milliseconds (current behavior).
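For illustration only, the three value ranges can be modeled in a few lines of standalone C. The enum and function names below are hypothetical and do not appear in the patch, which compares g_ioc_poll_interval against 0 and 1 directly; the sketch covers only the two decisions visible in the diff (re-arming the timer after a sweep, and honoring a QUERY_DEVICE_RELATIONS rescan request).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of the patch. */
typedef enum {
    IOC_POLL_OFF,        /* 0: never sweep */
    IOC_POLL_ON_DEMAND,  /* 1: sweep only when requested */
    IOC_POLL_PERIODIC    /* > 1: sweep every N milliseconds */
} ioc_poll_mode_t;

/* Map a raw IocPollInterval registry value onto one of the three modes. */
ioc_poll_mode_t ioc_poll_mode(uint32_t ioc_poll_interval)
{
    if (ioc_poll_interval == 0)
        return IOC_POLL_OFF;
    if (ioc_poll_interval == 1)
        return IOC_POLL_ON_DEMAND;
    return IOC_POLL_PERIODIC;
}

/* After a sweep completes, only periodic mode re-arms the sweep timer. */
bool rearm_after_sweep(uint32_t ioc_poll_interval)
{
    return ioc_poll_mode(ioc_poll_interval) == IOC_POLL_PERIODIC;
}

/* A rescan request from the QUERY_DEVICE_RELATIONS path is honored
 * only in on-demand mode. */
bool honor_rescan_request(uint32_t ioc_poll_interval)
{
    return ioc_poll_mode(ioc_poll_interval) == IOC_POLL_ON_DEMAND;
}
```

With the previous default of 30000, rearm_after_sweep() is true and the sweep repeats every 30 seconds; with the proposed default of 1 it is false, so a sweep runs only when something asks for one.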
Testing consisted of loading the SRP & IOU drivers on each Windows system before
the SRP target came online; as expected, systems without the SRP drivers loaded
did not see the SRP targets.
Two Server 2008 R2 systems were used: one with the current IOC sweep every 30
seconds, the other with IocPollInterval == 1 (sweep on demand).
OpenSM 3.3.9 was running on a separate Svr 2008 R2 (x86) system.
On each Windows system the 'Computer Management-->Storage Manager-->Disk
Manager' view was opened.
Once the Linux SRP targets were started (vdisks exported: /dev/sdb1, /dev/sdb2
& /dev/sdb3), the 30 sec sweeping system (SS) reported all 3 of the expected
SRP targets within a few seconds; the IOC-sweep-on-demand (SOD) system did not
register the 3 SRP targets until a device rescan was forced via
'devcon.exe rescan'.
SRP target functionality was verified using the SOD system.
The Linux SRP targets were taken offline at the Linux box.
The sweeping Windows system reported the SRP targets had been removed within a
few seconds, while the SOD system continued to display the SRP targets.
Once the SOD system had forced an IOC rescan (devcon.exe rescan), the SRP
targets were no longer displayed.
Initial IOC sweep on demand experiments demonstrate the feasibility of the code
changes.
More testing is needed with more, and different, fabric IOCs, which I do not
have access to; I will continue the SRP target experiments.
At this juncture, I would recommend the code changes be committed as the
original IOC periodic sweep functionality is still available.
Furthermore the default IocPollInterval should be set == 1 (sweep on demand).
--- core/al/kernel/al_ioc_pnp.c Tue Jun 07 13:41:54 2011
+++ core/al/kernel/al_ioc_pnp.c Tue Jun 07 13:35:54 2011
@@ -294,7 +294,6 @@
cl_async_proc_item_t async_item;
sweep_state_t state;
ioc_pnp_svc_t *p_svc;
- atomic32_t query_cnt;
cl_fmap_t iou_map;
} ioc_sweep_results_t;
@@ -313,8 +312,12 @@
ioc_pnp_mgr_t *gp_ioc_pnp = NULL;
uint32_t g_ioc_query_timeout = 250;
uint32_t g_ioc_query_retries = 4;
-uint32_t g_ioc_poll_interval = 30000;
-
+uint32_t g_ioc_poll_interval = 1;
+ /* 0 == no IOC rescan
+ * 1 == IOC rescan on demand (IB_PNP_SM_CHANGE, IB_PNP_PORT_ACTIVE,
+ *      QUERY_DEVICE_RELATIONS for device 'IB Bus')
+ * > 1 == rescan interval in millisecond units.
+ */
/******************************************************************************
@@ -1204,6 +1207,24 @@
}
+void
+ioc_pnp_request_ioc_rescan(void)
+{
+ cl_status_t status;
+
+ AL_ENTER( AL_DBG_PNP );
+
+ CL_ASSERT( gp_ioc_pnp );
+ if ( g_ioc_poll_interval == 1 && !gp_ioc_pnp->query_cnt )
+ {
+ AL_PRINT( TRACE_LEVEL_ERROR, AL_DBG_ERROR, ("Requesting IOC rescan\n") );
+ status = cl_timer_start( &gp_ioc_pnp->sweep_timer, 50 );
+ CL_ASSERT( status == CL_SUCCESS );
+ }
+ AL_EXIT( AL_DBG_PNP );
+}
+
+
/*
* PnP callback for port event notifications.
*/
@@ -1213,12 +1234,14 @@
{
ib_api_status_t status = IB_SUCCESS;
cl_status_t cl_status;
+#if DBG
+ const char *evt = ib_get_pnp_event_str( p_pnp_rec->pnp_event );
+#endif
AL_ENTER( AL_DBG_PNP );
AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_PNP,
- ("p_pnp_rec->pnp_event = 0x%x (%s)\n",
- p_pnp_rec->pnp_event, ib_get_pnp_event_str( p_pnp_rec->pnp_event )) );
+ ("p_pnp_rec->pnp_event = 0x%x (%s)\n", p_pnp_rec->pnp_event, evt) );
switch( p_pnp_rec->pnp_event )
{
@@ -1257,8 +1280,19 @@
((ioc_pnp_svc_t*)p_pnp_rec->context)->obj.pfn_destroy(
&((ioc_pnp_svc_t*)p_pnp_rec->context)->obj, NULL );
p_pnp_rec->context = NULL;
+ break;
+
+ case IB_PNP_IOU_ADD:
+ case IB_PNP_IOU_REMOVE:
+ case IB_PNP_IOC_ADD:
+ case IB_PNP_IOC_REMOVE:
+ case IB_PNP_IOC_PATH_ADD:
+ case IB_PNP_IOC_PATH_REMOVE:
+ AL_PRINT( TRACE_LEVEL_ERROR, AL_DBG_PNP, ("!Handled PNP Event %s\n",evt));
+ break;
default:
+ AL_PRINT( TRACE_LEVEL_ERROR, AL_DBG_ERROR, ("Ignored PNP Event %s\n",evt));
break; /* Ignore other PNP events. */
}
@@ -2630,11 +2664,14 @@
__remove_ious( &old_ious );
CL_ASSERT( !cl_fmap_count( &old_ious ) );
- /* Reset the sweep timer. */
- if( g_ioc_poll_interval )
+ /* Reset the sweep timer.
+ * 0 == No IOC polling.
+ * 1 == IOC poll on demand.
+ * > 1 == IOC poll every g_ioc_poll_interval milliseconds.
+ */
+ if( g_ioc_poll_interval > 1)
{
- status = cl_timer_start(
- &gp_ioc_pnp->sweep_timer, g_ioc_poll_interval );
+ status = cl_timer_start( &gp_ioc_pnp->sweep_timer, g_ioc_poll_interval );
CL_ASSERT( status == CL_SUCCESS );
}
@@ -3045,8 +3082,7 @@
else
{
/* Report the IOU to all clients registered for IOU events. */
- cl_qlist_find_from_head( &gp_ioc_pnp->iou_reg_list,
- __notify_users, &event );
+ cl_qlist_find_from_head( &gp_ioc_pnp->iou_reg_list, __notify_users, &event );
/* Report IOCs - this will in turn report the paths. */
__add_iocs( p_iou, &p_iou->ioc_map, NULL );
--- core/bus/kernel/bus_port_mgr.c Tue Jun 07 13:44:20 2011
+++ core/bus/kernel/bus_port_mgr.c Tue Jun 07 13:39:10 2011
@@ -75,6 +75,7 @@
extern pkey_array_t g_pkeys;
+static int pnp_port_active;
/*
* Function prototypes.
@@ -103,6 +104,9 @@
port_mgr_port_remove(
IN ib_pnp_port_rec_t* p_pnp_rec );
+void
+ioc_pnp_request_ioc_rescan(void);
+
static NTSTATUS
port_start(
IN DEVICE_OBJECT* const p_dev_obj,
@@ -501,11 +505,17 @@
break;
case IB_PNP_PORT_REMOVE:
+ if (pnp_port_active > 0)
+ pnp_port_active--;
port_mgr_port_remove( (ib_pnp_port_rec_t*)p_pnp_rec );
break;
+ case IB_PNP_PORT_ACTIVE:
+ pnp_port_active++;
+ break;
+
default:
- XBUS_PRINT( BUS_DBG_PNP, ("Unhandled PNP Event %s\n",
+ BUS_PRINT( BUS_DBG_PNP, ("Ignored PNP Event %s\n",
ib_get_pnp_event_str(p_pnp_rec->pnp_event) ));
break;
}
@@ -567,6 +577,15 @@
p_bfi->whoami, ca_guid, p_port_mgr) );
if (!p_port_mgr)
return STATUS_NO_SUCH_DEVICE;
+
+ if ( g_ioc_poll_interval == 1 && pnp_port_active
+ && p_bfi->p_bus_ext->cl_ext.vfptr_pnp_po->identity
+ && strcmp(p_bfi->p_bus_ext->cl_ext.vfptr_pnp_po->identity, "IB Bus") == 0 )
+ {
+ BUS_PRINT(BUS_DBG_PNP, ("***** device '%s' requesting IOC rescan\n",
+ p_bfi->p_bus_ext->cl_ext.vfptr_pnp_po->identity) );
+ ioc_pnp_request_ioc_rescan();
+ }
cl_mutex_acquire( &p_port_mgr->pdo_mutex );
status = bus_get_relations( &p_port_mgr->port_list, ca_guid, p_irp );
The pnp_port_active counter is there to skip an on-demand IOC sweep request
before any IB port has gone active, because in the current implementation the
IOC sweep timer is not started until after the first IB port goes active.
Perhaps you could suggest a better solution?
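To make the gating easier to reason about, here is a user-space model of it; a minimal sketch, not driver code. The struct and function names are hypothetical. The fields stand in for g_ioc_poll_interval, pnp_port_active and gp_ioc_pnp->query_cnt, and a counter takes the place of the real cl_timer_start() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state touched by the patch. */
typedef struct {
    uint32_t poll_interval;  /* models g_ioc_poll_interval */
    int      ports_active;   /* models pnp_port_active */
    int      query_cnt;      /* models gp_ioc_pnp->query_cnt */
    int      timer_starts;   /* counts simulated cl_timer_start() calls */
} rescan_model_t;

/* IB_PNP_PORT_ACTIVE: the patch increments unconditionally. */
void on_port_active(rescan_model_t *m)
{
    m->ports_active++;
}

/* IB_PNP_PORT_REMOVE: the patch decrements, but never below zero. */
void on_port_remove(rescan_model_t *m)
{
    if (m->ports_active > 0)
        m->ports_active--;
}

/* QUERY_DEVICE_RELATIONS for 'IB Bus'. Returns true if a sweep was
 * scheduled, false if the request was skipped. */
bool on_query_relations(rescan_model_t *m)
{
    /* Bus-driver gate: on-demand mode and at least one active port. */
    if (m->poll_interval != 1 || m->ports_active == 0)
        return false;
    /* Gate inside ioc_pnp_request_ioc_rescan(): no queries in flight. */
    if (m->query_cnt != 0)
        return false;
    m->timer_starts++;
    return true;
}
```

One property of this model, which mirrors the diff as posted: the counter is incremented on every PORT_ACTIVE but decremented only on PORT_REMOVE, so a port that goes active more than once without being removed is counted each time. That is harmless for a "non-zero means at least one port has been active" test, but the value is not an exact count of active ports.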
--- hw/mlx4/kernel/hca/mlx4_hca.inx Tue Jun 07 14:29:06 2011
+++ hw/mlx4/kernel/hca/mlx4_hca.inx Tue Jun 07 13:15:48 2011
@@ -296,7 +296,11 @@
HKR,"Parameters","SmiPollInterval",%REG_DWORD_NO_CLOBBER%,20000
HKR,"Parameters","IocQueryTimeout",%REG_DWORD_NO_CLOBBER%,250
HKR,"Parameters","IocQueryRetries",%REG_DWORD_NO_CLOBBER%,4
-HKR,"Parameters","IocPollInterval",%REG_DWORD_NO_CLOBBER%,30000
+
+; IocPollInterval: 0 == no ioc poll, 1 == poll on demand (device rescan)
+; (> 1) poll every x milliseconds, 30000 (30 secs) previous default.
+HKR,"Parameters","IocPollInterval",%REG_DWORD_NO_CLOBBER%,1
+
HKR,"Parameters","DebugFlags",%REG_DWORD%,0x80000000
HKR,"Parameters","ReportPortNIC",%REG_DWORD%,1
--- hw/mthca/kernel/mthca.inx Tue Jun 07 14:31:42 2011
+++ hw/mthca/kernel/mthca.inx Tue Jun 07 13:15:20 2011
@@ -297,7 +297,11 @@
HKR,"Parameters","SmiPollInterval",%REG_DWORD_NO_CLOBBER%,20000
HKR,"Parameters","IocQueryTimeout",%REG_DWORD_NO_CLOBBER%,250
HKR,"Parameters","IocQueryRetries",%REG_DWORD_NO_CLOBBER%,4
-HKR,"Parameters","IocPollInterval",%REG_DWORD_NO_CLOBBER%,30000
+
+; IocPollInterval: 0 == no ioc poll, 1 == poll on demand (device rescan)
+; (> 1) poll every x milliseconds, 30000 (30 secs) previous default.
+HKR,"Parameters","IocPollInterval",%REG_DWORD_NO_CLOBBER%,1
+
HKR,"Parameters","DebugFlags",%REG_DWORD%,0x80000000
HKR,"Parameters","ReportPortNIC",%REG_DWORD%,1
Should 'IocPollInterval' %REG_DWORD_NO_CLOBBER% be changed to %REG_DWORD% to
prevent possible install confusion?
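The confusion I have in mind can be shown with a small model of AddReg behavior. This is a sketch under one assumption: that %REG_DWORD_NO_CLOBBER% in these .inx files expands to FLG_ADDREG_TYPE_DWORD combined with FLG_ADDREG_NOCLOBBER. The flag values are the ones defined in setupapi.h; reg_value_t and apply_addreg() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the Windows SDK header setupapi.h. */
#define FLG_ADDREG_NOCLOBBER   0x00000002u
#define FLG_ADDREG_TYPE_DWORD  0x00010001u

/* Hypothetical model of a single registry DWORD value. */
typedef struct {
    bool     present;  /* does the value already exist? */
    uint32_t value;
} reg_value_t;

/* Model of one AddReg entry being applied during driver install. */
void apply_addreg(reg_value_t *v, uint32_t flags, uint32_t new_value)
{
    /* With NOCLOBBER set, an existing value is left untouched. */
    if ((flags & FLG_ADDREG_NOCLOBBER) && v->present)
        return;
    v->present = true;
    v->value = new_value;
}
```

In this model an upgrade over an existing install that already holds IocPollInterval = 30000 keeps 30000 when the no-clobber flag is set, while a fresh install gets 1. Dropping the flag makes both end up at 1, at the cost of overwriting a value an administrator may have tuned by hand.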
Thanks,
Stan.
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of Smith, Stan
>Sent: Wednesday, June 01, 2011 5:36 PM
>To: Fab Tillier; Hefty, Sean; Tzachi Dar; [email protected]
>Subject: Re: [ofw] Doing queries on subnet every 30 seconds
>
>Tzachi,
> Upon further code review, there seems to be a rather simple solution which
> covers most concerns.
>
>The bus driver (ioc_manager) is coded such that when the IOC rescan routine is
>finished it restarts the IOC rescan timer if IocPollInterval > 0
>using IocPollInterval as the timer expiration value.
>Your solution (IocPollInterval = 0) prohibits starting the IOC rescan timer
>for all events, thus a new IOC/U will not be recognized; OK for
>most installations.
>
>To prohibit IOC scanning every 30 seconds and yet recognize a new
>IOC/IOU...... upon completion of an IOC rescan operation, the IOC
>rescan timer is not restarted?
>Currently IB_PNP_SM_CHANGE and IB_PNP_PORT_ACTIVE cause the IOC rescan timer
>to start and expire after 250 ms; no code change.
>Upon recognition of QUERY_DEVICE_RELATIONS for device 'IB Bus' the IOC rescan
>timer is started; this would cover the 'devcon.exe
>rescan' case.
>
>BTW, the IOC rescan timer callback function is coded such that only a single
>instance of the IOC rescan function will run.
>
>To summarize:
>Do not automatically restart the IOC rescan timer (IocPollInterval) after
>completing an IOC rescan.
>Restart the IOC rescan timer upon recognition of QUERY_DEVICE_RELATIONS for
>device 'IB Bus'.
>
>Simple, minor code changes?
>
>What have I missed?
>
>Thanks,
>
>Stan.
>_______________________________________________
>ofw mailing list
>[email protected]
>http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
Attachments: al_ioc_pnp.c.patch, bus_port_mgr.c.patch
