Erez Zilber wrote: > Maintain a list of nop-out PDUs that almost timed out. > With this information, you can understand and debug the > whole system: you can check your target and see what caused > it to be so slow on that specific time, you can see if your > network was very busy during that time etc. > > Signed-off-by: Erez Zilber <erezzi.l...@gmail.com> >
Sorry for the late reply. Thanks for doing this work! @@ -88,11 +89,12 @@ static struct option const long_options[] = {"killiscsid", required_argument, NULL, 'k'}, {"debug", required_argument, NULL, 'd'}, {"show", no_argument, NULL, 'S'}, + {"noopinfo", required_argument, NULL, 'N'}, {"version", no_argument, NULL, 'V'}, {"help", no_argument, NULL, 'h'}, {NULL, 0, NULL, 0}, }; Do you think you want something more generic like a get-debug-info type of feature? Maybe in the future we could also show how many times a scsi command has timed out or almost timed out (user can adjust scsi timer then) or how many times the scsi eh has fired and how many times a abort, lu or target reset has timed out? So maybe it would be get like get stats where then the noop info is the first stat and in the future we will have more? @@ -994,8 +997,25 @@ static int iscsi_nop_out_rsp(struct iscsi_task *task, if (iscsi_recv_pdu(conn->cls_conn, (struct iscsi_hdr *)nop, data, datalen)) rc = ISCSI_ERR_CONN_FAILED; - } else + } else { + ping_delay = jiffies - conn->last_ping; + + /* Check if the ping has almost timed out */ + if (ping_delay >= + (session->noop_threshold * (conn->ping_timeout * HZ)) / + 100) { + mutex_lock(&conn->noop_info_mutex); We annot use a mutex here, because we run from a bottom half (softirq for iscsi and a tasklet for iser), and we cannot sleep in a bh since there is no process context. + idx = conn->noop_info_arr_idx; + conn->noop_info_arr[idx].tv = conn->tmp_noop_tv; + conn->noop_info_arr[idx].elapsed_time_msec = ping_delay; I think ping_delay is just in jiffies, so you have to do some sort of jiffies_to_msec call (I cannot rememeber the name of the helper but there is one). + conn->noop_info_arr[idx].init = 1; + conn->noop_info_arr_idx = + (conn->noop_info_arr_idx + 1) % NOOP_INFO_ARR_LEN; I am not getting the reason for the division? + mutex_unlock(&conn->noop_info_mutex); + } + mod_timer(&conn->transport_timer, jiffies + conn->recv_timeout); + } iscsi_complete_task(task, ISCSI_TASK_COMPLETED); return rc; } @@ -2026,6 +2046,12 @@ static void iscsi_check_transport_timeouts(unsigned long data) if (time_before_eq(last_recv + recv_timeout, jiffies)) { /* send a ping to try to provoke some traffic */ ISCSI_DBG_CONN(conn, "Sending nopout as ping\n"); + + /* First, save the time of the ping for later use */ + mutex_lock(&conn->noop_info_mutex); Cannot use a mutex here either, because this is run from a timer. Timers are run from bhs too. + do_gettimeofday(&conn->tmp_noop_tv); + mutex_unlock(&conn->noop_info_mutex); + -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To post to this group, send email to open-is...@googlegroups.com. To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.