On Tue, Dec 15, 2009 at 10:34 AM, Mike Christie <[email protected]> wrote:
> Erez Zilber wrote:
>> Maintain a list of nop-out PDUs that almost timed out.
>> With this information, you can understand and debug the
>> whole system: you can check your target and see what caused
>> it to be so slow on that specific time, you can see if your
>> network was very busy during that time etc.
>>
>> Signed-off-by: Erez Zilber <[email protected]>
>>
>
> Sorry for the late reply. Thanks for doing this work!
>
>
> @@ -88,11 +89,12 @@ static struct option const long_options[] =
> {"killiscsid", required_argument, NULL, 'k'},
> {"debug", required_argument, NULL, 'd'},
> {"show", no_argument, NULL, 'S'},
> + {"noopinfo", required_argument, NULL, 'N'},
> {"version", no_argument, NULL, 'V'},
> {"help", no_argument, NULL, 'h'},
> {NULL, 0, NULL, 0},
> };
>
>
> Do you think you want something more generic like a get-debug-info type
> of feature? Maybe in the future we could also show how many times a scsi
> command has timed out or almost timed out (user can adjust scsi timer
> then) or how many times the scsi eh has fired and how many times a
> abort, lu or target reset has timed out?
>
> So maybe it would be get like get stats where then the noop info is the
> first stat and in the future we will have more?
Sounds good. Let's change it to {"debuginfo", required_argument, NULL, 'i'}. OK?
>
>
>
>
> @@ -994,8 +997,25 @@ static int iscsi_nop_out_rsp(struct iscsi_task *task,
> if (iscsi_recv_pdu(conn->cls_conn, (struct iscsi_hdr *)nop,
> data, datalen))
> rc = ISCSI_ERR_CONN_FAILED;
> - } else
> + } else {
> + ping_delay = jiffies - conn->last_ping;
> +
> + /* Check if the ping has almost timed out */
> + if (ping_delay >=
> + (session->noop_threshold * (conn->ping_timeout * HZ)) /
> + 100) {
> + mutex_lock(&conn->noop_info_mutex);
>
>
>
> We annot use a mutex here, because we run from a bottom half (softirq
> for iscsi and a tasklet for iser), and we cannot sleep in a bh since
> there is no process context.
You're right. I'll replace the mutex with a spinlock.
>
>
> + idx = conn->noop_info_arr_idx;
> + conn->noop_info_arr[idx].tv = conn->tmp_noop_tv;
> + conn->noop_info_arr[idx].elapsed_time_msec =
> ping_delay;
>
> I think ping_delay is just in jiffies, so you have to do some sort of
> jiffies_to_msec call (I cannot rememeber the name of the helper but
> there is one).
I think it should be: conn->noop_info_arr[idx].elapsed_time_msec =
ping_delay * HZ / 1000
>
>
> + conn->noop_info_arr[idx].init = 1;
> + conn->noop_info_arr_idx =
> + (conn->noop_info_arr_idx + 1) %
> NOOP_INFO_ARR_LEN;
>
>
> I am not getting the reason for the division?
Are you talking about "conn->noop_info_arr_idx =
(conn->noop_info_arr_idx + 1) % NOOP_INFO_ARR_LEN"? It's a cyclic
array.
>
>
> + mutex_unlock(&conn->noop_info_mutex);
> + }
> +
> mod_timer(&conn->transport_timer, jiffies +
> conn->recv_timeout);
> + }
> iscsi_complete_task(task, ISCSI_TASK_COMPLETED);
> return rc;
> }
>
>
>
> @@ -2026,6 +2046,12 @@ static void
> iscsi_check_transport_timeouts(unsigned long data)
> if (time_before_eq(last_recv + recv_timeout, jiffies)) {
> /* send a ping to try to provoke some traffic */
> ISCSI_DBG_CONN(conn, "Sending nopout as ping\n");
> +
> + /* First, save the time of the ping for later use */
> + mutex_lock(&conn->noop_info_mutex);
>
> Cannot use a mutex here either, because this is run from a timer. Timers
> are run from bhs too.
mutex -> spinlock.
Let me know if you're OK with my suggestions and I'll resend the patch.
Erez
--
You received this message because you are subscribed to the Google Groups
"open-iscsi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/open-iscsi?hl=en.