On Wed, 9 Jul 2008 17:23:19 +0900
FUJITA Tomonori <[EMAIL PROTECTED]> wrote:

> On Wed, 09 Jul 2008 10:16:41 +0200
> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> 
> > FUJITA Tomonori wrote:
> > > On Wed, 09 Jul 2008 08:36:32 +0200
> > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > > 
> > >> FUJITA Tomonori wrote:
> > >>> On Wed, 09 Jul 2008 08:03:05 +0200
> > >>> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> FUJITA Tomonori wrote:
> > >>>>> On Mon, 30 Jun 2008 10:54:48 +0200
> > >>>>> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > >>>>>
> > >>>>>> Tomasz Chmielewski wrote:
> > >>>>>>> ronnie sahlberg wrote:
> > >>>>>>>> Hi Tomasz,
> > >>>>>>>>
> > >>>>>>>> I could not get that configuration to work.
> > >>>>>>>>
> > >>>>>>>> Can you please provide more detailed instructions on exactly
> > >>>>>>>> how to set up hosts A, B and C so I can try to reproduce it?
> > >>>>>>>>
> > >>>>>>>> Please provide the exact command line for each and every
> > >>>>>>>> command I need to run on the three hosts and I'll try to
> > >>>>>>>> reproduce it under gdb.
> > >>>>>>> A faulty RAID is just one way to crash tgtd.
> > >>>>>>>
> > >>>>>>> A simpler one is to just block the traffic between the target
> > >>>>>>> and the initiator: log in to the target, make sure there is
> > >>>>>>> some iSCSI traffic between the target and the initiator, then
> > >>>>>>> block incoming iSCSI traffic on the initiator with:
> > >>>>>>>
> > >>>>>>> initiator# iptables -I INPUT -s <target IP> -p tcp --sport 3260 -j DROP
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> After a while, you will see that only one tgtd process is running, 
> > >>>>>>> whereas the second has crashed.
> > >>>>>> Note - the above seems to be valid if:
> > >>>>>>
> > >>>>>> - there are two initiators connected (from different IPs), perhaps more
> > >>>>>> - there is traffic from these two initiators
> > >>>>>> - we block traffic on one of these initiators
> > >>>>>>
> > >>>>>>
> > >>>>>> I couldn't reproduce the issue with only one initiator connected.
> > >>>>> Can you provide the detailed configuration?
> > >>>>>
> > >>>>> Do you mean:
> > >>>>>
> > >>>>> 1. there are three machines, say A, B, and C.
> > >>>> yes
> > >>>>
> > >>>>> 2. you run tgtd on A and setup one target in tgtd.
> > >>>> yes
> > >>>>
> > >>>>> 3. B and C work as an initiator. They connect to A. So the target on A
> > >>>>> has two sessions.
> > >>>> yes
> > >>>>
> > >>>>> Then you block the traffic between A and B, then tgtd on A dies?
> > >>>>>
> > >>>>> Right?
> > >>>> Yes, exactly like that.
> > >>>> I'm not sure whether blocking traffic in both directions is
> > >>>> needed, or whether it is sufficient/needed to block the traffic
> > >>>> from the initiator to the target (and not from target to the
> > >>>> initiator, i.e., -I OUTPUT chain).
> > >>> You block the traffic on the initiator and then on the target?
> > >> No, only on the initiator.
> > >>
> > >>
> > >>>>> I think that the output of tgtadm will enable us to understand your
> > >>>>> configuration easily.
> > >>>> What output?
> > >>> As I said, the output of tgtadm shows what tgtd has:
> > >>>
> > >>> Target 1: iqn.2001-04.org.osrg:viola
> > >>>     System information:
> > >>>         Driver: iscsi
> > >>>         State: ready
> > >> Aah, this output.
> > >>
> > >> Nothing special there - two targets configured, each target has one 
> > >> initiator coming from a different IP.
> > > 
> > > Two targets? Hmm, I thought that you have one target machine and
> > > configure one target object.
> > > 
> > > Please tell me about your target objects (configured in tgtd) and
> > > physical target machines.
> > 
> > One target machine with two (or more) targets configured, like below; 
> > here is the output - right now, only one initiator is connected; I can 
> > reproduce the issue when a second initiator connects, but I can't do it 
> > right now.
> 
> In your configuration, a second initiator connects to target 2 or
> 3. Target 1 doesn't have two initiators, right? If so, it's a bit
> different from Ronnie's configuration.

OK, I think that you guys hit the same bug. I can reproduce it with
both configurations.

I think that the problem is that conn_close() calls
iscsi_free_cmd_task() on every task in conn->tx_clist. But
conn->tx_clist can also hold non-SCSI-command tasks (like NOOP), and
we can't call cmd_hlist_remove() for such tasks.
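
As a rough illustration of that distinction, here is a minimal standalone
sketch. It is not tgt code: the sketch_* names, struct sketch_task and its
in_cmd_hash flag are hypothetical stand-ins, and only the opcode values
(taken from the iSCSI protocol definition) are real. It shows why the release
path has to be chosen by opcode, since only SCSI command tasks are assumed to
be registered in the command hash list.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values as defined by the iSCSI protocol (RFC 3720). */
#define ISCSI_OPCODE_MASK    0x3f
#define ISCSI_OP_NOOP_OUT    0x00
#define ISCSI_OP_SCSI_CMD    0x01
#define ISCSI_OP_SCSI_TMFUNC 0x02
#define ISCSI_OP_LOGOUT      0x06

/* Hypothetical stand-in for a task; not the tgt structure. */
struct sketch_task {
	uint8_t raw_opcode; /* first header byte, may carry the immediate bit 0x40 */
	int in_cmd_hash;    /* 1 only for tasks registered as SCSI commands */
	int freed;          /* set once the task has been released */
};

enum release_path { RELEASE_CMD, RELEASE_PLAIN, RELEASE_SKIPPED };

/* Plain release: valid for any task. */
static void sketch_free_task(struct sketch_task *t)
{
	t->freed = 1;
}

/* Command release: unhashes first, so it is only valid for SCSI commands. */
static void sketch_free_cmd_task(struct sketch_task *t)
{
	assert(t->in_cmd_hash); /* would be violated by a NOOP or logout task */
	t->in_cmd_hash = 0;
	sketch_free_task(t);
}

/* Pick the release path from the masked opcode. */
static enum release_path sketch_release_tx_task(struct sketch_task *t)
{
	switch (t->raw_opcode & ISCSI_OPCODE_MASK) {
	case ISCSI_OP_SCSI_CMD:
		sketch_free_cmd_task(t);
		return RELEASE_CMD;
	case ISCSI_OP_NOOP_OUT:
	case ISCSI_OP_LOGOUT:
	case ISCSI_OP_SCSI_TMFUNC:
		sketch_free_task(t);
		return RELEASE_PLAIN;
	default:
		/* Unknown opcode: leave the task alone in this sketch. */
		return RELEASE_SKIPPED;
	}
}
```

Passing a NOOP task straight to sketch_free_cmd_task() trips the assertion,
which is the sketch's analogue of unhashing a task that was never hashed.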

Here's a fix. Can you try this?

diff --git a/usr/iscsi/conn.c b/usr/iscsi/conn.c
index 25ad170..2e83e7a 100644
--- a/usr/iscsi/conn.c
+++ b/usr/iscsi/conn.c
@@ -85,7 +85,7 @@ void conn_close(struct iscsi_connection *conn)
 
        conn->tp->ep_close(conn);
 
-       dprintf("connection closed\n");
+       eprintf("connection closed %p\n", conn);
 
        /* may not have been in FFP yet */
        if (!conn->session)
@@ -100,28 +100,44 @@ void conn_close(struct iscsi_connection *conn)
                if (task->conn != conn)
                        continue;
 
-               dprintf("Forcing release of pending task %" PRIx64 "\n",
-                       task->tag);
+               eprintf("Forcing release of pending task %p %" PRIx64 "\n",
+                       task, task->tag);
                list_del(&task->c_list);
                iscsi_free_task(task);
        }
 
        list_for_each_entry_safe(task, tmp, &conn->tx_clist, c_list) {
-               dprintf("Forcing release of tx task %" PRIx64 "\n",
-                       task->tag);
-               iscsi_free_cmd_task(task);
+               uint8_t op;
+
+               op = task->req.opcode & ISCSI_OPCODE_MASK;
+
+               eprintf("Forcing release of tx task %p %" PRIx64 " %x\n",
+                       task, task->tag, op);
+               switch (op) {
+               case ISCSI_OP_SCSI_CMD:
+                       iscsi_free_cmd_task(task);
+                       break;
+               case ISCSI_OP_NOOP_OUT:
+               case ISCSI_OP_LOGOUT:
+               case ISCSI_OP_SCSI_TMFUNC:
+                       iscsi_free_task(task);
+                       break;
+               default:
+                       eprintf("%x\n", op);
+                       break;
+               }
        }
 
        if (conn->rx_task) {
-               dprintf("Forcing release of rx task %" PRIx64 "\n",
-                       conn->rx_task->tag);
+               eprintf("Forcing release of rx task %p %" PRIx64 "\n",
+                       conn->rx_task, conn->rx_task->tag);
                iscsi_free_task(conn->rx_task);
        }
        conn->rx_task = NULL;
 
        if (conn->tx_task) {
-               dprintf("Forcing release of tx task %" PRIx64 "\n",
-                       conn->tx_task->tag);
+               eprintf("Forcing release of tx task %p %" PRIx64 "\n",
+                       conn->tx_task, conn->tx_task->tag);
                iscsi_free_task(conn->tx_task);
        }
        conn->tx_task = NULL;
_______________________________________________
Stgt-devel mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/stgt-devel
