Re: [Lsf] It's time to put together the schedule

Sagi Grimberg Wed, 25 Feb 2015 09:44:34 -0800

On 2/25/2015 12:36 PM, Mike Christie wrote:

On 02/22/2015 09:25 PM, Mike Christie wrote:

I just hit a bug in the userspace code. Will send that later.


Hey Sagi,

Attached is the userspace patch, user-mq6.patch. It is made over
0001-iscsid-make-sure-actor-is-delated-before-reschedulin.patch (the
patch to fix that double schedule bug you guys found).

I am also attaching an updated kernel patch. It has some fixes for logout
and iscsi_tcp mq setup.

To use the patches, just set the new iscsid.conf setting
"node.session.queue_ids". It is just a string of ints:

node.session.queue_ids = 1 2 4 8

that get passed in to the kernel. For each id, iscsid will create a
session and have the LLD map whatever they want to that id value. Login
is the same:

iscsiadm -m node -T yourtarget -p ip --login

However, after you log in you have to scan manually:

iscsiadm -m session --rescan
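
As a rough illustration of the queue_ids setting described above (this is not code from the patches): a minimal C sketch, assuming the value is a whitespace-separated list of non-negative ints. The helper name parse_queue_ids and its return convention are hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from the patch: parse a value such as
 * "1 2 4 8" into ids[]. Returns the number of ids parsed, or -1 on a
 * malformed token, a negative id, or more than max ids.
 */
static int parse_queue_ids(const char *value, int *ids, int max)
{
    int count = 0;
    const char *p = value;

    for (;;) {
        char *end;
        long id;

        errno = 0;
        id = strtol(p, &end, 10);   /* skips leading whitespace itself */
        if (end == p)
            break;                  /* nothing numeric left to consume */
        if (errno || id < 0 || id > INT_MAX || count == max)
            return -1;
        ids[count++] = (int)id;
        p = end;
    }

    /* Anything left other than whitespace is a malformed token. */
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    return *p == '\0' ? count : -1;
}
```

With this sketch, parse_queue_ids("1 2 4 8", ids, 16) fills ids with 1, 2, 4, 8 and returns 4, which would correspond to one session per id.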


For logout, you currently have to make sure you log out of all the
sessions, so use:

iscsiadm -m node -T yourtarget -p ip --logout

or

iscsiadm -m session --logout

If you just pass in a specific session id like here:

iscsiadm -m session -r SID --logout

then that will wait for all the other sessions in the group to be logged
out before completing the task. I did this because I was not yet sure
how to handle dynamic hctx updates in the kernel.


For the LLD implementation, I hooked in iscsi_tcp to the session/group
creation code. Like I said before, I was not sure what every
driver/fw/hw was going to map to, so the queue id that is getting passed
into the session/connection/ep creation functions is really generic and
you can map it to whatever you like right now.

For ib_iser, you should look at iscsi_tcp.c's create_session_grp and
destroy_session_grp callouts to see how to allocate the host in a
backward compatible way.


I'll do that.

Software iscsi/iser is doing a host per session
still, then doing a session_grp per host and multiple sessions per
group. HW iscsi offload will continue to do a host per some hw/fw
resource, then it can have multiple groups and multiple sessions per group.

I am passing in the queue_id to bind to in every object callout
(ep_connect, conn_create, session_create), because I was not sure at
what time all the drivers needed to bind/set up mappings. So pick
whichever makes sense and let me know.

I have not had time to break this into a proper patchset. Was not ready
to send as an RFC set. There is debugging and // comments in places, but
feel free to give me any feedback.


That's not a problem, we will get it ready for submission together...


If you did get my other mails/patches a while back then make sure you
are using the new userspace patches/tools in this mail with the updated
kernel patch in this mail. I have not yet added kernel/user compat code,
so you will hit hangs/crashes if you mix and match.


Thanks Mike, I'm working with upstream both in user and kernel.

Couple of quick first comments:

- Passing a list in node.session.queue_ids indeed allows the user
  a degree of freedom, but it might be overkill. We should allow
  giving a range type of queue_ids and the default should be a range
  [0-default_nr_queues].
  On a side note, I suspect we will pretty soon find out that this
  linear assignment is the only useful setting and leave only an
  nr_queues setting.

- In the single queue case, we need to pass the kernel a default
  WILDCARD_QUEUE_ID so drivers can spread the completion contexts of
  each session as they do today (and don't introduce a performance
  regression).

- About the queue_id. sw_tcp will need it for the TX/RX threads
  CPU binding, so it is used as a conn attribute. iser (and I assume
  other offloads as well) will need it for MSIX vector assignments, so
  it is used as an endpoint attribute. The session is completely
  logical, thus it should not hold a queue_id assignment (IMO at least).

- The session group should allocate session_map to only hold the number
  of sessions it was passed with (not nr_cpu_ids with possible holes).
  The session selection is based on the mq mapping, thus it should be a
  1:1 mapping to hctx. So it basically boils down to:

  idx = sc->request->q->mq_map[sc->request->mq_ctx->cpu];
  session = grp->session_map[idx];

  and once we properly use block layer tagging:

  tag = blk_mq_unique_tag(sc->request);
  idx = blk_mq_unique_tag_to_hwq(tag);
  session = grp->session_map[idx];

  So I guess my point is, we should not assign a queue_id to a session,
  the ep/conn queue_id was used at establishment for context assignment.

- About shared tags. So for scsi commands and TMFs we don't have a
  problem since we are guaranteed ITTs are unique. I wonder how we will
  allocate a unique ITT for iscsi specific tasks (LOGIN, TEXT, LOGOUT,
  NOOP_OUT). My implementation did it per-session, so I reserved a range
  of ITTs for iscsi specific commands (in a kfifo). I wonder how we can
  do that for multiple sessions. We need some kind of tag allocator for
  iscsi specific commands. Perhaps Bart/Christoph can advise if an LLD
  can allocate a unique tag for an LLD specific command using block
  layer tags.
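
To make the last point concrete, here is a minimal userspace C sketch of a group-wide pool of reserved ITTs, loosely modelled on the per-session kfifo approach mentioned above. Everything in it is an assumption for illustration: the names (itt_pool, RESERVED_ITTS), the pool size, and the idea that the reserved range sits outside the tags the block layer hands out. Locking is omitted, and it does not answer whether block layer tags could be used instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define RESERVED_ITTS 8   /* hypothetical pool size; must be a power of two */

/* Hypothetical pool of ITTs reserved for LOGIN, TEXT, LOGOUT and NOOP_OUT,
 * shared by all sessions in a group. */
struct itt_pool {
    uint32_t slots[RESERVED_ITTS];
    unsigned int head;   /* free-running read counter */
    unsigned int tail;   /* free-running write counter */
};

/* Fill the pool with the ITTs base .. base + RESERVED_ITTS - 1. */
static void itt_pool_init(struct itt_pool *pool, uint32_t base)
{
    pool->head = 0;
    pool->tail = 0;
    for (uint32_t i = 0; i < RESERVED_ITTS; i++)
        pool->slots[pool->tail++ % RESERVED_ITTS] = base + i;
}

/* Take a free ITT; returns false when all reserved ITTs are in flight. */
static bool itt_pool_get(struct itt_pool *pool, uint32_t *itt)
{
    if (pool->head == pool->tail)
        return false;
    *itt = pool->slots[pool->head++ % RESERVED_ITTS];
    return true;
}

/* Return an ITT once its task completes; returns false if the pool is full. */
static bool itt_pool_put(struct itt_pool *pool, uint32_t itt)
{
    if (pool->tail - pool->head == RESERVED_ITTS)
        return false;
    pool->slots[pool->tail++ % RESERVED_ITTS] = itt;
    return true;
}
```

In a real driver the pool would have to be protected against concurrent access from the different sessions' contexts, which is exactly the part this sketch leaves open.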

Let me work on some of the things mentioned here.
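
For the range type of queue_ids suggested in the first comment, a minimal C sketch of how a value could be expanded into ids. The "lo-hi" syntax with inclusive bounds and the helper name expand_queue_range are assumptions, not an agreed format.

```c
#include <stdio.h>

/*
 * Hypothetical helper: expand a range such as "0-3" (bounds assumed to be
 * inclusive) into ids[]. Returns the number of ids written, or -1 if the
 * value is malformed, reversed, negative, or does not fit in max entries.
 */
static int expand_queue_range(const char *value, int *ids, int max)
{
    int lo, hi;
    char trailing;

    /* Exactly two conversions means "lo-hi" with nothing after it. */
    if (sscanf(value, "%d-%d %c", &lo, &hi, &trailing) != 2)
        return -1;
    if (lo < 0 || hi < lo || hi - lo >= max)
        return -1;

    for (int id = lo; id <= hi; id++)
        ids[id - lo] = id;
    return hi - lo + 1;
}
```

Under these assumptions, expand_queue_range("0-3", ids, 8) writes 0, 1, 2, 3 and returns 4.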

Cheers,
Sagi.

