Mike Christie wrote:
Alexander Nezhinsky wrote:
Mike Christie wrote:
tgt/stgt is probably two frameworks from your point of view. There is a
kernel part for target LLDs to hook into. The kernel part is similar to
scsi-ml; actually it builds on it and uses some of the scsi-ml
functions, and it provides shared code for tasks like creating scatter
lists and mapping commands between the kernel and userspace. The target
LLD basically handles lower level issues like DMAing the data,
transport issues, etc., pretty much what a scsi-ml initiator driver
does. For iscsi, the tgt LLD performs similar tasks to the initiator:
it parses the iscsi PDUs or puts them on the interconnect and handles
session and connection management (this would be done like open-iscsi
though), but then passes the scsi command to tgt's kernel code.
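
To make the split concrete, here is a rough sketch of the handoff. All
of the names below (struct tgt_cmd, tgt_queue_cmd(),
iscsi_lld_scsi_cmd_rx()) are made up for illustration; they are not the
actual tgt API.

#include <linux/types.h>

/* hypothetical: what the LLD hands over after parsing the PDU */
struct tgt_cmd {
	u8	cdb[16];	/* CDB from the iSCSI SCSI Command PDU */
	u64	tag;		/* initiator task tag */
	u32	data_len;	/* expected data transfer length */
	int	write;		/* data direction */
};

/* hypothetical: provided by the shared tgt kernel code */
int tgt_queue_cmd(struct tgt_cmd *cmd);

/* hypothetical LLD entry point, called once PDU parsing is done */
static int iscsi_lld_scsi_cmd_rx(struct tgt_cmd *cmd)
{
	/*
	 * Session/connection management, digests, DMA and other
	 * transport work stay in the LLD; the generic SCSI handling
	 * does not.
	 */
	return tgt_queue_cmd(cmd);	/* on to tgt and its userspace daemon */
}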

The other part of the framework is the userspace component. The tgt
kernel component basically passes scsi commands and task management
functions to a userspace daemon. The daemon contains the scsi state
machine and executes the IO. When it is done it informs the kernel
component, which in turn maps the data into the kernel, forms scatter
lists, and then passes them to the target LLD to send out.
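
In userspace the flow is roughly the loop below. The names are again
invented (recv_cmd_from_kernel(), execute_scsi_cmd() and friends just
stand in for the real netlink and SCSI emulation code); it is only
meant to show the shape of the daemon, not its implementation.

#include <stdint.h>
#include <stdlib.h>

struct scsi_cmd_msg {			/* hypothetical wire format */
	uint64_t tag;			/* matches the command in the kernel */
	uint8_t  cdb[16];
	uint32_t data_len;
	void    *data;			/* buffer for the command's data */
	int      result;		/* SCSI status reported back */
};

/* hypothetical helpers sitting on top of the kernel interface */
struct scsi_cmd_msg *recv_cmd_from_kernel(int fd);
int send_completion_to_kernel(int fd, struct scsi_cmd_msg *cmd);
int execute_scsi_cmd(struct scsi_cmd_msg *cmd);	/* state machine + IO */

static void daemon_loop(int fd)
{
	struct scsi_cmd_msg *cmd;

	while ((cmd = recv_cmd_from_kernel(fd)) != NULL) {
		cmd->result = execute_scsi_cmd(cmd);	/* READ, WRITE, INQUIRY, ... */
		send_completion_to_kernel(fd, cmd);	/* kernel then maps data,
							   builds scatterlists and
							   hands them to the LLD */
		free(cmd);
	}
}
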
In the cited paper's abstract you wrote:

> In order to provide block I/O services, users have had to use modified
> kernel code, binary kernel modules, or specialized hardware. With Linux
> now having iSCSI, Fibre Channel, and RDMA initiator support, Linux
> target framework (tgt) aims to fill the gap in storage functionality by
> consolidating several storage target drivers...

So I guess one of the added values (if not the main one) of
implementing the entire scsi command interface of tgt in userspace is
gaining easy access to block I/O drivers. But the block I/O subsystem
has a clear intra-kernel interface. If the kernel part of

Which interface are you referring to? bio, REQ_PC or REQ_BLOCK_PC, or read/write so you can take advantage of the kernel cache?

tgt would allocate memory and build the scatter-gather lists anyway, it
could pass the commands along with the buffer descriptors down to the
storage stack, addressing either the appropriate block I/O driver or
scsi-ml itself. This extra code should be quite thin: it uses only
existing interfaces and makes no modification to the existing kernel
code.

Some of those options above require minor changes or you have to work around them in your own code.

The user space code can do all the administration stuff, and
specifically choose the right driver and pass to the kernel part all
necessary identification and configuration info about it.


Actually we did this, and it was not acceptable to the scsi maintainer:

For example we could send IO by:

1. Using the SG_IO kernel interfaces as a passthrough type of interface.

2. Doing read/write to the device from the kernel.

If you look at the different trees on that berlios site you will see
different versions of this, and it ends up being the same amount of
code. See below.
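
For anyone who has not used the passthrough model in option 1, its
familiar userspace face is the SG_IO ioctl. A minimal INQUIRY through
it looks like this (/dev/sg0 is an arbitrary example device; the
kernel-side variants work with the same passthrough model):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
	unsigned char cdb[6] = { 0x12, 0, 0, 0, 96, 0 };	/* INQUIRY, 96 bytes */
	unsigned char buf[96], sense[32];
	struct sg_io_hdr hdr;
	int fd = open("/dev/sg0", O_RDWR);

	if (fd < 0)
		return 1;

	memset(&hdr, 0, sizeof(hdr));
	hdr.interface_id = 'S';
	hdr.cmdp = cdb;
	hdr.cmd_len = sizeof(cdb);
	hdr.dxferp = buf;
	hdr.dxfer_len = sizeof(buf);
	hdr.dxfer_direction = SG_DXFER_FROM_DEV;
	hdr.sbp = sense;
	hdr.mx_sb_len = sizeof(sense);
	hdr.timeout = 5000;			/* milliseconds */

	if (ioctl(fd, SG_IO, &hdr) < 0 || hdr.status)
		return 1;

	printf("vendor: %.8s product: %.16s\n", buf + 8, buf + 16);
	return 0;
}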


Are there other reasons for pushing SCSI commands from kernel to user space and
performing them from there?


By pushing it to userspace you please the kernel reviewers, and there
is not a major difference in performance (not one that we have found
yet). Also, when pushing it to userspace we use the same API that is
used to execute SG_IO requests and push their data between the kernel
and userspace, so it is not like we are creating something completely
new; we are just hooking up some pieces. The major new part is the
netlink interface, which is a couple hundred lines. Some of that
interface is for management, though.
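
For reference, the userspace end of a netlink channel like that is not
a lot of code either. A sketch (NETLINK_TGT and the 256 byte payload
limit are made up here, not tgt's actual values):

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#define NETLINK_TGT	21		/* hypothetical protocol number */
#define MAX_PAYLOAD	256		/* hypothetical message size limit */

static int tgt_nl_open(void)
{
	struct sockaddr_nl addr = {
		.nl_family = AF_NETLINK,
		.nl_pid    = getpid(),
	};
	int fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TGT);

	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

static int tgt_nl_send(int fd, const void *payload, int len)
{
	char buf[NLMSG_SPACE(MAX_PAYLOAD)];
	struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

	if (len > MAX_PAYLOAD)
		return -1;

	memset(buf, 0, sizeof(buf));
	nlh->nlmsg_len = NLMSG_LENGTH(len);
	nlh->nlmsg_pid = getpid();
	memcpy(NLMSG_DATA(nlh), payload, len);

	return send(fd, nlh, nlh->nlmsg_len, 0);
}
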


That is not really answering your question... Besides the kernel
reviewers, the problem with using REQ_PC or REQ_BLOCK_PC and bios is
that you cannot reuse the kernel's caching layer. By doing it in
userspace you can do an mmap, take advantage of the kernel's caching
code, and it is async. To do the same thing in the kernel you have to
either create a thread to do each read/write, hook into the kernel's
async read/write interface (which may be nicer to do now, but was not
when we looked at it), or implement your own cache layer, and I do not
think that would be easy to merge.
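
A stripped-down illustration of that mmap approach (not tgt's actual
code, and with error handling pared down): back the LUN with a plain
file, map it MAP_SHARED, and serve the READ/WRITE payloads with memcpy
so the kernel's page cache does the caching and writeback.

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct backing_store {
	void   *base;
	size_t  size;
};

static int bs_open(struct backing_store *bs, const char *path)
{
	struct stat st;
	int fd = open(path, O_RDWR);

	if (fd < 0 || fstat(fd, &st) < 0)
		return -1;

	bs->size = st.st_size;
	bs->base = mmap(NULL, bs->size, PROT_READ | PROT_WRITE,
			MAP_SHARED, fd, 0);
	close(fd);				/* the mapping stays valid */
	return bs->base == MAP_FAILED ? -1 : 0;
}

/* serve a READ: copy out of the mapping, the page cache fills on demand */
static void bs_read(struct backing_store *bs, uint64_t off, void *buf,
		    size_t len)
{
	memcpy(buf, (char *)bs->base + off, len);
}

/* serve a WRITE: dirty the pages and let the kernel write them back */
static void bs_write(struct backing_store *bs, uint64_t off,
		     const void *buf, size_t len)
{
	memcpy((char *)bs->base + off, buf, len);
}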