Hi Mike,

Just got two more questions...

First one is about the scsi_host_template->use_clustering flag, is
there any specific reason why it is disabled in open-iscsi?
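
To make the question concrete, here is a small user-space model of what I understand the flag to change, based on your explanation of sg entries. It is only a sketch, not kernel code: struct model_sg, model_map_sg() and MODEL_PAGE_SIZE are names made up for this example, and the page frame numbers are invented. With clustering, pages that are adjacent in memory are folded into one entry; without it, each page gets its own entry.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, as on x86-32 */

/* Hypothetical stand-in for one scatterlist entry (not struct scatterlist). */
struct model_sg {
    unsigned long first_pfn;    /* page frame number of the first page */
    unsigned int length;        /* bytes covered by this entry */
};

/*
 * Build sg entries from a list of page frame numbers.
 * With clustering enabled, a page that directly follows the previous
 * entry in memory extends that entry; otherwise every page gets its
 * own entry. Returns the number of entries written to out[].
 */
size_t model_map_sg(const unsigned long *pfns, size_t npages,
                    int clustering, struct model_sg *out)
{
    size_t nents = 0;

    assert(pfns != NULL && out != NULL);

    for (size_t i = 0; i < npages; i++) {
        if (clustering && nents > 0) {
            struct model_sg *prev = &out[nents - 1];
            unsigned long next_pfn =
                prev->first_pfn + prev->length / MODEL_PAGE_SIZE;

            if (pfns[i] == next_pfn) {
                /* Adjacent page: grow the previous entry. */
                prev->length += MODEL_PAGE_SIZE;
                continue;
            }
        }
        /* Start a new single-page entry. */
        out[nents].first_pfn = pfns[i];
        out[nents].length = MODEL_PAGE_SIZE;
        nents++;
    }
    return nents;
}
```

For pages 100, 101, 102 and 200, the model gives four 4096-byte entries with clustering off, and two entries (12288 and 4096 bytes) with clustering on.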

Second one is the way we use sendpage() in iscsi_tcp.c. I was
wondering whether we could send more than one page per sendpage() call,
say by copying a few pages into one buffer first and then sending them
all at once?
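
Roughly what I have in mind, as a user-space sketch only (this is not how iscsi_tcp.c works today, and struct model_seg, model_gather() and MODEL_PAGE_SIZE are hypothetical names for this example): gather several page-sized segments into one bounce buffer, then hand that buffer to a single send call. The obvious cost is the extra memcpy per page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, as on x86-32 */

/* Hypothetical stand-in for the data described by one sg entry. */
struct model_seg {
    const unsigned char *page;  /* start of the page */
    unsigned int offset;        /* offset of the data within the page */
    unsigned int length;        /* number of bytes to send */
};

/*
 * Copy as many whole segments as fit into buf.
 * Returns the number of bytes gathered and stores the number of
 * segments consumed in *used, so the caller knows where to resume.
 */
size_t model_gather(const struct model_seg *segs, size_t nsegs,
                    unsigned char *buf, size_t buflen, size_t *used)
{
    size_t total = 0;
    size_t i;

    assert(segs != NULL && buf != NULL && used != NULL);

    for (i = 0; i < nsegs; i++) {
        if (segs[i].length > buflen - total)
            break;              /* next segment does not fit */
        memcpy(buf + total, segs[i].page + segs[i].offset,
               segs[i].length);
        total += segs[i].length;
    }
    *used = i;
    return total;
}
```

The caller would then send the gathered bytes with one call instead of one call per page, and repeat from segment *used until all segments are consumed.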

Thanks a lot!

Jack

On May 5, 9:26 pm, Jack Z <brianzhang2...@gmail.com> wrote:
> Hi Mike,
>
> Thank you for your help again. Following the guidance in your reply, I
> traced the kernel code a bit more and eventually found out a possible
> path for open-iscsi to get 4K pages in the scatterlist.
>
> Kernel version: 2.6.30
> open-iscsi version: 2.0.871
>
> Trace 1: SCSI from a write request to data written into pages
>
> --> function pointer
>
> sg_fops.write --> sg_write()
>                    /       \
>       sg_common_write()<- sg_new_write()
>                    |
>               sg_start_req()
>                    |
>            blk_rq_map_user_iov()------------------
>               /   (first)  \                     |
> __bio_map_user_iov()     __bio_copy_user_iov()   |
>               \            /                     |
>              bio_add_pc_page()             (then)|
>                      |                           |
>              __bio_add_page()                    |
>                                                  |
>                      -----------------------------
>                      |
>              blk_rq_bio_prep()
>
> blk_rq_map_user_iov
>
> __bio_copy_user_iov() first allocates new memory pages for the incoming
> data, and then calls bio_add_pc_page() (and in turn __bio_add_page()) to
> insert those pages into a structure named bio, which stands for
> block IO. It then calls __bio_copy_iov() to copy the user data
> into those pages. __bio_map_user_iov(), unlike
> __bio_copy_user_iov(), calls bio_add_pc_page() (and in turn
> __bio_add_page()) to map the user pages directly into the bio
> without any copying. After the bio is filled with the proper
> data, blk_rq_bio_prep() is called to associate the bio with
> the write request.
>
> In __bio_add_page(), we see "bvec->bv_page = page" and "bvec->bv_len
> = len". In the context of the above function calls, len should
> (mostly) be PAGE_SIZE, which is 4096 on an x86 32-bit machine. Now we
> know how user data is arranged into 4K pages.
>
> Trace 2: From a request dequeue to data read out of SCSI buffer
>
> elv_next_request()
>        | (through function pointer q->prep_rq_fn)
> sr_prep_fn()
>        |
> scsi_setup_blk_pc_cmnd()
>        |
> scsi_init_io()
>        |
> scsi_init_sgtable
>        | (using (req->q, req, sdb->table.sgl))
> blk_rq_map_sg(struct request_queue *q, struct request *rq, struct
> scatterlist *sglist)
>
> In blk_rq_map_sg(), the pages saved in the structure bio, which is
> part of the structure request, are mapped to the parameter sglist,
> which is the scatterlist in the structure scsi_data_buffer
> (task->sc.sdb.table.sgl in open-iscsi code). Also, we can see
> "nbytes = bvec->bv_len" and
> "sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset)".
> (Please note that this part takes the open-iscsi option
> ".use_clustering = DISABLE_CLUSTERING" into consideration.) The latter
> will set sc.sdb.table.sgl->length to nbytes, which is bvec->bv_len.
> From the first part we know that bvec->bv_len is PAGE_SIZE. Now we see
> why the size of the elements in the scatterlist used in open-iscsi is
> 4096, which is the PAGE_SIZE on x86-32 machines.
>
> And in iscsi_tcp.c, we have "r = tcp_sw_conn->sendpage(sk,
> sg_page(sg), offset, copy, flags)". Since sg_page(sg) returns one page
> in the scatterlist, it explains why open-iscsi tries to send 4096
> bytes at a time on x86-32 machines.
>
> On May 5, 10:47 am, Mike Christie <micha...@cs.wisc.edu> wrote:
>
> > On 05/03/2010 06:51 AM, Jack Z wrote:
>
> > > Hi group,
>
> > > I have been tracing the code related to sending PDUs from iscsi
> > > initiator (ver 2.0-871).
>
> > > And through some printk()s I realized that starting from
> > > iscsi_sw_tcp_pdu_init(), all the functions using scatterlist (struct
> > > scatterlist *sg) seem to use 4096 as the length (sg->length).
>
> > > But I was not able to trace down where this 4096 is initially assigned
> > > to sg->length... I searched through the code for "4096" and only two
> > > spots came up:  ".sg_tablesize = 4096" in struct scsi_host_template
> > > iscsi_sw_tcp_sht and "#define ISCSI_TOTAL_CMDS_MAX 4096". But changing
> > > these two values did not affect the sg->length value, which was still
> > > 4096.
>
> > > I was guessing this 4096 had something to do with the fs block size
> > > and this value was somewhat from "struct scsi_data_buffer *sdb =
> > > scsi_out(task->sc);" in iscsi_sw_tcp_pdu_init()... but still don't
> > > have a clue about how and why iscsi initiator gets this value as the
> > > length for the scatterlist...
>
> > > Could anyone maybe explain a bit or point me to some relevant
> > > document?
>
> > The fs/block layer is going to send down some struct called a bio, which
> > has a mapping of pages to some sectors to read/write. The block layer's
> > elevator code is then going to try and make large IO requests by merging
> > bios. So if there was a bio to read sectors 0 - 7 into page0 and a bio
> > to read sectors 8 - 15 into page1, then they would be merged into the
> > same request to read sector 0 - 15.
>
> > At some point this request is then sent to the scsi layer, which will
> > use some block layer helper to create a scatterlist from the pages in
> > the request's bios. The sg->page pointer points to the first page in a
> > group of pages that are contiguous in memory, and sg->length is then the
> > total length in bytes of all those pages. So in my example, if the 2
> > pages in each bio were next to each other then they could be merged into
> > 1 sg entry. This does not happen for iscsi_tcp though. In your case, you
> > see sg->length as at most 4096 because iscsi_tcp only supports 1 page per
> > sg entry and the page size on your arch is PAGE_SIZE=4096 (we set the
> > scsi_host_template->use_clustering flag to indicate that we only want
> > one page per sg entry btw).
>
> > Next is where sg_tablesize comes into play. Here, we are setting it to
> > 4096 to indicate that at most we want 4096 entries on that scatterlist
> > that is made (4096 being both the page size and sg_tablesize is just a
> > coincidence). So for us in your setup we can have at most 4096 sg
> > entries, with each entry having 4096 bytes. We could actually have a
> > smaller sg list, because there are other settings that limit the size of
> > the request like the sht->max_sectors.
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "open-iscsi" group.
> > To post to this group, send email to open-is...@googlegroups.com.
> > To unsubscribe from this group, send email to 
> > open-iscsi+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/open-iscsi?hl=en.
>
