Hi,

My comments are interspersed in the email below.


On Thu, 18 May 2000 21:47:59 EDT, "Eric Youngdale" <[EMAIL PROTECTED]> wrote:
>
>> I've only implemented/tested it against scsi-disks to date.
>>
>> Here's the patch against a 2.3.99-pre2 kernel. To recap, two primary
>reasons
>> for this patch:
>> 1. To enhance the queueing and scsi-mid layers to handle kiobuf-based
>>    requests as well,
>>
>> 2. Remove request size limits on the upper layers (above ll_rw_blk.c).
>>    The KIO_MAX_SECTORS seems to have been inspired by MAX_SECTORS
>>    (128 per request) in ll_rw_blk.c. The scsi mid-layer should handle
>>    `oversize' requests based on the HBA sg_tablesize.
>>
>> I'm not too sure about 2. above; so I'd love to hear from more
>knowledgeable
>> people on that score.
>>
>> I'd highly appreciate any feedback before I submit this patch
>`officially'.
>
>    I guess my main comment is that you (or someone) re-indented some of the
>files for some reason, and as a result it is harder to read the patches and
>figure out what they really do.  Is there any way you can fix it to use the
>standard indentation?  Alan would remember the parameters to indent that he
>used when he re-indented a lot of the SCSI code.

That's interesting... I was using Emacs to edit the files, and it picks up the
local variables (indentation, among other things) specified at the end of these
files to set for the buffer. It shouldn't have changed the indentation.
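
(For reference, the kind of thing I mean is the file-local variables comment at
the end of the SCSI sources -- something along the lines below. This is just an
illustrative example; the exact settings in the files may differ.)

/*
 * Overrides for Emacs so that we follow Linus's tabbing style.
 * Emacs will notice this stuff at the end of the file and automatically
 * adjust the settings for this buffer only.  This must remain at the end
 * of the file.
 * ---------------------------------------------------------------------------
 * Local variables:
 * c-indent-level: 4
 * c-brace-imaginary-offset: 0
 * c-brace-offset: -4
 * c-argdecl-indent: 4
 * c-label-offset: -4
 * c-continued-statement-offset: 4
 * c-continued-brace-offset: 0
 * indent-tabs-mode: nil
 * tab-width: 8
 * End:
 */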


>    Alternatively, you could use "diff -b" to ignore whitespace
>differences - this usually causes diff to ignore indentation changes.

I'm re-sending the patch below, regenerated with "diff -abruN". That should
sort things out.

>    My other comment is that it looked like you cloned a lot of code from
>the existing buffered request handling and I am wondering if there is any
>way that we can reduce the amount of duplicated code.  I don't understand
>the new code well enough to say that it could be done, but it is something
>to look at.

Yes, especially in the __make_kio_request() function, if that is what you
are talking about. I'd originally tried to do it within the existing functions
(ll_rw_block() and __make_request()) but decided against it for the following
reasons:

        - the functions became (almost infuriatingly) riddled with my
          #if CONFIG_KIOBUF_IO cases (see the sketch below)

        - they became too long to easily follow the logic in each function

        - the kiobuf path is better encapsulated as a separate function for
          the time (in the not too distant future, hopefully) when the
          buffer_head based code may be removed from the kernel
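
Roughly, the inline alternative would have looked like the sketch below. This is
purely illustrative, NOT code from the patch -- the real __make_request() takes
more arguments and does far more work:

/* Hypothetical sketch only, not part of the patch that follows. */
static inline void __make_request(request_queue_t * q, int rw,
                                  struct buffer_head * bh)
{
        struct request * req;

        /* ... existing checks, elevator setup, get_request(), merge attempts ... */
#if CONFIG_KIOBUF_IO
        if (req->kiobuf) {
                /* kiobuf-specific sector/segment bookkeeping, skipping
                 * all of the buffer_head merge/cluster logic */
        } else
#endif
        {
                /* existing buffer_head handling */
        }
        /* ... and so on at every point where the two paths differ ... */
}

With the separate __make_kio_request(), the existing path only gains a couple of
small additions: skip kiobuf-backed requests when scanning for merge candidates,
and initialize req->kiobuf to NULL for new buffer_head requests.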

>    One nitpick - you made a change to sd.c that didn't look correct at
>first:
>
>@@ -542,7 +543,12 @@
>  (SCpnt->sense_buffer[4] << 16) |
>  (SCpnt->sense_buffer[5] << 8) |
>  SCpnt->sense_buffer[6];
>- if (SCpnt->request.bh != NULL)
>+#if CONFIG_KIOBUF_IO
>+ /* Tweak to support kiobuf-based I/O requests, [[EMAIL PROTECTED]] */
>+ if (SCpnt->request.kiobuf != NULL)
>+ block_sectors = SCpnt->request.kiobuf->length >> 9;
>+ else if (SCpnt->request.bh != NULL)
>+#endif
>  block_sectors = SCpnt->request.bh->b_size >> 9;
>  switch (SCpnt->device->sector_size) {
>  case 1024:
>
>I am really not wild about the way this is structured with the conditional
>compilation of a part of a conditional - it is just asking for someone to go
>in and break it.

Point taken. I've removed the "conditional compilation of a part of the
conditional"; the #if block now wraps a complete, self-contained conditional
of its own (see below).
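
Concretely, the sd.c hunk further down leaves the original test alone and adds
a conditionally-compiled test after it, so the relevant part of rw_intr() ends
up as:

        if (SCpnt->request.bh != NULL)
                block_sectors = SCpnt->request.bh->b_size >> 9;
#if CONFIG_KIOBUF_IO
        /* Tweak to support kiobuf-based I/O requests */
        if (SCpnt->request.kiobuf != NULL)
                block_sectors = SCpnt->request.kiobuf->length >> 9;
#endif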

Hopefully, the patch is more readable this time around.
I much appreciate the prompt feedback... Thanks much!
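
In case anyone wants to drive the new interface directly, the intended calling
convention is roughly as follows. This is a hypothetical sketch, not code from
the patch; `example_kio_read' and `iobuf' are made-up names, and the kiobuf
pages must already be mapped and locked down (e.g. via map_user_kiobuf()):

/* Hypothetical caller sketch; assumes iobuf's pages are locked down. */
static int example_kio_read(struct kiobuf * iobuf, kdev_t dev,
                            unsigned long blocknr, size_t blksize)
{
        int err = 0;

        iobuf->errno = 0;
        ll_rw_kio(READ, iobuf, dev, blocknr, blksize, &err);
        if (err == -ENOSYS)
                return err;     /* non-SCSI device: fall back to ll_rw_block() */
        if (err)
                return err;     /* request was never queued */

        /*
         * Completion: kiobuf->end_io() is called from
         * __scsi_collect_kio_sectors() once the last sector is harvested;
         * a caller would typically sleep on kiobuf->wait_queue until then.
         */
        return 0;
}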

Cheers,
-Chait.


--- linux-2.3pure/linux/drivers/block/ll_rw_blk.c       Thu May 18 12:21:40 2000
+++ linux-2.3rawio/linux/drivers/block/ll_rw_blk.c      Thu May 18 10:48:57 2000
@@ -4,6 +4,7 @@
  * Copyright (C) 1991, 1992 Linus Torvalds
  * Copyright (C) 1994,      Karl Keyte: Added support for disk statistics
  * Elevator latency, (C) 2000  Andrea Arcangeli <[EMAIL PROTECTED]> SuSE
+ * Added support for kiobuf-based I/O requests: Chaitanya Tumuluri [[EMAIL PROTECTED]]
  */
 
 /*
@@ -656,7 +657,10 @@
                        starving = 1;
                if (latency < 0)
                        continue;
-
+#if CONFIG_KIOBUF_IO
+               if (req->kiobuf)
+                       continue;
+#endif
                if (req->sem)
                        continue;
                if (req->cmd != rw)
@@ -761,6 +765,9 @@
        req->nr_hw_segments = 1; /* Always 1 for a new request. */
        req->buffer = bh->b_data;
        req->sem = NULL;
+#if CONFIG_KIOBUF_IO
+       req->kiobuf = NULL; 
+#endif
        req->bh = bh;
        req->bhtail = bh;
        req->q = q;
@@ -903,6 +910,310 @@
        __ll_rw_block(rw, nr, bh, 1);
 }
 
+#if CONFIG_KIOBUF_IO
+/*
+ * Function:    __make_kio_request()
+ *
+ * Purpose:     Construct a kiobuf-based request and insert into request queue.
+ *
+ * Arguments:   q      - request queue of device
+ *              rw      - read/write
+ *              kiobuf  - collection of pages 
+ *             dev     - device against which I/O requested
+ *             blocknr - dev block number at which to start I/O
+ *              blksize - units (512B or other) of blocknr
+ *
+ * Lock status: No lock held upon entry.
+ *  
+ * Returns:     Nothing
+ *
+ * Notes:       Requests generated by this function should _NOT_ be merged by
+ *             __make_request() (hence the new check for `req->kiobuf')
+ *
+ *             All (relevant) req->Y parameters are expressed in sector size 
+ *             of 512B for kiobuf based I/O. This is assumed in the scsi
+ *             mid-layer as well.
+ */    
+static inline void __make_kio_request(request_queue_t * q,
+                                     int rw,
+                                     struct kiobuf * kiobuf,
+                                     kdev_t dev,
+                                     unsigned long blocknr,
+                                     size_t blksize)
+{
+       int major = MAJOR(dev);
+       unsigned int sector, count, nr_bytes, total_bytes, nr_seg;
+       struct request * req;
+       int rw_ahead, max_req;
+       unsigned long flags;
+       struct list_head * head = &q->queue_head;
+       size_t curr_offset;
+       int orig_latency;
+       elevator_t * elevator;
+       int correct_size, i, kioind;
+       
+       /*
+        * Sanity Tests:        
+        *
+        * The input arg. `blocknr' is in units of the 
+        * input arg. `blksize' (inode->i_sb->s_blocksize).
+        * Convert to 512B unit used in blk_size[] array.
+        */
+       count = kiobuf->length >> 9; 
+       sector = blocknr * (blksize >> 9); 
+
+       if (blk_size[major]) {
+               unsigned long maxsector = (blk_size[major][MINOR(dev)] << 1) + 1;
+
+               if (maxsector < count || maxsector - count < sector) {
+                       if (!blk_size[major][MINOR(dev)]) {
+                               kiobuf->errno = -EINVAL;
+                               goto end_io;
+                       }
+                       /* This may well happen - the kernel calls bread()
+                          without checking the size of the device, e.g.,
+                          when mounting a device. */
+                       printk(KERN_INFO
+                               "attempt to access beyond end of device\n");
+                       printk(KERN_INFO "%s: rw=%d, want=%d, limit=%d\n",
+                               kdevname(dev), rw,
+                              (sector + count)>>1,
+                              blk_size[major][MINOR(dev)]);
+                       kiobuf->errno = -ESPIPE;
+                       goto end_io;
+               }
+       }
+       /*
+        * Allow only basic block size multiples in the
+        * kiobuf->length. 
+        */
+       correct_size = BLOCK_SIZE;
+       if (blksize_size[major]) {
+               i = blksize_size[major][MINOR(dev)];
+               if (i)
+                       correct_size = i;
+       }
+       if ((kiobuf->length % correct_size) != 0) {
+               printk(KERN_NOTICE "ll_rw_kio: "
+                      "request size [%d] not a multiple of device [%s] block-size [%d]\n",
+                      kiobuf->length,
+                      kdevname(dev),
+                      correct_size);
+               kiobuf->errno = -EINVAL;
+               goto end_io;
+       }
+       rw_ahead = 0;   /* normal case; gets changed below for READA */
+       switch (rw) {
+               case READA:
+                       rw_ahead = 1;
+                       rw = READ;      /* drop into READ */
+               case READ:
+                       kstat.pgpgin++;
+                       max_req = NR_REQUEST;   /* reads take precedence */
+                       break;
+               case WRITERAW:
+                       rw = WRITE;
+                       goto do_write;  /* Skip the buffer refile */
+               case WRITE:
+               do_write:
+                       /*
+                        * We don't allow the write-requests to fill up the
+                        * queue completely:  we want some room for reads,
+                        * as they take precedence. The last third of the
+                        * requests are only for reads.
+                        */
+                       kstat.pgpgout++;
+                       max_req = (NR_REQUEST * 2) / 3;
+                       break;
+               default:
+                       BUG();
+                       kiobuf->errno = -EINVAL;
+                       goto end_io;
+       }
+
+       /*
+        * Creation of bounce buffers for data in high memory
+        * should be (and currently is) handled lower in the food-chain;
+        * for scsi disks this is done in scsi_merge.c.
+        *
+        * Look for a free request with spinlock held.
+        * Apart from atomic queue access, it prevents
+        * another thread that has already queued a kiobuf-request
+        * into this queue from starting it, till we are done.
+        */
+       elevator = &q->elevator;
+       orig_latency = elevator_request_latency(elevator, rw);
+       spin_lock_irqsave(&io_request_lock,flags);
+       
+       if (list_empty(head))
+               q->plug_device_fn(q, dev);
+       /*
+        * The scsi disk and cdrom drivers completely remove the request
+        * from the queue when they start processing an entry.  For this
+        * reason it is safe to continue to add links to the top entry
+        * for those devices.
+        *
+        * All other drivers need to jump over the first entry, as that
+        * entry may be busy being processed and we thus can't change
+        * it.
+        */
+       if (q->head_active && !q->plugged)
+               head = head->next;
+
+       /* find an unused request. */
+       req = get_request(max_req, dev);
+
+       /*
+        * if no request available: if rw_ahead, forget it,
+        * otherwise try again blocking..
+        */
+       if (!req) {
+               spin_unlock_irqrestore(&io_request_lock,flags);
+               if (rw_ahead){
+                       kiobuf->errno = -EBUSY;
+                       goto end_io;
+               }
+               req = __get_request_wait(max_req, dev);
+               spin_lock_irqsave(&io_request_lock,flags);
+
+               /* revalidate elevator */
+               head = &q->queue_head;
+               if (q->head_active && !q->plugged)
+                       head = head->next;
+       }
+
+       /* fill up the request-info, and add it to the queue */
+       req->cmd = rw;
+       req->errors = 0;
+       req->sector = sector;
+       req->nr_hw_segments = 1;                /* Always 1 for a new request. */
+       req->nr_sectors = count;                /* Length of kiobuf */
+       req->sem = NULL;
+       req->kiobuf = kiobuf; 
+       req->bh = NULL;       
+       req->bhtail = NULL;   
+       req->q = q;
+       /* Calculate req->buffer */
+       curr_offset = kiobuf->offset;
+       for (kioind=0; kioind<kiobuf->nr_pages; kioind++)
+               if (curr_offset >= PAGE_SIZE)   
+                       curr_offset -= PAGE_SIZE;
+               else    
+                       break;
+       req->buffer = (char *) page_address(kiobuf->maplist[kioind]) +
+            curr_offset; 
+
+       /* Calculate current_nr_sectors and # of scatter gather segments needed */
+       total_bytes = kiobuf->length;
+       nr_bytes = (PAGE_SIZE - curr_offset) > total_bytes ?
+            total_bytes : (PAGE_SIZE - curr_offset);
+       req->current_nr_sectors = nr_bytes >> 9; 
+       
+       for (nr_seg = 1;
+            kioind<kiobuf->nr_pages && nr_bytes != total_bytes;
+            kioind++) {
+            ++nr_seg;
+            if((nr_bytes + PAGE_SIZE) > total_bytes){
+                 break;
+            } else {
+                 nr_bytes += PAGE_SIZE;
+            }  
+       }       
+       req->nr_segments = nr_seg;
+
+       add_request(q, req, head, orig_latency);
+       elevator_account_request(elevator, req);
+
+       spin_unlock_irqrestore(&io_request_lock, flags);
+
+end_io:
+       return;
+}
+
+
+
+/*
+ * Function:    ll_rw_kio()
+ *
+ * Purpose:     Insert kiobuf-based request into request queue.
+ *
+ * Arguments:   rw      - read/write
+ *              kiobuf  - collection of pages
+ *             dev     - device against which I/O requested
+ *             blocknr - dev block number at which to start I/O
+ *              sector  - units (512B or other) of blocknr
+ *              error   - return status
+ *
+ * Lock status: Assumed no lock held upon entry.
+ *             Assumed that the pages in the kiobuf ___ARE LOCKED DOWN___.
+ *
+ * Returns:     Nothing
+ *
+ * Notes:       This function is called from any subsystem using a kiovec[]
+ *             (collection of kiobufs) for I/O (e.g. `pagebufs', raw-io).
+ *             It relies on the "kiobuf" field in the request structure.
+ */    
+void ll_rw_kio(int rw,
+              struct kiobuf *kiobuf,
+              kdev_t dev,
+              unsigned long blocknr,
+              size_t sector,
+              int *error)
+{
+       request_queue_t *q;
+       /*
+        * Only support SCSI disk for now.
+        *
+        * ENOSYS to indicate caller
+        * should try ll_rw_block()
+        * for non-SCSI (e.g. IDE) disks.
+        */
+       if (!SCSI_DISK_MAJOR(MAJOR(dev))){
+               *error = -ENOSYS;
+               goto end_io;
+       }
+       /*
+        * Sanity checks
+        */
+       q = blk_get_queue(dev);
+       if (!q) {
+               printk(KERN_ERR
+       "ll_rw_kio: Trying to read nonexistent block-device %s\n",
+               kdevname(dev));
+               *error = -ENODEV;
+               goto end_io;
+       }
+       if ((rw & WRITE) && is_read_only(dev)) {
+               printk(KERN_NOTICE "Can't write to read-only device %s\n",
+                      kdevname(dev));
+               *error = -EPERM;
+               goto end_io;
+       }
+       if (q->make_request_fn) {
+               printk(KERN_ERR
+       "ll_rw_kio: Unexpected device [%s] queueing function encountered\n",
+               kdevname(dev));
+               *error = -ENOSYS;
+               goto end_io;
+       }
+       
+       __make_kio_request(q, rw, kiobuf, dev, blocknr, sector);
+       if (kiobuf->errno != 0) {
+               *error = kiobuf->errno;
+               goto end_io;
+       }
+       
+       return;
+end_io:
+       /*
+        * We come here only on an error so, just set
+        * kiobuf->errno and call the completion fn.
+        */
+       if(kiobuf->errno == 0)
+               kiobuf->errno = *error;
+}
+#endif /* CONFIG_KIOBUF_IO */
+
 #ifdef CONFIG_STRAM_SWAP
 extern int stram_device_init (void);
 #endif
@@ -1085,3 +1396,7 @@
 EXPORT_SYMBOL(blk_queue_pluggable);
 EXPORT_SYMBOL(blk_queue_make_request);
 EXPORT_SYMBOL(generic_make_request);
+#if CONFIG_KIOBUF_IO
+EXPORT_SYMBOL(__make_kio_request);
+EXPORT_SYMBOL(ll_rw_kio);
+#endif
--- linux-2.3pure/linux/drivers/scsi/Config.in  Thu May 18 12:21:58 2000
+++ linux-2.3rawio/linux/drivers/scsi/Config.in Mon May 15 15:44:09 2000
@@ -20,6 +20,7 @@
 
 #if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
    bool '  Enable extra checks in new queueing code' CONFIG_SCSI_DEBUG_QUEUES
+   bool '  Enable kiobuf-based I/O request handling' CONFIG_KIOBUF_IO
 #fi
 
 bool '  Probe all LUNs on each SCSI device' CONFIG_SCSI_MULTI_LUN
--- linux-2.3pure/linux/drivers/scsi/scsi_lib.c Thu May 18 12:22:01 2000
+++ linux-2.3rawio/linux/drivers/scsi/scsi_lib.c        Thu May 18 19:51:33 2000
@@ -15,6 +15,8 @@
  * a low-level driver if they wished.   Note however that this file also
  * contains the "default" versions of these functions, as we don't want to
  * go through and retrofit queueing functions into all 30 some-odd drivers.
+ *
+ * Added support for kiobuf-based I/O requests. [Chaitanya Tumuluri, [EMAIL PROTECTED]]
  */
 
 #define __NO_VERSION__
@@ -369,6 +371,161 @@
        spin_unlock_irqrestore(&io_request_lock, flags);
 }
 
+
+/*
+ * Function:    __scsi_collect_bh_sectors()
+ *
+ * Purpose:     Helper routine for __scsi_end_request() to mark some number
+ *             (or all, if that is the case) of sectors complete.
+ *
+ * Arguments:   req      - request struct. from scsi command block.
+ *              uptodate - 1 if I/O indicates success, 0 for I/O error.
+ *              sectors  - number of sectors we want to mark.
+ *             leftovers- indicates if any sectors were not done.
+ *
+ * Lock status: Assumed that lock is not held upon entry.
+ *
+ * Returns:     Nothing
+ *
+ * Notes:      Separate buffer-head processing from kiobuf processing
+ */
+__inline static void __scsi_collect_bh_sectors(struct request *req,
+                                              int uptodate,
+                                              int sectors,
+                                              char **leftovers)
+{
+       struct buffer_head *bh;
+       
+       do {
+               if ((bh = req->bh) != NULL) {
+                       req->bh = bh->b_reqnext;
+                       req->nr_sectors -= bh->b_size >> 9;
+                       req->sector += bh->b_size >> 9;
+                       bh->b_reqnext = NULL;           
+                       sectors -= bh->b_size >> 9;
+                       bh->b_end_io(bh, uptodate);
+                       if ((bh = req->bh) != NULL) {
+                               req->current_nr_sectors = bh->b_size >> 9;
+                               if (req->nr_sectors < req->current_nr_sectors) {
+                                       req->nr_sectors = req->current_nr_sectors;
+                                       printk("collect_bh: buffer-list destroyed\n");
+                               }       
+                       }       
+               }
+       } while (sectors && bh);
+
+       /* Check for leftovers */
+       if (req->bh) {
+               *leftovers = req->bh->b_data;
+       }       
+       return; 
+
+}      
+
+#if CONFIG_KIOBUF_IO
+/*
+ * Function:    __scsi_collect_kio_sectors()
+ *
+ * Purpose:     Helper routine for __scsi_end_request() to mark some number
+ *             (or all) of the I/O sectors and attendant pages complete.
+ *             Updates the request nr_segments, nr_sectors accordingly.
+ *
+ * Arguments:   req      - request struct. from scsi command block.
+ *              uptodate - 1 if I/O indicates success, 0 for I/O error.
+ *              sectors  - number of sectors we want to mark.
+ *             leftovers- indicates if any sectors were not done.
+ *
+ * Lock status: Assumed that lock is not held upon entry.
+ *
+ * Returns:     Nothing
+ *
+ * Notes:      Separate buffer-head processing from kiobuf processing.
+ *             We don't know if this was a single or multi-segment sgl
+ *             request. Treat it as though it were a multi-segment one.
+ */
+__inline static void __scsi_collect_kio_sectors(struct request *req,
+                                              int uptodate,
+                                              int sectors,
+                                              char **leftovers)
+{
+       int pgcnt, nr_pages;
+       size_t curr_offset;
+       unsigned long va = 0;
+       unsigned int nr_bytes, total_bytes, page_sectors;
+       
+       nr_pages = req->kiobuf->nr_pages;
+       total_bytes = (req->nr_sectors << 9);
+       curr_offset = req->kiobuf->offset;
+
+       /*
+        * In the case of leftover requests, the kiobuf->length
+        * remains the same, but req->nr_sectors would be smaller.
+        * Adjust curr_offset in this case. If not a leftover,
+        * the following makes no difference.
+        */
+       curr_offset += (((req->kiobuf->length >> 9) - req->nr_sectors) << 9);
+
+       /* How far into the kiobuf is the offset? */
+       for (pgcnt=0; pgcnt<nr_pages; pgcnt++) {
+               if(curr_offset >= PAGE_SIZE) {
+                       curr_offset -= PAGE_SIZE;
+                       continue;
+               } else {
+                       break;
+               }
+       }
+       /*              
+        * Reusing the pgcnt value from the loop above:
+        * harvest pages to account for the number of sectors
+        * passed into this function.
+        */
+       for (nr_bytes = 0;
+            pgcnt<nr_pages && nr_bytes != total_bytes;
+            pgcnt++) {
+               va = page_address(req->kiobuf->maplist[pgcnt])
+                       + curr_offset;
+               /* First page or final page? Partial page? */
+               if (curr_offset != 0) {
+                    page_sectors = (PAGE_SIZE - curr_offset) > total_bytes ?
+                         total_bytes >> 9 : (PAGE_SIZE - curr_offset) >> 9;
+                    curr_offset = 0;
+               } else if((nr_bytes + PAGE_SIZE) > total_bytes) {
+                       page_sectors = (total_bytes - nr_bytes) >> 9;
+               } else {
+                       page_sectors = PAGE_SIZE >> 9;
+               }
+               nr_bytes += (page_sectors << 9);
+               /* Leftover sectors in this page (onward)? */
+               if (sectors < page_sectors) {
+                       req->nr_sectors -= sectors;
+                       req->sector += sectors;
+                       req->current_nr_sectors = page_sectors - sectors;
+                       va += (sectors << 9); /* Update for req->buffer */
+                       sectors = 0;
+                       break;
+               } else {
+                       /* Mark this page as done */
+                       req->nr_segments--;   /* No clustering for kiobuf */ 
+                       req->nr_sectors -= page_sectors;
+                       req->sector += page_sectors;
+                       if (!uptodate && (req->kiobuf->errno != 0)){
+                            req->kiobuf->errno = -EIO;
+                       }
+                       sectors -= page_sectors;
+               }
+       }
+       
+       /* Check for leftovers */
+       if (req->nr_sectors) {
+               *leftovers = (char *)va;
+       } else if (req->kiobuf->end_io) {
+            req->kiobuf->end_io(req->kiobuf);
+       }
+       
+       return; 
+}      
+#endif
+
 /*
  * Function:    scsi_end_request()
  *
@@ -396,7 +553,7 @@
                                     int requeue)
 {
        struct request *req;
-       struct buffer_head *bh;
+       char * leftovers = NULL;
 
        ASSERT_LOCK(&io_request_lock, 0);
 
@@ -406,39 +563,32 @@
                printk(" I/O error: dev %s, sector %lu\n",
                       kdevname(req->rq_dev), req->sector);
        }
-       do {
-               if ((bh = req->bh) != NULL) {
-                       req->bh = bh->b_reqnext;
-                       req->nr_sectors -= bh->b_size >> 9;
-                       req->sector += bh->b_size >> 9;
-                       bh->b_reqnext = NULL;
-                       sectors -= bh->b_size >> 9;
-                       bh->b_end_io(bh, uptodate);
-                       if ((bh = req->bh) != NULL) {
-                               req->current_nr_sectors = bh->b_size >> 9;
-                               if (req->nr_sectors < req->current_nr_sectors) {
-                                       req->nr_sectors = req->current_nr_sectors;
-                                       printk("scsi_end_request: buffer-list destroyed\n");
-                               }
-                       }
-               }
-       } while (sectors && bh);
 
+       leftovers = NULL;
+       if (req->bh != NULL) {          /* Buffer head based request */
+               __scsi_collect_bh_sectors(req, uptodate, sectors, &leftovers);
+       }
+#if CONFIG_KIOBUF_IO
+       else if (req->kiobuf != NULL) { /* Kiobuf based request */
+               __scsi_collect_kio_sectors(req, uptodate, sectors, &leftovers);
+       } else {
+            panic("Both bh and kiobuf pointers are unset in request!\n");
+       }
+#endif
        /*
         * If there are blocks left over at the end, set up the command
         * to queue the remainder of them.
         */
-       if (req->bh) {
+       if (leftovers != NULL) {
                 request_queue_t *q;
 
-               if( !requeue )
-               {
+               if( !requeue ) {
                        return SCpnt;
                }
 
                 q = &SCpnt->device->request_queue;
 
-               req->buffer = bh->b_data;
+               req->buffer = leftovers;
                /*
                 * Bleah.  Leftovers again.  Stick the leftovers in
                 * the front of the queue, and goose the queue again.
--- linux-2.3pure/linux/drivers/scsi/scsi_merge.c       Thu May 18 12:22:01 2000
+++ linux-2.3rawio/linux/drivers/scsi/scsi_merge.c      Thu May 18 19:56:20 2000
@@ -6,6 +6,7 @@
  *                        Based upon conversations with large numbers
  *                        of people at Linux Expo.
  *     Support for dynamic DMA mapping: Jakub Jelinek ([EMAIL PROTECTED]).
+ *      Support for kiobuf-based I/O requests. [Chaitanya Tumuluri, [EMAIL PROTECTED]]
  */
 
 /*
@@ -90,13 +91,14 @@
        printk("nr_segments is %x\n", req->nr_segments);
        printk("counted segments is %x\n", segments);
        printk("Flags %d %d\n", use_clustering, dma_host);
-       for (bh = req->bh; bh->b_reqnext != NULL; bh = bh->b_reqnext) 
-       {
+       if (req->bh != NULL) {
+              for (bh = req->bh; bh->b_reqnext != NULL; bh = bh->b_reqnext) {  
                printk("Segment 0x%p, blocks %d, addr 0x%lx\n",
                       bh,
                       bh->b_size >> 9,
                       virt_to_phys(bh->b_data - 1));
        }
+       }
        panic("Ththththaats all folks.  Too dangerous to continue.\n");
 }
 
@@ -298,9 +300,25 @@
        SHpnt = SCpnt->host;
        SDpnt = SCpnt->device;
 
+       if (req->bh){
        req->nr_segments = __count_segments(req, 
                                            CLUSTERABLE_DEVICE(SHpnt, SDpnt),
                                            SHpnt->unchecked_isa_dma, NULL);
+       }
+#if CONFIG_KIOBUF_IO
+       else if (req->kiobuf){
+               /* Since there is no clustering/merging in kiobuf
+                * requests, the nr_segments is simply a count of
+                * the number of pages needing I/O. nr_segments is
+                * updated in __scsi_collect_kio_sectors() called 
+                * from scsi_end_request(), for the leftover case.
+                * [[EMAIL PROTECTED]]
+                */
+               return;
+       } else {
+               panic("Both kiobuf and bh pointers are NULL!");
+       }
+#endif
 }
 
 #define MERGEABLE_BUFFERS(X,Y) \
@@ -746,6 +764,191 @@
 MERGEREQFCT(scsi_merge_requests_fn_, 0, 0)
 MERGEREQFCT(scsi_merge_requests_fn_c, 1, 0)
 MERGEREQFCT(scsi_merge_requests_fn_dc, 1, 1)
+
+
+
+/*
+ * Function:    scsi_bh_sgl()
+ *
+ * Purpose:     Helper routine to construct S(catter) G(ather) L(ist)
+ *             assuming buffer_head-based request in the Scsi_Cmnd.
+ *
+ * Arguments:   SCpnt   - Command descriptor 
+ *              use_clustering - 1 if host uses clustering
+ *              dma_host - 1 if this host has ISA DMA issues (bus doesn't
+ *                      expose all of the address lines, so that DMA cannot
+ *                      be done from an arbitrary address).
+ *             sgpnt   - pointer to sgl
+ *
+ * Returns:     Number of sg segments in the sgl.
+ *
+ * Notes:       Only the SCpnt argument should be a non-constant variable.
+ *             This functionality was abstracted out of the original code
+ *             in __init_io().
+ */
+__inline static int scsi_bh_sgl(Scsi_Cmnd * SCpnt,
+                             int use_clustering,
+                             int dma_host,
+                             struct scatterlist * sgpnt)
+{
+       int count;
+       struct buffer_head * bh;
+       struct buffer_head * bhprev;
+       
+       bhprev = NULL;
+
+       for (count = 0, bh = SCpnt->request.bh;
+            bh; bh = bh->b_reqnext) {
+               if (use_clustering && bhprev != NULL) {
+                       if (dma_host &&
+                           virt_to_phys(bhprev->b_data) - 1 == ISA_DMA_THRESHOLD) {
+                               /* Nothing - fall through */
+                       } else if (CONTIGUOUS_BUFFERS(bhprev, bh)) {
+                               /*
+                                * This one is OK.  Let it go.  Note that we
+                                * do not have the ability to allocate
+                                * bounce buffer segments > PAGE_SIZE, so
+                                * for now we limit the thing.
+                                */
+                               if( dma_host ) {
+#ifdef DMA_SEGMENT_SIZE_LIMITED
+                                       if( virt_to_phys(bh->b_data) - 1 < ISA_DMA_THRESHOLD
+                                           || sgpnt[count - 1].length + bh->b_size <= PAGE_SIZE ) {
+                                               sgpnt[count - 1].length += bh->b_size;
+                                               bhprev = bh;
+                                               continue;
+                                       }
+#else
+                                       sgpnt[count - 1].length += bh->b_size;
+                                       bhprev = bh;
+                                       continue;
+#endif
+                               } else {
+                                       sgpnt[count - 1].length += bh->b_size;
+                                       SCpnt->request_bufflen += bh->b_size;
+                                       bhprev = bh;
+                                       continue;
+                               }
+                       }
+               }
+               count++;
+               sgpnt[count - 1].address = bh->b_data;
+               sgpnt[count - 1].length += bh->b_size;
+               if (!dma_host) {
+                       SCpnt->request_bufflen += bh->b_size;
+               }
+               bhprev = bh;
+       }
+
+       return count;
+}
+
+#if CONFIG_KIOBUF_IO
+/*
+ * Function:    scsi_kio_sgl()
+ *
+ * Purpose:     Helper routine to construct S(catter) G(ather) L(ist)
+ *             assuming kiobuf-based request in the Scsi_Cmnd.
+ *
+ * Arguments:   SCpnt   - Command descriptor 
+ *              dma_host - 1 if this host has ISA DMA issues (bus doesn't
+ *                      expose all of the address lines, so that DMA cannot
+ *                      be done from an arbitrary address).
+ *             sgpnt   - pointer to sgl
+ *
+ * Returns:     Number of sg segments in the sgl.
+ *
+ * Notes:       Only the SCpnt argument should be a non-constant variable.
+ *             This functionality was factored out of __init_io() in the
+ *             original implementation so that the sgl can be constructed
+ *             for kiobuf-based I/Os as well.
+ *
+ *             Constructs SCpnt->use_sg sgl segments for the kiobuf.
+ *
+ *             Unlike the buffer_head case, no clustering of pages is
+ *             attempted, primarily because the pages in a kiobuf are
+ *             unlikely to be contiguous. Bears checking.
+ */
+__inline static int scsi_kio_sgl(Scsi_Cmnd * SCpnt,
+                             int dma_host,
+                             struct scatterlist * sgpnt)
+{
+        int pgcnt, nr_seg, curr_seg, nr_sectors;
+       size_t curr_offset;
+       unsigned long va;
+       unsigned int nr_bytes, total_bytes, sgl_seg_bytes;
+
+       curr_seg = SCpnt->use_sg; /* This many sgl segments */
+       nr_sectors = SCpnt->request.nr_sectors;
+       total_bytes = (nr_sectors << 9);
+       curr_offset = SCpnt->request.kiobuf->offset;
+       
+       /*
+        * In the case of leftover requests, the kiobuf->length
+        * remains the same, but req->nr_sectors would be smaller.
+        * Use this difference to adjust curr_offset in this case. 
+        * If not a leftover, the following makes no difference.
+        */
+       curr_offset += (((SCpnt->request.kiobuf->length >> 9) - nr_sectors) << 9);
+       /* How far into the kiobuf is the offset? */
+       for (pgcnt=0; pgcnt<SCpnt->request.kiobuf->nr_pages; pgcnt++) {
+               if(curr_offset >= PAGE_SIZE) {
+                       curr_offset -= PAGE_SIZE;
+                       continue;
+               } else {
+                       break;
+               }
+       }
+       /*              
+        * Reusing the pgcnt value from above:
+        * Starting at the right page and offset, build curr_seg
+        * sgl segments (one per page). Account for both a 
+        * potentially partial last page and unrequired pages 
+        * at the end of the kiobuf.
+        */
+       nr_bytes = 0;
+       for (nr_seg = 0; nr_seg < curr_seg; nr_seg++) {
+               va = page_address(SCpnt->request.kiobuf->maplist[pgcnt])
+                       + curr_offset;
+               ++pgcnt;
+               
+               /*
+                * If this is the first page, account for offset.
+                * If this is the final (maybe partial) page, get the remainder.
+                */
+               if (curr_offset != 0) {
+                    sgl_seg_bytes = PAGE_SIZE - curr_offset;
+                    curr_offset = 0;   
+               } else if((nr_bytes + PAGE_SIZE) > total_bytes) {
+                    sgl_seg_bytes = total_bytes - nr_bytes;
+               } else {        
+                    sgl_seg_bytes = PAGE_SIZE;
+               }
+               
+               nr_bytes += sgl_seg_bytes;
+               sgpnt[nr_seg].address = (char *)va;
+               sgpnt[nr_seg].alt_address = 0;
+               sgpnt[nr_seg].length = sgl_seg_bytes;
+
+               if (!dma_host) {
+                    SCpnt->request_bufflen += sgl_seg_bytes;
+               }
+       }
+       /* Sanity Check */
+       if ((nr_bytes > total_bytes) ||
+           (pgcnt > SCpnt->request.kiobuf->nr_pages)) {
+               printk(KERN_ERR
+                      "scsi_kio_sgl: sgl bytes[%d], request bytes[%d]\n"
+                      "scsi_kio_sgl: pgcnt[%d], kiobuf->pgcnt[%d]!\n",
+                      nr_bytes, total_bytes, pgcnt, SCpnt->request.kiobuf->nr_pages);
+               BUG();
+       }
+       return nr_seg;
+
+}
+#endif
+
+
 /*
  * Function:    __init_io()
  *
@@ -778,6 +981,9 @@
  *              gather list, the sg count in the request won't be valid
  *              (mainly because we don't need queue management functions
  *              which keep the tally uptodate.
+ *
+ *             Modified to handle a kiobuf argument in the SCpnt->request
+ *             structure. 
  */
 __inline static int __init_io(Scsi_Cmnd * SCpnt,
                              int sg_count_valid,
@@ -785,7 +991,6 @@
                              int dma_host)
 {
        struct buffer_head * bh;
-       struct buffer_head * bhprev;
        char               * buff;
        int                  count;
        int                  i;
@@ -800,11 +1005,15 @@
         * needed any more.  Need to play with it and see if we hit the
         * panic.  If not, then don't bother.
         */
-       if (!SCpnt->request.bh) {
+       if ((!SCpnt->request.bh
+#if CONFIG_KIOBUF_IO
+            && !SCpnt->request.kiobuf) ||
+           (SCpnt->request.bh && SCpnt->request.kiobuf
+#endif
+           )){
                /* 
-                * Case of page request (i.e. raw device), or unlinked buffer 
-                * Typically used for swapping, but this isn't how we do
-                * swapping any more.
+                * Case of unlinked buffer. Typically used for swapping,
+                * but this isn't how we do swapping any more.
                 */
                panic("I believe this is dead code.  If we hit this, I was wrong");
 #if 0
@@ -820,6 +1029,12 @@
        req = &SCpnt->request;
        /*
         * First we need to know how many scatter gather segments are needed.
+        *
+        * Redundant test, per the comment below indicating that sg_count_valid is
+        * always set to 1 (ll_rw_blk.c's estimate of req->nr_segments is always trusted).
+        *
+        * count is initialized in ll_rw_kio() for the kiobuf path, and since these
+        * requests are never merged, the count stays valid.
         */
        if (!sg_count_valid) {
                count = __count_segments(req, use_clustering, dma_host, NULL);
@@ -843,12 +1058,26 @@
                this_count = SCpnt->request.nr_sectors;
                goto single_segment;
        }
+#if CONFIG_KIOBUF_IO
+       /* Check if the size of the sgl would be greater than the size
+        * of the host sgl table; if so, limit the sgl size.
+        * When the request sectors are harvested after completion of
+        * I/O in __scsi_collect_kio_sectors(), the remaining sectors
+        * will be reinjected into the request queue as a special cmd,
+        * until all of the request's sectors are done.
+        * [[EMAIL PROTECTED]]
+        */
+       if((SCpnt->request.kiobuf != NULL) &&
+          (count > SCpnt->host->sg_tablesize)) {
+               count = SCpnt->host->sg_tablesize - 1;
+       }
+#endif
        SCpnt->use_sg = count;
-
        /* 
         * Allocate the actual scatter-gather table itself.
         * scsi_malloc can only allocate in chunks of 512 bytes 
         */
+       
        SCpnt->sglist_len = (SCpnt->use_sg
                             * sizeof(struct scatterlist) + 511) & ~511;
 
@@ -873,51 +1102,17 @@
        memset(sgpnt, 0, SCpnt->use_sg * sizeof(struct scatterlist));
        SCpnt->request_buffer = (char *) sgpnt;
        SCpnt->request_bufflen = 0;
-       bhprev = NULL;
 
-       for (count = 0, bh = SCpnt->request.bh;
-            bh; bh = bh->b_reqnext) {
-               if (use_clustering && bhprev != NULL) {
-                       if (dma_host &&
-                           virt_to_phys(bhprev->b_data) - 1 == ISA_DMA_THRESHOLD) {
-                               /* Nothing - fall through */
-                       } else if (CONTIGUOUS_BUFFERS(bhprev, bh)) {
-                               /*
-                                * This one is OK.  Let it go.  Note that we
-                                * do not have the ability to allocate
-                                * bounce buffer segments > PAGE_SIZE, so
-                                * for now we limit the thing.
-                                */
-                               if( dma_host ) {
-#ifdef DMA_SEGMENT_SIZE_LIMITED
-                                       if( virt_to_phys(bh->b_data) - 1 < ISA_DMA_THRESHOLD
-                                           || sgpnt[count - 1].length + bh->b_size <= PAGE_SIZE ) {
-                                               sgpnt[count - 1].length += bh->b_size;
-                                               bhprev = bh;
-                                               continue;
+       if (SCpnt->request.bh){
+               count = scsi_bh_sgl(SCpnt, use_clustering, dma_host, sgpnt);
                                        }
-#else
-                                       sgpnt[count - 1].length += bh->b_size;
-                                       bhprev = bh;
-                                       continue;
-#endif
+#if CONFIG_KIOBUF_IO
+       else if (SCpnt->request.kiobuf) {
+               count = scsi_kio_sgl(SCpnt, dma_host, sgpnt);
                                } else {
-                                       sgpnt[count - 1].length += bh->b_size;
-                                       SCpnt->request_bufflen += bh->b_size;
-                                       bhprev = bh;
-                                       continue;
-                               }
-                       }
-               }
-               count++;
-               sgpnt[count - 1].address = bh->b_data;
-               sgpnt[count - 1].length += bh->b_size;
-               if (!dma_host) {
-                       SCpnt->request_bufflen += bh->b_size;
+               panic("Yowza! Both kiobuf and buffer_head pointers are null!");
                }
-               bhprev = bh;
-       }
-
+#endif
        /*
         * Verify that the count is correct.
         */
@@ -1009,6 +1204,18 @@
        SCpnt->use_sg = 0;
        scsi_free(SCpnt->request_buffer, SCpnt->sglist_len);
 
+#if CONFIG_KIOBUF_IO
+       /*
+        * Shouldn't ever get here for a kiobuf request.
+        *
+        * Since each segment is a page, failing to allocate a
+        * bounce buffer for even the first page means that the
+        * DMA buffer pool is exhausted!
+        */
+       if (SCpnt->request.kiobuf){
+               dma_exhausted(SCpnt, 0);
+       }
+#endif 
        /*
         * Make an attempt to pick up as much as we reasonably can.
         * Just keep adding sectors until the pool starts running kind of
@@ -1044,7 +1251,6 @@
         * segment.  Possibly the entire request, or possibly a small
         * chunk of the entire request.
         */
-       bh = SCpnt->request.bh;
        buff = SCpnt->request.buffer;
 
        if (dma_host) {
@@ -1053,7 +1259,7 @@
                 * back and allocate a really small one - enough to satisfy
                 * the first buffer.
                 */
-               if (virt_to_phys(SCpnt->request.bh->b_data)
+               if (virt_to_phys(SCpnt->request.buffer)
                    + (this_count << 9) - 1 > ISA_DMA_THRESHOLD) {
                        buff = (char *) scsi_malloc(this_count << 9);
                        if (!buff) {
--- linux-2.3pure/linux/drivers/scsi/sd.c       Thu May 18 12:22:01 2000
+++ linux-2.3rawio/linux/drivers/scsi/sd.c      Thu May 18 19:20:21 2000
@@ -513,6 +513,7 @@
 static void rw_intr(Scsi_Cmnd * SCpnt)
 {
        int result = SCpnt->result;
+       
 #if CONFIG_SCSI_LOGGING
        char nbuff[6];
 #endif
@@ -544,6 +545,11 @@
                        SCpnt->sense_buffer[6];
                        if (SCpnt->request.bh != NULL)
                                block_sectors = SCpnt->request.bh->b_size >> 9;
+#if CONFIG_KIOBUF_IO
+                       /* Tweak to support kiobuf-based I/O requests, [[EMAIL PROTECTED]] */
+                       if (SCpnt->request.kiobuf != NULL)
+                              block_sectors = SCpnt->request.kiobuf->length >> 9;
+#endif
                        switch (SCpnt->device->sector_size) {
                        case 1024:
                                error_sector <<= 1;
--- linux-2.3pure/linux/include/linux/blkdev.h  Thu May 18 12:22:28 2000
+++ linux-2.3rawio/linux/include/linux/blkdev.h Mon May 15 17:08:24 2000
@@ -7,6 +7,9 @@
 #include <linux/tqueue.h>
 #include <linux/list.h>
 
+#if CONFIG_KIOBUF_IO
+#include <linux/iobuf.h>
+#endif
 struct request_queue;
 typedef struct request_queue request_queue_t;
 
@@ -38,6 +41,9 @@
        void * special;
        char * buffer;
        struct semaphore * sem;
+#if CONFIG_KIOBUF_IO
+       struct kiobuf * kiobuf;
+#endif
        struct buffer_head * bh;
        struct buffer_head * bhtail;
        request_queue_t * q;
--- linux-2.3pure/linux/include/linux/elevator.h        Thu May 18 12:22:28 2000
+++ linux-2.3rawio/linux/include/linux/elevator.h       Mon May 15 15:57:04 2000
@@ -107,6 +107,11 @@
        elevator->sequence++;
        if (req->cmd == READ)
                elevator->read_pendings++;
+#if CONFIG_KIOBUF_IO
+       if (req->kiobuf != NULL) {
+            elevator->nr_segments += req->nr_segments;
+       } else
+#endif      
        elevator->nr_segments++;
 }
 
--- linux-2.3pure/linux/include/linux/fs.h      Thu May 18 12:22:28 2000
+++ linux-2.3rawio/linux/include/linux/fs.h     Mon May 15 15:57:04 2000
@@ -1014,6 +1014,9 @@
 extern struct buffer_head * get_hash_table(kdev_t, int, int);
 extern struct buffer_head * getblk(kdev_t, int, int);
 extern void ll_rw_block(int, int, struct buffer_head * bh[]);
+#if CONFIG_KIOBUF_IO
+extern void ll_rw_kio(int , struct kiobuf *, kdev_t, unsigned long, size_t, int *);
+#endif
 extern int is_read_only(kdev_t);
 extern void __brelse(struct buffer_head *);
 extern inline void brelse(struct buffer_head *buf)
--- linux-2.3pure/linux/include/linux/iobuf.h   Thu May 18 12:22:29 2000
+++ linux-2.3rawio/linux/include/linux/iobuf.h  Mon May 15 17:05:14 2000
@@ -54,6 +54,9 @@
        atomic_t        io_count;       /* IOs still in progress */
        int             errno;          /* Status of completed IO */
        void            (*end_io) (struct kiobuf *); /* Completion callback */
+#if CONFIG_KIOBUF_IO
+       void *k_dev_id;                 /* Store kiovec (or pagebuf) here */
+#endif
        wait_queue_head_t wait_queue;
 };
 
