On Fri, Sep 18, 2009 at 10:40:27AM +0300, Kostik Belousov wrote:

> On Thu, Sep 17, 2009 at 03:26:41PM -0700, Xin LI wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > Hi, Igor,
> > 
> > Igor Sysoev wrote:
> > > Hi,
> > > 
> > > nginx-0.8.15 can use completely non-blocking sendfile() using the
> > > SF_NODISKIO flag. When sendfile() returns EBUSY, nginx calls aio_read()
> > > to read a single byte. The first aio_read() preloads the first 128K part
> > > of a file into the VM cache; however, all successive aio_read()s preload
> > > just 16K parts of the file. This makes non-blocking sendfile() usage
> > > ineffective for files larger than 128K.
> > > 
> > > I've created a small patch adding a Darwin-compatible F_RDAHEAD fcntl:
> > > 
> > >    fcntl(fd, F_RDAHEAD, preload_size)
> > > 
> > > There is a small incompatibility: Darwin's fcntl only allows enabling or
> > > disabling read ahead, while the proposed patch allows setting the exact
> > > preload size.
> > > 
> > > Currently the preload size affects only the vn_read() code path and does
> > > not affect the sendfile() code path. However, it can easily be extended
> > > to the sendfile() part too. The preload size is still limited by the
> > > sysctl vfs.read_max.
> > > 
> > > The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE
> > > only.
> > 
> > I have ported this as a patch against -HEAD (should apply on 8.0-R but
> > it's too late for us to add a new feature) plus a manual page entry
> > documenting the feature.
> > 
> > I've used F_READAHEAD as the name, but reading the manual page, it looks
> > like we could just use F_RDAHEAD, since Darwin seems to distinguish only
> > the 0 and != 0 cases. That way programmers won't have to use #ifdef or
> > something else to get code working on different platforms?
> 
> What I dislike about the patch is the new kernel-private flag that is
> eaten from the open(2) flags namespace. We do already have FHASLOCK,
> so far the only such flag.

The new patch version against 7.2 is attached. Changes:
1) two fcntls: F_READAHEAD and the Darwin-compatible F_RDAHEAD,
2) FRDAHEAD reuses the O_CREAT bit.
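
For anyone who wants to try the two fcntls from userland, here is a minimal
sketch of a caller. set_readahead() is a hypothetical helper and not part of
the patch. It assumes F_READAHEAD takes a byte count (0 disables) and that
F_RDAHEAD only distinguishes zero from non-zero, as described above. On a
system that defines neither command it simply fails with ENOTSUP.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper (not part of the patch): ask for read-ahead of
 * `bytes` on `fd`.  Returns 0 on success, -1 with errno set otherwise.
 */
static int
set_readahead(int fd, long bytes)
{
#if defined(F_READAHEAD)
	/* Patched FreeBSD: exact preload size in bytes, 0 disables. */
	return (fcntl(fd, F_READAHEAD, bytes) == -1 ? -1 : 0);
#elif defined(F_RDAHEAD)
	/* Darwin-style: only zero versus non-zero is significant. */
	return (fcntl(fd, F_RDAHEAD, bytes != 0 ? 1 : 0) == -1 ? -1 : 0);
#else
	/* Neither command exists on this platform. */
	(void)fd;
	(void)bytes;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A caller would open the file and then call set_readahead(fd, 512 * 1024) before
the first read. The effective size is still subject to vfs.read_max.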


-- 
Igor Sysoev
http://sysoev.ru/en/
--- /sys/sys/fcntl.h    2009-06-02 19:05:17.000000000 +0400
+++ /sys/sys/fcntl.h    2009-09-22 16:28:52.000000000 +0400
@@ -132,7 +132,7 @@
 /* bits to save after open */
 #define        FMASK           (FREAD|FWRITE|FAPPEND|FASYNC|FFSYNC|FNONBLOCK|O_DIRECT)
 /* bits settable by fcntl(F_SETFL, ...) */
-#define        FCNTLFLAGS      (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FPOSIXSHM|O_DIRECT)
+#define        FCNTLFLAGS      (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FPOSIXSHM|FRDAHEAD|O_DIRECT)
 #endif
 
 /*
@@ -163,6 +163,9 @@
  * implemented as plain files).
  */
 #define        FPOSIXSHM       O_NOFOLLOW
+
+/* Read ahead */
+#define FRDAHEAD       O_CREAT
 #endif
 
 /*
@@ -187,6 +190,8 @@
 #define        F_SETLK         12              /* set record locking information */
 #define        F_SETLKW        13              /* F_SETLK; wait if blocked */
 #define        F_SETLK_REMOTE  14              /* debugging support for remote locks */
+#define        F_READAHEAD     15              /* read ahead */
+#define        F_RDAHEAD       16              /* Darwin compatible read ahead */
 
 /* file descriptor flags (F_GETFD, F_SETFD) */
 #define        FD_CLOEXEC      1               /* close-on-exec flag */
--- /sys/kern/vfs_vnops.c       2009-06-02 19:05:00.000000000 +0400
+++ /sys/kern/vfs_vnops.c       2009-09-22 14:08:03.000000000 +0400
@@ -305,6 +305,9 @@
 sequential_heuristic(struct uio *uio, struct file *fp)
 {
 
+       if (fp->f_flag & FRDAHEAD)
+               return(fp->f_seqcount << IO_SEQSHIFT);
+
        if ((uio->uio_offset == 0 && fp->f_seqcount > 0) ||
            uio->uio_offset == fp->f_nextoff) {
                /*
--- /sys/kern/kern_descrip.c    2009-08-28 18:50:11.000000000 +0400
+++ /sys/kern/kern_descrip.c    2009-09-22 14:17:47.000000000 +0400
@@ -411,6 +411,7 @@
        u_int newmin;
        int error, flg, tmp;
        int vfslocked;
+       uint64_t bsize;
 
        vfslocked = 0;
        error = 0;
@@ -694,6 +695,35 @@
                vfslocked = 0;
                fdrop(fp, td);
                break;
+
+       case F_RDAHEAD:
+               arg = arg ? 128 * 1024: 0;
+               /* FALLTHROUGH F_READAHEAD */
+
+       case F_READAHEAD:
+               FILEDESC_SLOCK(fdp);
+               if ((fp = fdtofp(fd, fdp)) == NULL) {
+                       FILEDESC_SUNLOCK(fdp);
+                       error = EBADF;
+                       break;
+               }
+               if (fp->f_type != DTYPE_VNODE) {
+                       FILEDESC_SUNLOCK(fdp);
+                       error = EBADF;
+                       break;
+               }
+               FILE_LOCK(fp);
+               if (arg) {
+                       bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
+                       fp->f_seqcount = (arg + bsize - 1) / bsize;
+                       fp->f_flag |= FRDAHEAD;
+               } else {
+                       fp->f_flag &= ~FRDAHEAD;
+               }
+               FILE_UNLOCK(fp);
+               FILEDESC_SUNLOCK(fdp);
+               break;
+
        default:
                error = EINVAL;
                break;
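
A note on the arithmetic in the F_READAHEAD case: the requested size is
rounded up to whole filesystem I/O blocks before it is stored in f_seqcount.
Below is a minimal userland sketch of just that rounding. readahead_blocks()
is a hypothetical name, and the 16K block size used in the comment is only an
illustrative value, since f_iosize depends on the filesystem. The sketch does
not model the vfs.read_max limit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland mirror of the patch's rounding:
 * the number of bsize-sized blocks needed to cover nbytes.
 */
static uint64_t
readahead_blocks(uint64_t nbytes, uint64_t bsize)
{
	assert(bsize > 0);	/* a zero block size would divide by zero */
	/* Ceiling division, same form as (arg + bsize - 1) / bsize. */
	return ((nbytes + bsize - 1) / bsize);
}
```

With a 16K block size, the 128K that F_RDAHEAD substitutes for a non-zero
argument works out to 8 blocks, and a request of 16K plus one byte rounds up
to 2.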
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[email protected]"