Hmm... I may be confused and have to re-read all the past messages (except I've deleted them) but my understanding was it was rapid re- use of inodes within the 1-second resolution of mtime and the same file size, all confusing the cache code that caused the problem.

With the fix below with resolution for mtime at 1 second and the "grace period" at 2 seconds I can see how it would work but it would make me a bit queasy -- fixes which have assumptions of timing can be fragile. With filename-based keys and unique filenames (which would seem like a natural requirement from someone writing similar code), the inode would be ignored and you'd get a consistent view, regardless of timing or what other threads would be up to. I think you could try this approach as quickly as your fix -- just define _WIN32 after the "include nsd.h" to get the filename behavior of Win32. You could run your test and see if it was stable as well -- I'd be curious.

Again, this whole issue is interesting and the problem report quite subtle, justifying some sort of defensive fix but using ns_returnfile for short, dynamic content still seems like the wrong approach. Ideas to use a cache of open fd's via Ns_GetTemp or Tcl channels via ns_returnfp seem closer to what's needed here.

BTW: Which OS is re-using inodes so quickly? I can't get my Mac OS/X laptop to do that -- figured the inode re-use/prediction thing was plugged years ago, e.g., when fsirand was introduced for scrambling NFS vnodes.

-Jim




On Aug 20, 2008, at 12:54 PM, John Caruso wrote:

On Wednesday 08:45 AM 8/20/2008, Jim Davidson wrote:
Overall, it seems one thing to do would be to switch to filename- based
cache keys by default, leaving the dev/inode pair as an option for
folks who run sites with large symlinks and want to benefit from
caching objects just once.  I think that should avoid the data
corruption cases John pointed out with minimal downside.

Actually that wouldn't have fixed the problem in the code that led us to find out about this in the first place. The change that I suggested does fix that problem, though, and it directly addresses the limitation of mtime's one-second granularity--which is the crux of the issue. The patch below (really just a one-line change) implements this fix:

------- 8< --------------------------------------------------------------------- --- aolserver-4.5.0-orig/nsd/fastpath.c 2006-04-19 10:48:47.000000000 -0700 +++ aolserver-4.5.0/nsd/fastpath.c 2008-08-19 21:22:26.000000000 -0700
@@ -507,9 +507,11 @@
    }

    if (servPtr->fastpath.cache == NULL
-           || stPtr->st_size > servPtr->fastpath.cachemaxentry) {
+           || stPtr->st_size > servPtr->fastpath.cachemaxentry
+           || (time(NULL) - stPtr->st_mtime) < 2) {
       /*
-        * Caching is disabled or the entry is too large for the cache
+        * Caching is disabled, the entry is too large for the cache,
+        * or the file was modified too recently to be cached safely,
        * so just open, mmap, and send the content directly.
        */

------- 8< ---------------------------------------------------------------------

We've tested this fix extensively against the code that was hitting the bug before, and I can verify that it resolves the problem there. As far as I can tell this fix would resolve the issue in any standard scenario (and certainly all of the ones I've outlined thus far).

Given that this is a straightforward, user-transparent change that would have only a negligible impact on fastpath caching, and considering the security implications, I'd suggest that this change be applied to the AOLserver code in CVS.

BTW, Jeff, the scenario you'd outlined that you thought would trip this up...:

  13:50:21 - create file
  13:50:21 - serve file (gets cached)
  13:50:21 - delete file
  13:50:21 - create file again (reuses inode)
  ... time passes ...
  13:55:11 - serve file

...actually wouldn't, because the file would NOT be cached in the second line. The whole point of this strategy is that a file won't be cached if it's been modified within the threshold time (2 seconds in the patch above).

- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED] > with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED]> 
with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: 
field of your email blank.

Reply via email to