Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-22 Thread Jim Davidson
Ah -- I (finally) understand... I must have missed the detail re:  
serialization in message #30 out of #60 or so


So, this clarifies to me:

-- cache by filename key is correct and good for most cases and should  
be on by default
-- the grace period is a clever solution for the rapid-changing,  
same filename case you described and deserves to be on by default
-- ns_returnfile shouldn't use the cache but already does -- some  
config and/or command flags can be added to toggle the behavior


I'll update the code with the options above.


-Jim





On Aug 21, 2008, at 11:27 PM, John Caruso wrote:


On Thursday 02:34 PM 8/21/2008, Jim Davidson wrote:
To clarify one point:  There is no technical solution to creating  
temp

files with the same name and avoiding the race condition without
additional synchronization.


To clarify as well: the original code didn't involve a race  
condition--it was effectively serialized, as though it were like  
this snippet:


   foreach object $objects {
   eval exec /some/external/program --output-file $tempfile -- 
object $object

   ns_returnfile 200 text/plain $tempfile
   }

(As I mentioned to you, this was basically a batch process driven by  
a client-side Java applet making sequential HTTP requests to an  
AOLserver-driven API web server, one transaction at a time, with the  
results being returned by ns_returnfile on the server.  Also, the  
temp file in question was in a secure directory.)


So the bug can (and did) manifest itself with serialized access.


So, here's what I'd suggest:

-- Cache by filename key should be the default.  This is technically
the correct fix to enable temporary, uniquely named files, to be
returned via ns_returnfile.
-- John's grace period code is a clever optimization if fastpath is
being used in this way and could also be an option, default off.


Again, this wouldn't have resolved Arena's initial problem; the  
original code would still have hit the bug, and it would have been  
just as difficult to detect that that was happening (though slightly  
easier to debug).  That's why I'd recommend having the mtime  
workaround code active with a default of 1--otherwise people running  
a default config of AOLserver will still be open to the same issue.


(That's my only stake in this, BTW; Arena is already using the mtime  
fix and will continue to do so, but I'd really rather not have  
someone else run into this issue in the future.)


In thinking about it today I realized that it's useful to think  
about the four scenarios in which the bug can currently occur (which  
I believe partition the bug space):


1) Monotonically increasing time with a different filename
2) Monotonically increasing time with the same filename
3) Time travelling with a different filename
4) Time travelling with the same filename

(Time travelling here means mucking with the mtime artifically, a  
la rsync, and filename means fully-qualified filename.)


The mtime workaround resolves scenarios 1 and 2, and using the  
filename as the cache key resolves scenarios 1 and 3.  Nothing  
suggested so far resolves scenario 4--and in fact I don't think it's  
possible to resolve scenario 4 short of a major rewrite of the code  
(like Juan's suggestion of using inotify or similar functionality).   
So combining both fixes resolves all of the resolvable issues.


For reference: the bug occurred in scenario 2, and subsequently in  
scenario 1.  And the security implications apply to all four  
scenarios, though they're arguably worst in scenarios 1 and 3.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-22 Thread Tom Jackson
Jim,

Can I ask why the filename is important for the cache key? With the
cache delay, the inode/dev + *time + size should do it all.

In fact, I finally understood the difference between mtime and ctime, if
any change is made, it should be the change to ctime. 

Why ctime? ctime is unique in that it isn't something that can be set by
user level programs. It changes whenever the content of the file changes
or permissions, owners, or any of the metadata of the file. 

So, for instance, if someone replaces a file with an identical file, the
ctime would still change. If you check the ctime, you can also skip
checking the size. 

But none of this has to do with the filename. On Unix, filenames are
especially squishy. both stat and open follow symbolic links, and you
could therefore use a symlink to point to different files over time, but
the files could have identical mtime, atime, size, owner, etc. using
stat/open and comparing with what is in cache based upon the filename
would never detect this slight of hand. This is why the inode is most
important on unix. On windows, you can't do this, so filenames are safe.

My recommendation for changes are these:

1. use John's ingenious 1-2 second delay, optionally allow config of the
cache delay.

2. base the age for the above on the ctime, not the mtime. ctime is
always younger or the same age as mtime, and covers changes to metadata,
and is immune from easy modification.

Maybe: remove the fastpath config options from the basic config file, if
it is even there, other example configs could be set to cache = off.

Happy Weekend everyone.

tom jackson

On Fri, 2008-08-22 at 13:18 -0400, Jim Davidson wrote:
 Ah -- I (finally) understand... I must have missed the detail re:  
 serialization in message #30 out of #60 or so
 
 So, this clarifies to me:
 
 -- cache by filename key is correct and good for most cases and should  
 be on by default
 -- the grace period is a clever solution for the rapid-changing,  
 same filename case you described and deserves to be on by default
 -- ns_returnfile shouldn't use the cache but already does -- some  
 config and/or command flags can be added to toggle the behavior
 
 I'll update the code with the options above.
 
 
 -Jim
 
 
 
 
 
 On Aug 21, 2008, at 11:27 PM, John Caruso wrote:
 
  On Thursday 02:34 PM 8/21/2008, Jim Davidson wrote:
  To clarify one point:  There is no technical solution to creating  
  temp
  files with the same name and avoiding the race condition without
  additional synchronization.
 
  To clarify as well: the original code didn't involve a race  
  condition--it was effectively serialized, as though it were like  
  this snippet:
 
 foreach object $objects {
 eval exec /some/external/program --output-file $tempfile -- 
  object $object
 ns_returnfile 200 text/plain $tempfile
 }
 
  (As I mentioned to you, this was basically a batch process driven by  
  a client-side Java applet making sequential HTTP requests to an  
  AOLserver-driven API web server, one transaction at a time, with the  
  results being returned by ns_returnfile on the server.  Also, the  
  temp file in question was in a secure directory.)
 
  So the bug can (and did) manifest itself with serialized access.
 
  So, here's what I'd suggest:
 
  -- Cache by filename key should be the default.  This is technically
  the correct fix to enable temporary, uniquely named files, to be
  returned via ns_returnfile.
  -- John's grace period code is a clever optimization if fastpath is
  being used in this way and could also be an option, default off.
 
  Again, this wouldn't have resolved Arena's initial problem; the  
  original code would still have hit the bug, and it would have been  
  just as difficult to detect that that was happening (though slightly  
  easier to debug).  That's why I'd recommend having the mtime  
  workaround code active with a default of 1--otherwise people running  
  a default config of AOLserver will still be open to the same issue.
 
  (That's my only stake in this, BTW; Arena is already using the mtime  
  fix and will continue to do so, but I'd really rather not have  
  someone else run into this issue in the future.)
 
  In thinking about it today I realized that it's useful to think  
  about the four scenarios in which the bug can currently occur (which  
  I believe partition the bug space):
 
  1) Monotonically increasing time with a different filename
  2) Monotonically increasing time with the same filename
  3) Time travelling with a different filename
  4) Time travelling with the same filename
 
  (Time travelling here means mucking with the mtime artifically, a  
  la rsync, and filename means fully-qualified filename.)
 
  The mtime workaround resolves scenarios 1 and 2, and using the  
  filename as the cache key resolves scenarios 1 and 3.  Nothing  
  suggested so far resolves scenario 4--and in fact I don't think it's  
  possible to resolve scenario 4 short of a 

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Don Baccus

On Aug 21, 2008, at 8:14 AM, Dossy Shiobara wrote:


I've remained silent on this issue because I didn't want to be  
accused of stifling the community, etc.


...


 End of discussion.


Accused.  Guilty.


Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Dossy Shiobara
I think there has been more than enough discussion around this issue which I 
did not interfere with nor influence in any way.

However, to let it continue to spin over and over is unproductive.  I'd love to 
hear other solutions other than configurabe cache key strategies or a 
time-based delay caching strategy, but the time to debate whether this is a 
defect or not is officially over: it is a defect. Let's find the right 
technical solution to make fastpath more robust, please.

--Original Message--
From: Don Baccus
Sender: AOLserver Discussion
To: AOLSERVER@LISTSERV.AOL.COM
ReplyTo: AOLserver Discussion
Sent: Aug 21, 2008 12:25 PM
Subject: Re: [AOLSERVER] Data corruption with fastpath caching

On Aug 21, 2008, at 8:14 AM, Dossy Shiobara wrote:

 I've remained silent on this issue because I didn't want to be  
 accused of stifling the community, etc.

...

  End of discussion.

Accused.  Guilty.


Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


-- 
Dossy Shiobara
[EMAIL PROTECTED]


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread John Caruso

On Thursday 08:14 AM 8/21/2008, Dossy Shiobara wrote:
4) I see the simplest (best?) solution here being a configurable 
parameter that controls fastpath's cache key generation.  As Jim points 
out, one can quickly test whether this would solve the problem at hand by 
temporarily #define'ing _WIN32 in the appropriate place.


I'm not sure if I've mentioned this on the list, but the initial case that 
prompted us to discover the bug would not have been helped by this change; 
the main difference is that it would have shortened the debugging 
efforts.  And this change would also alter current behavior in a major 
way, by defeating the explicit goal of having hard-linked files served 
from the same cache entry.  A fix that doesn't resolve the initial problem 
and which has undesirable side effects isn't worth pursuing.


The change I offered is the only one that corrects the problem in the 
original code and all the other examples I've mentioned, transparently and 
without any serious side effects.  Simply put, fastpath as designed should 
not be caching items that have been modified within the current second, 
because mtime's granularity (in combination with inode reuse) doesn't 
allow it to distinguish them from other items in the cache.  Based on what 
Jim's said here, I'm guessing he wasn't thinking much about the mtime 
granularity issue because he thought inodes would be more than enough to 
ensure uniqueness, but that's not always the case (in fact it's usually 
*not* the case, since the two most widely-used filesystems reuse inodes).


In thinking about this further, barring a complete rewrite I don't think 
the fastpath caching mechanism *can* be fixed completely--it can only be 
improved.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Jeff Rogers

Dossy Shiobara wrote:


However, to let it continue to spin over and over is unproductive.
I'd love to hear other solutions other than configurabe cache key
strategies or a time-based delay caching strategy, but the time to
debate whether this is a defect or not is officially over: it is a
defect. Let's find the right technical solution to make fastpath more
robust, please.


The solutions I've seen proposed are

* configurable cache key
+ solves the problem for the most common and surprising case
- does not solve the OP's problem
? what to make the default?

* time-delay
+ solves OP's problem
- still easily broken

* flush files from cache with ns_unlink
- easily defeated (by using different command to remove files)
- links mostly unrelated commands

* exclude some paths from caching (e.g., don't cache if the file is 
outside of pageroot, or is in some configured list of non-cached 
directories, like /tmp)

+ keeps temporary files from taking up cache memory
- still easily broken
- developer needs to know to use excluded paths

* add a -nocache flag to ns_returnfile (or add a new command like 
ns_returntmpfile) to indicate that the file shouldn't be cached

+ makes intent clearer
+ keeps temp files out of cache
- bloats api
- doesn't fix existing cases

* change fastpath default to off to be turned on only when desired and 
understood

+ simple, effective
- default performance hit

* use system-provided change notifications to flush cache (e.g., inotify)
+ effective, efficient
- non-portable

Did I leave any out?

-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Tom Jackson
On Thu, 2008-08-21 at 11:14 -0400, Dossy Shiobara wrote:
 4) I see the simplest (best?) solution here being a configurable 
 parameter that controls fastpath's cache key generation.  As Jim points 
 out, one can quickly test whether this would solve the problem at hand 
 by temporarily #define'ing _WIN32 in the appropriate place.  If this 
 proves successful, we change it from using #ifdef's to regular if() 
 statements and define a new configuration parameter.  End of discussion.

I have responded twice to John's newest patch idea, which is a one line
patch. It appears to completely eliminate any problem with cache
poisoning. It is simple, it doesn't change the semantics of the command
or anything else. It simply works around a known limitation of the stat
mtime granularity. 

The only security issue that was exposed was the misuse of
ns_returnfile. All of the data put into cache were entirely under the
control of the AOLserver process. The developer / maintainer of that
process is responsible for everything the process does. ns_returnfile is
an inherently dangerous API, there is no handholding involved. You have
to understand what it is doing and why it exists. 

In fact, John even pointed out that the original code which wrote out
the contents of the file reused the same name over and over. Assuming
that you can know that the contents of a file have not changed just
because it has the same name, same mtime and same size is an invalid
assumption, it will always be invalid. All caches have the same
limitation. By definition they are not in sync with the true copy.
Anyone who uses a cache needs to understand this. 

So, this is important, John is not interested in the cache, he actually
wants to avoid the cache. So talking about how stuff is stored in the
cache, and under what key, is unimportant for John. He wants to keep his
newly created file from ever getting into the cache. 

And this is where he has a point, a very good one. Why put newly created
files into a cache, if the point of the cache is to handle static files?
We can wait for evidence that it is static. In this case, we can wait
until it is a few seconds old, at least. John's patch does exactly this
and nothing more. It is actually a very ingenious change. 


There is no difference between the inode and the filename under unix.
Both offer equal opportunity to screw up due to a race condition. It can
still happen even in the patched ns_returnfile. Jim mentioned this.
After a file is stat'ed, the open might find a different (maybe
truncated) file. There is no guarantee that you won't get something
else, especially if you have multiple processes/threads creating files
in an non-synchronized way. It is not part of ns_returnfile to guarantee
that the contents/age of a file remains unchanged during the course of
execution, and when you throw in an external process it is nearly
impossible to come up with any code which can provide that guarantee. If
data integrity is really important to you, don't try to provide it using
named files as temporary storage.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread John Caruso

On Thursday 09:25 AM 8/21/2008, Tom Jackson wrote:

Why put newly created
files into a cache, if the point of the cache is to handle static files?
We can wait for evidence that it is static. In this case, we can wait
until it is a few seconds old, at least.


This is a very good point, actually.  Initially I didn't think having the 
value be configurable was a good way to go--it should just be set to 1, 
since that's the threshold value (yeah, I used 2 out of an excess of 
caution, but 1 would have worked fine and would probably have created less 
confusion).


But having this as a configurable (something like fastpath.min_cache_age?) 
would give people the ability to say: do NOT cache anything unless it's at 
least X seconds old.  Which actually does seem like a genuinely useful 
thing on its own (completely independent of this issue), since it would 
allow people to prevent new/dynamic files from kicking good data out of 
the cache.  And enforcing a minimum value of 1 would serve the dual 
purpose of resolving this problem as well.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Titi Alailima
I agree that John's patch is worth doing.  It satisfies both his requirements 
and the stated design goals of fastpath.

The remaining issue is whether something called ns_returnfile which takes a 
pathname as a parameter should have some guarantee that you will return what at 
least at some point was the contents of a file with that pathname.  It's 
perfectly acceptable in dealing with caching systems that the cached value 
could be out of sync, but not that the cached value could be for something 
entirely different from what you were looking for.  Even with the mtime fix 
there's no guarantee that systems which muck around with mtime (such as tar) 
won't cause separate files to collide.  For a contrived example:

1. tar xf foo.tar (creating two files a and b with the same size and same 
mtime)
2. ns_returnfile b
3. Delete files a and b
4. tar xf foo.tar
5. ns_returnfile b (this could return the contents of a because the inode was 
reused)

I don't think this example violates any of the stated principles of using 
ns_returnfile for only static data.  Both a and b could have completely 
stable contents and due to some minor issue of system administration (for 
example) their inodes could end up swapped and the cache poisoned.

So I think we need both fixes, one to eliminate caching unless a certain 
criterion of static-ness has been met, and the other to prevent the cache 
from returning completely unrelated data.  Other caveats about ns_returnfile 
use still apply, and the documentation should reflect them.

Now the only people this wouldn't satisfy are those who are concerned about 
pathnames taking up space in the cache or slowing it down.  The option has been 
suggested to make pathname inclusion optional, though I would advise against it 
unless the configuration option is named in such a way as to indicate its 
unsafe-ness.

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309


 -Original Message-
 From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
 Behalf Of Tom Jackson
 Sent: Thursday, August 21, 2008 12:25 PM
 To: AOLSERVER@LISTSERV.AOL.COM
 Subject: Re: [AOLSERVER] Data corruption with fastpath caching

 On Thu, 2008-08-21 at 11:14 -0400, Dossy Shiobara wrote:
  4) I see the simplest (best?) solution here being a configurable
  parameter that controls fastpath's cache key generation.  As Jim
 points
  out, one can quickly test whether this would solve the problem at
 hand
  by temporarily #define'ing _WIN32 in the appropriate place.  If this
  proves successful, we change it from using #ifdef's to regular if()
  statements and define a new configuration parameter.  End of
 discussion.

 I have responded twice to John's newest patch idea, which is a one line
 patch. It appears to completely eliminate any problem with cache
 poisoning. It is simple, it doesn't change the semantics of the command
 or anything else. It simply works around a known limitation of the stat
 mtime granularity.

 The only security issue that was exposed was the misuse of
 ns_returnfile. All of the data put into cache were entirely under the
 control of the AOLserver process. The developer / maintainer of that
 process is responsible for everything the process does. ns_returnfile
 is
 an inherently dangerous API, there is no handholding involved. You have
 to understand what it is doing and why it exists.

 In fact, John even pointed out that the original code which wrote out
 the contents of the file reused the same name over and over. Assuming
 that you can know that the contents of a file have not changed just
 because it has the same name, same mtime and same size is an invalid
 assumption, it will always be invalid. All caches have the same
 limitation. By definition they are not in sync with the true copy.
 Anyone who uses a cache needs to understand this.

 So, this is important, John is not interested in the cache, he actually
 wants to avoid the cache. So talking about how stuff is stored in the
 cache, and under what key, is unimportant for John. He wants to keep
 his
 newly created file from ever getting into the cache.

 And this is where he has a point, a very good one. Why put newly
 created
 files into a cache, if the point of the cache is to handle static
 files?
 We can wait for evidence that it is static. In this case, we can wait
 until it is a few seconds old, at least. John's patch does exactly this
 and nothing more. It is actually a very ingenious change.


 There is no difference between the inode and the filename under unix.
 Both offer equal opportunity to screw up due to a race condition. It
 can
 still happen even in the patched ns_returnfile. Jim mentioned this.
 After a file is stat'ed, the open might find a different (maybe
 truncated) file. There is no guarantee that you won't get something
 else, especially if you have multiple processes/threads creating files
 in an non-synchronized way. It is not part of ns_returnfile

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Rusty Brooks
I don't have any opinion on the fix, but I think the actual objection to 
using the filename in the fix is that this would cause hard links to 
files, which are for all intents and purposes The Same File, to be 
considered different files by fastpath.  (Hard links have different 
names, but the same inode)


Rusty

Titi Alailima wrote:

I agree that John's patch is worth doing.  It satisfies both his requirements 
and the stated design goals of fastpath.

The remaining issue is whether something called ns_returnfile which takes a 
pathname as a parameter should have some guarantee that you will return what at least at 
some point was the contents of a file with that pathname.  It's perfectly acceptable in 
dealing with caching systems that the cached value could be out of sync, but not that the 
cached value could be for something entirely different from what you were looking for.  
Even with the mtime fix there's no guarantee that systems which muck around with mtime 
(such as tar) won't cause separate files to collide.  For a contrived example:

1. tar xf foo.tar (creating two files a and b with the same size and same 
mtime)
2. ns_returnfile b
3. Delete files a and b
4. tar xf foo.tar
5. ns_returnfile b (this could return the contents of a because the inode was 
reused)

I don't think this example violates any of the stated principles of using ns_returnfile for only 
static data.  Both a and b could have completely stable contents and due 
to some minor issue of system administration (for example) their inodes could end up swapped and the cache 
poisoned.

So I think we need both fixes, one to eliminate caching unless a certain criterion of 
static-ness has been met, and the other to prevent the cache from returning 
completely unrelated data.  Other caveats about ns_returnfile use still apply, and the 
documentation should reflect them.

Now the only people this wouldn't satisfy are those who are concerned about pathnames 
taking up space in the cache or slowing it down.  The option has been suggested to make 
pathname inclusion optional, though I would advise against it unless the configuration 
option is named in such a way as to indicate its unsafe-ness.

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309



-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
Behalf Of Tom Jackson
Sent: Thursday, August 21, 2008 12:25 PM
To: AOLSERVER@LISTSERV.AOL.COM
Subject: Re: [AOLSERVER] Data corruption with fastpath caching

On Thu, 2008-08-21 at 11:14 -0400, Dossy Shiobara wrote:

4) I see the simplest (best?) solution here being a configurable
parameter that controls fastpath's cache key generation.  As Jim

points

out, one can quickly test whether this would solve the problem at

hand

by temporarily #define'ing _WIN32 in the appropriate place.  If this
proves successful, we change it from using #ifdef's to regular if()
statements and define a new configuration parameter.  End of

discussion.

I have responded twice to John's newest patch idea, which is a one line
patch. It appears to completely eliminate any problem with cache
poisoning. It is simple, it doesn't change the semantics of the command
or anything else. It simply works around a known limitation of the stat
mtime granularity.

The only security issue that was exposed was the misuse of
ns_returnfile. All of the data put into cache were entirely under the
control of the AOLserver process. The developer / maintainer of that
process is responsible for everything the process does. ns_returnfile
is
an inherently dangerous API, there is no handholding involved. You have
to understand what it is doing and why it exists.

In fact, John even pointed out that the original code which wrote out
the contents of the file reused the same name over and over. Assuming
that you can know that the contents of a file have not changed just
because it has the same name, same mtime and same size is an invalid
assumption, it will always be invalid. All caches have the same
limitation. By definition they are not in sync with the true copy.
Anyone who uses a cache needs to understand this.

So, this is important, John is not interested in the cache, he actually
wants to avoid the cache. So talking about how stuff is stored in the
cache, and under what key, is unimportant for John. He wants to keep
his
newly created file from ever getting into the cache.

And this is where he has a point, a very good one. Why put newly
created
files into a cache, if the point of the cache is to handle static
files?
We can wait for evidence that it is static. In this case, we can wait
until it is a few seconds old, at least. John's patch does exactly this
and nothing more. It is actually a very ingenious change.


There is no difference between the inode and the filename under unix.
Both offer equal opportunity to screw up due to a race condition. It
can
still happen even in the patched ns_returnfile

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Jeff Rogers

Titi Alailima wrote:


what you were looking for.  Even with the mtime fix there's no
guarantee that systems which muck around with mtime (such as tar)
won't cause separate files to collide.  For a contrived example:


I think the best you can do is to use ctime instead of mtime,  or maybe 
btime on *bsd.  You can still run into problems if you have clock skew, 
but there's only so much you can account for.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Titi Alailima
Right, I forgot that one.  But the potential resolution is the same, allow a 
configurable unsafe mode.  If you want the inode-only optimization and are 
willing to take on the resulting unpredictability of ns_returnfile, go right 
ahead.  But the majority of developers who don't know or don't care will have a 
much safer experience.

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309


 -Original Message-
 From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
 Behalf Of Rusty Brooks
 Sent: Thursday, August 21, 2008 3:25 PM
 To: AOLSERVER@LISTSERV.AOL.COM
 Subject: Re: [AOLSERVER] Data corruption with fastpath caching

 I don't have any opinion on the fix, but I think the actual objection
 to
 using the filename in the fix is that this would cause hard links to
 files, which are for all intents and purposes The Same File, to be
 considered different files by fastpath.  (Hard links have different
 names, but the same inode)

 Rusty


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Tom Jackson
I'm lost. If you are interested in serving the same content, mtime tells
you the last time the content was modified. ctime changes for reasons
all unrelated to the content. 

But this is a cache, which is a copy. There is never any way to
guarantee that the content is the same as what is currently on disk,
unless you compare the files directly. Of course that totally negates
the purpose of the cache. If this is what you want, you should disable
the cache completely.

BTW, if you just delete the fastpath config parameters from the config
file fastpath will be disabled, so it is disabled by default right now. 

I'm also wondering if the inode/dev key just catches hard links. I think
it also works via indirection with symlinks? Both stat and open follow
symbolic links, so the inode is probably more stable than the filename
on unix. 

Anyway, trying to guarantee anything about files is a lost cause. 

tom jackson

On Thu, 2008-08-21 at 12:46 -0700, Jeff Rogers wrote: 
 Titi Alailima wrote:
 
  what you were looking for.  Even with the mtime fix there's no
  guarantee that systems which muck around with mtime (such as tar)
  won't cause separate files to collide.  For a contrived example:
 
 I think the best you can do is to use ctime instead of mtime,  or maybe 
 btime on *bsd.  You can still run into problems if you have clock skew, 
 but there's only so much you can account for.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Jim Davidson
 with the same  
size and same mtime)

2. ns_returnfile b
3. Delete files a and b
4. tar xf foo.tar
5. ns_returnfile b (this could return the contents of a because  
the inode was reused)
I don't think this example violates any of the stated principles of  
using ns_returnfile for only static data.  Both a and b could  
have completely stable contents and due to some minor issue of  
system administration (for example) their inodes could end up  
swapped and the cache poisoned.
So I think we need both fixes, one to eliminate caching unless a  
certain criterion of static-ness has been met, and the other to  
prevent the cache from returning completely unrelated data.  Other  
caveats about ns_returnfile use still apply, and the documentation  
should reflect them.
Now the only people this wouldn't satisfy are those who are  
concerned about pathnames taking up space in the cache or slowing  
it down.  The option has been suggested to make pathname inclusion  
optional, though I would advise against it unless the configuration  
option is named in such a way as to indicate its unsafe-ness.

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309

-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
Behalf Of Tom Jackson
Sent: Thursday, August 21, 2008 12:25 PM
To: AOLSERVER@LISTSERV.AOL.COM
Subject: Re: [AOLSERVER] Data corruption with fastpath caching

On Thu, 2008-08-21 at 11:14 -0400, Dossy Shiobara wrote:

4) I see the simplest (best?) solution here being a configurable
parameter that controls fastpath's cache key generation.  As Jim

points

out, one can quickly test whether this would solve the problem at

hand
by temporarily #define'ing _WIN32 in the appropriate place.  If  
this

proves successful, we change it from using #ifdef's to regular if()
statements and define a new configuration parameter.  End of

discussion.

I have responded twice to John's newest patch idea, which is a one  
line

patch. It appears to completely eliminate any problem with cache
poisoning. It is simple, it doesn't change the semantics of the  
command
or anything else. It simply works around a known limitation of the  
stat

mtime granularity.

The only security issue that was exposed was the misuse of
ns_returnfile. All of the data put into cache were entirely under  
the

control of the AOLserver process. The developer / maintainer of that
process is responsible for everything the process does.  
ns_returnfile

is
an inherently dangerous API, there is no handholding involved. You  
have

to understand what it is doing and why it exists.

In fact, John even pointed out that the original code which wrote  
out
the contents of the file reused the same name over and over.  
Assuming

that you can know that the contents of a file have not changed just
because it has the same name, same mtime and same size is an invalid
assumption, it will always be invalid. All caches have the same
limitation. By definition they are not in sync with the true copy.
Anyone who uses a cache needs to understand this.

So, this is important, John is not interested in the cache, he  
actually
wants to avoid the cache. So talking about how stuff is stored in  
the

cache, and under what key, is unimportant for John. He wants to keep
his
newly created file from ever getting into the cache.

And this is where he has a point, a very good one. Why put newly
created
files into a cache, if the point of the cache is to handle static
files?
We can wait for evidence that it is static. In this case, we can  
wait
until it is a few seconds old, at least. John's patch does exactly  
this

and nothing more. It is actually a very ingenious change.


There is no difference between the inode and the filename under  
unix.

Both offer equal opportunity to screw up due to a race condition. It
can
still happen even in the patched ns_returnfile. Jim mentioned this.
After a file is stat'ed, the open might find a different (maybe
truncated) file. There is no guarantee that you won't get something
else, especially if you have multiple processes/threads creating  
files

in an non-synchronized way. It is not part of ns_returnfile to
guarantee
that the contents/age of a file remains unchanged during the  
course of

execution, and when you throw in an external process it is nearly
impossible to come up with any code which can provide that  
guarantee.

If
data integrity is really important to you, don't try to provide it
using
named files as temporary storage.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to
[EMAIL PROTECTED] with the
body of SIGNOFF AOLSERVER in the email message. You can leave the
Subject: field of your email blank.

--
AOLserver - http://www.aolserver.com/
To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread John Caruso

On Thursday 12:08 PM 8/21/2008, Titi Alailima wrote:
It's perfectly acceptable in dealing with caching systems that the cached 
value could be out of sync, but not that the cached value could be for 
something entirely different from what you were looking for.


Yep.  I think that aspect of the issue has been getting lost--it's not 
just about getting stale data from a given file, but getting data from an 
entirely *different* file, which I'd agree violates any reasonable 
expectation of a caching system.


So I think we need both fixes, one to eliminate caching unless a certain 
criterion of static-ness has been met, and the other to prevent the 
cache from returning completely unrelated data.


You make a good point.  The fix I suggested is intended to make fastpath 
caching behave well in all cases where time proceeds monotonically (which 
I'd guess is by far the most common use case, especially for a web server 
that's unlikely to call utilities like tar/rsync that would munge file 
times).  In fact that's essentially what I mean by pathological.  But to 
protect against the time-travelling scenarios causing fastpath to confuse 
two different files, you'd have to use the filename-as-key fix as well.


Using the filename as a key is a bummer for sites like AOL that want the 
cache to respect hard links when serving data from the cache, though.  It 
won't matter for us either way, since we're not that concerned about the 
cache in the first place--just about the ways in which it might return the 
wrong data.


(In case anyone's wondering: Arena's web application pre-caches large 
amounts of data when it starts, and web servers aren't put into rotation 
until they've finished this pre-caching step.  The web servers are also 
bathed in RAM, so it's unlikely that fastpath caching is offering much of 
a performance boost over the Linux page cache--especially when that time 
is set against the overhead of database access.)


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Tom Jackson
On Thu, 2008-08-21 at 17:34 -0400, Jim Davidson wrote:
 So, technically this was a case where  
 we dynamically created code which was later read by ADP (which had the  
 same dev/inode cache stuff as fastpath).  However, this was done  
 carefully:
 
 -- Tcl-level mutex/condition variables to ensure only one thread did  
 the hard work even if several were interested in the result
 -- Careful write to a non .adp extension, unique temp file
 -- Atomic rename in place when ready
 
 It was a combination of traditional atomic Unix filesystem semantics  
 and newer thread synchronization at the Tcl level used to avoid ever  
 getting some mutant result.

Here is an example using Tcl level commands (although a hidden use of
mutex/condition vars:

# save datastore to file
# Note: a data.tmp file is created. If writing to this
# succeeds, this is renamed to the data file, hopefully
# atomically replacing it 
# Note2: the data.tmp file is not a lock file, it is used
# to avoid a half written file in the event of power loss
# or process exit.
proc ::datastore::save { store } {



set LockID [lock $store]

if {[catch {
set FD [open ${dataFileroot}.tmp w+]
fconfigure $FD -translation binary -encoding binary
puts $FD $out
close $FD
file rename -force ${dataFileroot}.tmp $dataFileroot
} err ]} {
unlock $store $LockID
error $err ::datastore::save error saving store $store
}

unlock $store $LockID

}

But it is very difficult (impossible) to safely read/write files unless
you can synchronize access (you need cooperation) and/or use atomic
file operations (serialize access). The above example uses both.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-21 Thread Tom Jackson
On Thu, 2008-08-21 at 13:52 -0700, John Caruso wrote:
 On Thursday 12:08 PM 8/21/2008, Titi Alailima wrote:
 It's perfectly acceptable in dealing with caching systems that the cached 
 value could be out of sync, but not that the cached value could be for 
 something entirely different from what you were looking for.
 
 Yep.  I think that aspect of the issue has been getting lost--it's not 
 just about getting stale data from a given file, but getting data from an 
 entirely *different* file, which I'd agree violates any reasonable 
 expectation of a caching system.

John, 

Your patch fixes this issue as best it can be fixed. The issue that Titi
is addressing cannot be fixed. With your patch you can be sure of the
following:

1. All cache entries are unique. You can't create and cache two files
with the same inode and mtime and have both be over 1 second old. 

2. When an inode is reused (by the filesystem) the associated file mtime
will be larger than the mtime of the cached entry. The new entry will
replace the old entry. 

3. If several files point to the same inode, updating this entry updates
all of them (there is only one cache entry if the inode is used.)

4. If a file changes on disk, and it is less than 2 sec old, it is
served directly, skipping the cache.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Juan José del Río
On Tue, 2008-08-19 at 18:25 -0700, John Caruso wrote:
 On Tuesday 05:59 PM 8/19/2008, Juan José del Río wrote:
 If you don't want to deactivate it, and have some C skills, I would
 recommend you to make the needed changes to fastpath code to enable it
 to use the kernel facilities of the operating system (in case you're
 using linux, then that'll be epoll system call; in FreeBSD case it's
 kqueue; etc.).
 
 This is an interesting suggestion, but from a 
 quick scan of the epoll man page it doesn't look 
 like it would work in this case since it acts on 
 an open file descriptor, but fastpath associates 
 file data with a (dev, inode, mtime, size) tuple 
 without keeping an open file descriptor (and it'd 
 be pretty wonky for AOLserver to keep open file 
 descriptors for all files currently in the fastpath cache).
 
 No matter, though, we've got plenty of 
 workarounds, and we'll probably just disable 
 fastpath entirely since the benefits are likely vanishingly small anyway.

Sorry John, i said epoll, but i meant inotify.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Titi Alailima
Could someone document ns_returnfp while we're talking about it?

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309


 -Original Message-
 From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
 Behalf Of Jim Davidson
 Sent: Tuesday, August 19, 2008 8:39 PM
 To: AOLSERVER@LISTSERV.AOL.COM
 Subject: Re: [AOLSERVER] Data corruption with fastpath caching

 Your right, the code snippet below could trip over a race condition as
 you've described.  But, that's not reason enough to change the
 fastpath, it's reason to better document the behavior so folks don't
 write code that uses ns_returnfile for temporary, dynamic content.
 Although fastpath takes care to be correct in most cases (e.g.,
 stat'ing the file on each request and serializing read on cache miss),
 the fast in fastpath is because it's primarily designed to return
 simple static content with minimal overhead.

 BTW:  I believe the ns_returnfile command didn't use the fastpath
 originally -- I think it just opened and sent the content.  It was
 changed because folks asked for it to go faster I think -- can't
 recall.

 Anyway, for your app, it might be easiest to not change your code but
 instead write a new ns_returnfile to override the builtin -- maybe
 just with open and ns_returnfp.

 -Jim




 On Aug 19, 2008, at 4:00 PM, John Caruso wrote:

  On Monday 05:53 PM 8/18/2008, Jeff Rogers wrote:
  russell muetzelfeldt wrote:
  fastpath is making assumptions about what means something is the
  same file, and those assumptions are not consistent with unix
  filesystem semantics - how is this not a bug?
 
  It's not a bug because no one ever said that it *was* strictly
  following unix filesystem semantics, which isn't even a single
  thing (ufs is slightly different than nfs, is slightly different
  than ext2 -noatime, is slightly different than afs, etc.)  It is
  following a particular definition: if the file still exists and has
  the same dev/inode/mtime/size as it did when you last checked, then
  it is the same file.   This of it as a if-modified-since or if-none-
  match conditional GET.
 
  Actually that's not analogous, for the same reason that the
  analogies to caching of attributes in NFS, rsync or tar not noticing
  content changes if attributes stay the same, etc, don't apply:
  because this bug can happen *even with two files that have
  completely different names or paths*.  Again, in this example...:
 
set file [open /var/tmp/myfile w]
puts $file ABC123
close $file
ns_returnfile 200 text/plain /var/tmp/myfile
ns_unlink -nocomplain /var/tmp/myfile
 
set file [open /var/tmp/myotherfile w]
puts $file XYZ987
close $file
ns_returnfile 200 text/plain /var/tmp/myotherfile
ns_unlink -nocomplain /var/tmp/myotherfile
 
  ...AOLserver will almost always return the contents of /var/tmp/
  myfile rather than /var/tmp/myotherfile in response to the second
  ns_returnfile.
 
  I think the analogies to other systems aren't really germane anyway--
  AOLserver's behavior has to be judged on its own merits.  But
  adopting that standard, I can't think of any other program that
  would confuse /var/tmp/myfile with /var/tmp/myotherfile.
 
  - John
 
 
  --
  AOLserver - http://www.aolserver.com/
 
  To Remove yourself from this list, simply send an email to
 [EMAIL PROTECTED]
   with the
  body of SIGNOFF AOLSERVER in the email message. You can leave the
  Subject: field of your email blank.


 --
 AOLserver - http://www.aolserver.com/

 To Remove yourself from this list, simply send an email to
 [EMAIL PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the
 Subject: field of your email blank.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Gustaf Neumann

Titi Alailima schrieb:

This sounds like the problem.  Not a bug with fastpath, 

Come on folks, the discussion wether or the behavior shown by John Caruso
is a bug or not is completely fruitless. Most aolserver users are not 
novices,

and if it takes some of us several weeks to find, what the problem is,
we should act and not insist, that it is no bug.

In my opinion, switching the caching index from inodes to file names 
(like in windows)
is a very reasonable solution getting rid of most problems (although 
less cache efficient).


It would be additionally a nice feature to provide a configuration 
option for getting

back the current behavior (for people with tons of links). This option would
guarantee backward compatibiity. This way, one could savely let fastpath
switched on by default.

-gustaf neumann


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread John Caruso

On Wednesday 08:45 AM 8/20/2008, Jim Davidson wrote:

Overall, it seems one thing to do would be to switch to filename-based
cache keys by default, leaving the dev/inode pair as an option for
folks who run sites with large symlinks and want to benefit from
caching objects just once.  I think that should avoid the data
corruption cases John pointed out with minimal downside.


Actually that wouldn't have fixed the problem in the code that led us to 
find out about this in the first place.  The change that I suggested does 
fix that problem, though, and it directly addresses the limitation of 
mtime's one-second granularity--which is the crux of the issue.  The patch 
below (really just a one-line change) implements this fix:


--- 8 
-
--- aolserver-4.5.0-orig/nsd/fastpath.c 2006-04-19 10:48:47.0 
-0700
+++ aolserver-4.5.0/nsd/fastpath.c  2008-08-19 21:22:26.0 
-0700

@@ -507,9 +507,11 @@
 }

 if (servPtr-fastpath.cache == NULL
-   || stPtr-st_size  servPtr-fastpath.cachemaxentry) {
+   || stPtr-st_size  servPtr-fastpath.cachemaxentry
+   || (time(NULL) - stPtr-st_mtime)  2) {
/*
-* Caching is disabled or the entry is too large for the cache
+* Caching is disabled, the entry is too large for the cache,
+* or the file was modified too recently to be cached safely,
 * so just open, mmap, and send the content directly.
 */

--- 8 
-


We've tested this fix extensively against the code that was hitting the 
bug before, and I can verify that it resolves the problem there.  As far 
as I can tell this fix would resolve the issue in any standard scenario 
(and certainly all of the ones I've outlined thus far).


Given that this is a straightforward, user-transparent change that would 
have only a negligible impact on fastpath caching, and considering the 
security implications, I'd suggest that this change be applied to the 
AOLserver code in CVS.


BTW, Jeff, the scenario you'd outlined that you thought would trip this 
up...:


   13:50:21 - create file
   13:50:21 - serve file (gets cached)
   13:50:21 - delete file
   13:50:21 - create file again (reuses inode)
   ... time passes ...
   13:55:11 - serve file

...actually wouldn't, because the file would NOT be cached in the second 
line.  The whole point of this strategy is that a file won't be cached if 
it's been modified within the threshold time (2 seconds in the patch 
above).


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Derek

On Aug 20, 2008, at 11:54 AM, John Caruso wrote:

Actually that wouldn't have fixed the problem in the code that led  
us to find out about this in the first place.  The change that I  
suggested does fix that problem, though, and it directly addresses  
the limitation of mtime's one-second granularity--which is the crux  
of the issue.  The patch below (really just a one-line change)  
implements this fix:


all this traffic to use functionality in a way that wasn't intended.

i fail to see how the right problem is being solved here.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jeff Rogers

John Caruso wrote:
BTW, Jeff, the scenario you'd outlined that you thought would trip this 
up...:


   13:50:21 - create file
   13:50:21 - serve file (gets cached)
   13:50:21 - delete file
   13:50:21 - create file again (reuses inode)
   ... time passes ...
   13:55:11 - serve file

...actually wouldn't, because the file would NOT be cached in the second 
line.  The whole point of this strategy is that a file won't be cached 
if it's been modified within the threshold time (2 seconds in the patch 
above).


Fine, then change that first timestamp to 13:50:18 (say if you ran 
another external program after creating the file but before serving it 
that took more than 2 seconds, or if your external program backdated the 
file mtime.)  It's still a race condition that you'll hit if all the 
stars are in the wrong place.  And it still hurts the optimization of 
using a 404 adp page to generate a heavyweight file only once that gets 
cached.


If your patch solves your problem, that's great, and that's the whole 
point of OSS.  But it does nothing to solve the problem generally and 
has negative side effects, so I think it would be a mistake to add it to 
the general distribution.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jim Davidson
Hmm... I may be confused and have to re-read all the past messages  
(except I've deleted them) but my understanding was it was rapid re- 
use of inodes within the 1-second resolution of mtime and the same  
file size, all confusing the cache code that caused the problem.


With the fix below with resolution for mtime at 1 second and the  
grace period at 2 seconds I  can see how it would work but it would  
make me a bit queasy -- fixes which have assumptions of timing can be  
fragile.With filename-based keys and unique filenames (which would  
seem like a natural requirement from someone writing similar code),  
the inode would be ignored and you'd get a consistent view, regardless  
of timing or what other threads would be up to.  I think you could try  
this approach as quickly as your fix -- just define _WIN32 after the  
include nsd.h to get the filename behavior of Win32.  You could run  
your test and see if it was stable as well -- I'd be curious.


Again, this whole issue is interesting and the problem report quite  
subtle, justifying some sort of defensive fix but using ns_returnfile  
for short, dynamic content still seems like the wrong approach.  Ideas  
to use a cache of open fd's via Ns_GetTemp or Tcl channels via  
ns_returnfp seem closer to what's needed here.


BTW: Which OS is re-using inodes so quickly?  I can't get my Mac OS/X  
laptop to do that -- figured the inode re-use/prediction thing was  
plugged years ago, e.g., when fsirand was introduced for scrambling  
NFS vnodes.


-Jim




On Aug 20, 2008, at 12:54 PM, John Caruso wrote:


On Wednesday 08:45 AM 8/20/2008, Jim Davidson wrote:
Overall, it seems one thing to do would be to switch to filename- 
based

cache keys by default, leaving the dev/inode pair as an option for
folks who run sites with large symlinks and want to benefit from
caching objects just once.  I think that should avoid the data
corruption cases John pointed out with minimal downside.


Actually that wouldn't have fixed the problem in the code that led  
us to find out about this in the first place.  The change that I  
suggested does fix that problem, though, and it directly addresses  
the limitation of mtime's one-second granularity--which is the crux  
of the issue.  The patch below (really just a one-line change)  
implements this fix:


--- 8  
-
--- aolserver-4.5.0-orig/nsd/fastpath.c 2006-04-19  
10:48:47.0 -0700
+++ aolserver-4.5.0/nsd/fastpath.c  2008-08-19  
21:22:26.0 -0700

@@ -507,9 +507,11 @@
}

if (servPtr-fastpath.cache == NULL
-   || stPtr-st_size  servPtr-fastpath.cachemaxentry) {
+   || stPtr-st_size  servPtr-fastpath.cachemaxentry
+   || (time(NULL) - stPtr-st_mtime)  2) {
   /*
-* Caching is disabled or the entry is too large for the cache
+* Caching is disabled, the entry is too large for the cache,
+* or the file was modified too recently to be cached safely,
* so just open, mmap, and send the content directly.
*/

--- 8  
-


We've tested this fix extensively against the code that was hitting  
the bug before, and I can verify that it resolves the problem  
there.  As far as I can tell this fix would resolve the issue in any  
standard scenario (and certainly all of the ones I've outlined thus  
far).


Given that this is a straightforward, user-transparent change that  
would have only a negligible impact on fastpath caching, and  
considering the security implications, I'd suggest that this change  
be applied to the AOLserver code in CVS.


BTW, Jeff, the scenario you'd outlined that you thought would trip  
this up...:


  13:50:21 - create file
  13:50:21 - serve file (gets cached)
  13:50:21 - delete file
  13:50:21 - create file again (reuses inode)
  ... time passes ...
  13:55:11 - serve file

...actually wouldn't, because the file would NOT be cached in the  
second line.  The whole point of this strategy is that a file won't  
be cached if it's been modified within the threshold time (2 seconds  
in the patch above).


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jeff Rogers

Jim Davidson wrote:

BTW: Which OS is re-using inodes so quickly?  I can't get my Mac OS/X 
laptop to do that -- figured the inode re-use/prediction thing was 
plugged years ago, e.g., when fsirand was introduced for scrambling NFS 
vnodes.


Linux.  This tcl page:

set fn /tmp/tmpfile[expr rand()]
set f [open $fn w]
puts $f [ns_queryget data]
close $f
after 2
ns_returnfile 200 text/plain $fn
ns_unlink $fn

being hit at the same time in 2 windows:
$ while true; do res=`curl -s http://localhost:8000/crap.tcl?data=wxyz`; 
if [ $res != 'wxyz' ]; then echo $res; break ; fi; echo -n . ; done


$ while true; do res=`curl -s http://localhost:8000/crap.tcl?data=wxyz`; 
if [ $res != 'wxyz' ]; then echo $res; break ; fi; echo -n . ; done


Will cause one or the other test script to get the error typically 
withing 5 requests.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread John Caruso

On Wednesday 10:58 AM 8/20/2008, Jim Davidson wrote:

With the fix below with resolution for mtime at 1 second and the
grace period at 2 seconds I  can see how it would work but it would
make me a bit queasy -- fixes which have assumptions of timing can be
fragile.


Perhaps, but not in this case.  This is just another way to circumvent use 
of the cache on an object-by-object basis (like the size restriction 
that's already there), which recognizes the fact that using mtime as a 
determiner of uniqueness is limited by the fact that mtime has a 
granularity of only one-second.  As for fragility, the fastpath algorithm 
is fragile *now*, thanks precisely to its assumptions about timing.  This 
very simple change removes the bulk of that fragility.



Again, this whole issue is interesting and the problem report quite
subtle, justifying some sort of defensive fix but using ns_returnfile
for short, dynamic content still seems like the wrong approach.


Whether or not that's so, the fact is that everyone on this list appeared 
to share the same utterly natural assumption that ns_returnfile X really 
will return file X--which turns out to be untrue solely because of 
fastpath caching's design limitation.  This fix resolves that design 
limitation in any standard circumstance.



BTW: Which OS is re-using inodes so quickly?


The ext3 filesystem on Linux and the ufs filesystem on Solaris both re-use 
inodes in this way.


Jeff,

On Wednesday 10:56 AM 8/20/2008, Jeff Rogers wrote:

John Caruso wrote:
BTW, Jeff, the scenario you'd outlined that you thought would trip this 
up...:

   13:50:21 - create file
   13:50:21 - serve file (gets cached)
   13:50:21 - delete file
   13:50:21 - create file again (reuses inode)
   ... time passes ...
   13:55:11 - serve file
...actually wouldn't, because the file would NOT be cached in the second 
line.  The whole point of this strategy is that a file won't be cached 
if it's been modified within the threshold time (2 seconds in the patch 
above).


Fine, then change that first timestamp to 13:50:18 [...]


No, you're still not understanding how the patch works.  If you change the 
first timestamp to 13:50:18, the file will indeed be cached at 
13:50:21--but with an mtime of 13:50:18.  When the new file is served at 
13:55:11, it will *not* result in a cache hit because the mtime will be 
different.  That's exactly how the patch fixes this issue.


If your patch solves your problem, that's great, and that's the whole 
point of OSS.  But it does nothing to solve the problem generally and has 
negative side effects, so I think it would be a mistake to add it to the 
general distribution.


I'm surprised you're taking such an all-or-nothing view now, given that 
you started out being open to discussion.  This patch certainly does solve 
the problem generally--in all but what I'd say are pathological cases, and 
certainly in any standard usage (like the various examples I've 
posted).  And it does it by directly addressing the fastpath algorithm's 
reliance on mtime, which has only one-second granularity.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread John Caruso

By the way, Jeff, regarding this...:

On Wednesday 10:56 AM 8/20/2008, Jeff Rogers wrote:
And it still hurts the optimization of using a 404 adp page to generate a 
heavyweight file only once that gets cached.


...which you'd explained elsewhere as...:

There is also at least one clever optimization where static content 
does get served within a second of being created, where the 404 page is 
used to generate something like an image from something like a database 
and writes it to a file where it is subsequently served by fastpath.


...this fix doesn't break this functionality.  You can still do it and 
it'll still work.  And in fact, others have been arguing (and I believe 
you've been agreeing) that serving anything other than truly static 
content with ns_returnfile is immoral anyway--so it seems more than a bit 
contradictory to use that exact case to argue against the fix.


In any case, though, assuming that this once-generated heavyweight file is 
actually reused multiple times, it *will* be cached...just not for 1-2 
seconds.  I think the negligible cost of not having caching for 
just-created files for a period of 1-2 seconds is more than justified by 
the necessity to patch such a serious data corruption and security hole.


(I used two seconds out of an excess of caution, BTW; one would be 
sufficient.)


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Derek

On Aug 20, 2008, at 1:29 PM, John Caruso wrote:

Whether or not that's so, the fact is that everyone on this list  
appeared to share the same utterly natural assumption that  
ns_returnfile X really will return file X--which turns out to be  
untrue solely because of fastpath caching's design limitation.  This  
fix resolves that design limitation in any standard circumstance.


use ns_returnfile for static data as it was intended or you will have  
problems.


how is this thread still alive.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Tom Jackson
John,

The patch below is not at all unreasonable as far as your stated goal of
not caching a newly modified file.

BTW, I think that fastpath is not enabled by default, your config has to
have the fastpath config section with the cache parameter set to true.

Also, ns_returnfile is really just an internal C api which has been
exposed to Tcl so that developers can create specific file handlers. In
general it is dangerous to serve content from outside of pageroot. But
that just means that the developer has to put a lot of care into
avoiding problems. In other words, ns_returnfile is not really intended
as an end user API. It usually ends up as some more friendly procedure
that handles things like relative paths, security, etc. 

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Eric Larkin
On 8/20/08 11:29 AM, John Caruso [EMAIL PROTECTED] wrote:
 Whether or not that's so, the fact is that everyone on this list appeared
 to share the same utterly natural assumption that ns_returnfile X really
 will return file X

All, I've been on vacation or I would have chimed in earlier, but as John's
client and CTO of the company who found the problem (and is now faced with a
fairly extensive and difficult impact assessment to determine whether the
confidentiality and integrity of our customers' data has been compromised),
I find the suggestion that this is not a bug to be utterly baffling.
Perhaps if the procedure in question was called ns_returnfromcache, I
could see the arguments against the behavior being considered a bug, but the
name of the procedure is ns_returnfile, and it takes an argument which is
a filename.  Our objective in using the procedure was not to return a
dynamic file through the cache, it was to return a dynamically generated
file (which was produced by an exec of an OS-level command) from the
filesystem...and the documentation for the procedure certainly did not
suggest that its functionality did not support this usage.

Obviously we'll work around the problem in the future, but it is
disheartening to find a fairly subtle bug, report it with a reproducible
test case, and be challenged so aggressively on the whether it was a poor
decision to use ns_returnfile to...um...return a file.

Eric
__
Eric Larkin
Chief Technology Officer
Arena Solutions

[EMAIL PROTECTED]
4100 E. Third Ave.| Suite 300 | Foster City | CA 94404
tel: 650.513.3502 | fax: 650.513.3511


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jeff Rogers

John Caruso wrote:

 No, you're still not understanding how the patch works.

Ok, I'll admit that I misread it at first, but you're also not 
understanding why I'm saying why it will still break.



I'm surprised you're taking such an all-or-nothing view now,


I don't think I'm taking an all-or-nothing view at all, I just think 
your solution isn't the right one.


given that 
you started out being open to discussion.  This patch certainly does 
solve the problem generally--in all but what I'd say are pathological 
cases, and certainly in any standard usage (like the various examples 
I've posted).  And it does it by directly addressing the fastpath 
algorithm's reliance on  mtime, which has only one-second granularity.


pathological cases are exactly the problem.  I'm only speaking for 
myself of course, but I suspect that at least a few others would agree 
that your case is pathological itself, and not at all standard usage.


I can very easily come up with a scenario that breaks your patched 
fastpath just as easily as the original, to which you can rightly say, 
but why would you do it that way?.  And you would be right.


That is the exact same thing that has been said repeatedly on this 
thread to you: why are you doing it that way?  You probably have valid 
reasons and in any case I'm in no position to question your reasons. 
That doesn't make your case any less pathological than some other one.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jeff Rogers

Derek wrote:


how is this thread still alive.


I think this bikeshed should be painted blue.

See http://www.bikeshed.com/ if you don't understand this.

-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Tom Jackson
On Wed, 2008-08-20 at 12:23 -0700, Eric Larkin wrote:
 On 8/20/08 11:29 AM, John Caruso [EMAIL PROTECTED] wrote:
  Whether or not that's so, the fact is that everyone on this list appeared
  to share the same utterly natural assumption that ns_returnfile X really
  will return file X
 
 All, I've been on vacation or I would have chimed in earlier, but as John's
 client and CTO of the company who found the problem (and is now faced with a
 fairly extensive and difficult impact assessment to determine whether the
 confidentiality and integrity of our customers' data has been compromised),
 I find the suggestion that this is not a bug to be utterly baffling.

Eric,

I'm not sure what your qualifications are to determine if it is a bug or not. 
The author of the code
doesn't seem to think it is a bug. Everyone agrees that the code works as 
intended. It was no secret
at the time the code was written that the file mtime granularity is one second. 
When fastpath was added
many years ago, it was documented in the changelogs. There are configuration 
parameters in the config
file. 

I just sent an email responding to John's suggested patch. It is a great 
suggestion for several reasons,
the most important is that it doesn't change the intended purpose of the cache 
or the API. As John said
there is no visible impact on the user. I would even go so far as to suggest 
that the wait time (2 sec) 
be added as a configuration parameter. Although the semantics should be 
discussed. 

This patch may fix your initial problem, but it does nothing to fix the broken 
use of ns_returnfile. If you
are serious about not exposing sensitive information, don't write it to disk as 
a file. Most security breaches
don't happen by accident. I have outlined how you can avoid the problem using 
ns_returnfp, _AND_ a 
particular series of commands. No single API will serve as some kind of shield 
of protection, it takes
a lot of effort. Anything involving files opens up a whole series of problems. 
They are not bugs.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread John Caruso

On Wednesday 12:30 PM 8/20/2008, Jeff Rogers wrote:
I can very easily come up with a scenario that breaks your patched 
fastpath just as easily as the original, to which you can rightly say, 
but why would you do it that way?.  And you would be right.


Do it, then.  This is the simplest example I've given that exhibits this 
bug:


eval exec /some/external/program --output-file $tempfile
ns_returnfile 200 text/plain $tempfile
ns_unlink $tempfile

This is ALL that's required.  No external meddling, no munging of file 
modification times, nothing else.  Three lines of Tcl code.  By all means, 
show me an example that defeats this patch that's anywhere near as simple 
as that.


And far, far more to the point: that's as NATURAL as that.  I'd assert 
that that example code is perfectly intuitive--and right up to the point 
where I pointed out this bug in the first place, it would have been 
accepted without remark on this mailing list.  *Now*, of course, serving 
anything but 5-year old files with ns_returnfile is proof that one should 
give up computers for a living, and not recognizing the holy contract we 
enter into with AOLserver when we invoke ns_returnfile that our files must 
continue to exist for at least one more second is a valid reason to 
contemplate suicide to end our worthless existence.


Look: we discovered a serious bug in AOLserver's fastpath caching 
mechanism that can cause both data corruption and information 
leakage.  I've explained that bug carefully, in the face of confusion, 
obfuscation, and a continual stream of utterly unnecessary abuse.  I've 
offered a tested patch that provides a minimal, correct-by-inspection, 
user-transparent, massive improvement over the current behavior, despite 
my near certainty (based on previous experience) that it'd just lead to 
another round of carping, finger-pointing, and refusal to accept that 
there's even a problem here in the first place.  I've explained why the 
patch makes sense (and in fact addresses exactly the limitation that 
should have been considered when the fastpath caching mechanism was 
initially designed).  I've responded to all serious concerns, respectfully 
and without returning any of the bile that's been sent my way.


For anyone who's serious about securing your installation, you have the 
patch now, and I'd strongly suggest that you apply it to your AOLserver 
sources.  I'll still respond to serious concerns that don't just rehash 
the same excuses, but otherwise I'm done.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Jim Davidson

Hi Folks,

I agree with Eric, even though I wrote the original code and was one  
of the first to suggest is wasn't a bug.  This thread has surprised me  
in a few ways:


-- The bug was indeed subtle and curious
-- The debate on dynamic vs. static and underlying assumptions and  
performance was well reasoned
-- The name of the command does indicate something it is not in a way  
that matters
-- Folks generally underestimated the impact of this bug except for  
those affected

-- The direct personal attacks were a bit embarrassing to watch


Overall, seems like:

-- Patching underlying fastpath to be filename-based keys makes sense  
if it's confirmed to solve the problem
-- What ns_returnfile does is good for some things if you know what  
it's doing
-- Sending dynamic content with the current ns_returnfile isn't a good  
idea


For Eric and John, I'd recommend a Tcl-based ns_retunfile wrapper  
using open/ns_returnfp/close as a quick first step.



BTW:  The underlying cause -- the rapid re-use of inodes -- is indeed  
a behavior of at least Linux but not Mac OS/X.  Compare:


Linux:
[EMAIL PROTECTED]:~$ rm -f foo ; touch foo ; stat -c %d.%i foo ; rm foo ;  
touch foo ; stat -c %d.%i foo

2049.712963
2049.712963

Mac OS/X:
[JimBook:~] jimbo% rm -f foo ; touch foo ; stat -f %d.%i foo ; rm  
foo ; touch foo ; stat -f %d.%i foo

234881026.4090565
234881026.4090566

I find this very interesting But, maybe I'm odd, I know most of  
you have probably long since begun to ignore this thread and it should  
probably die now...


Cheers,
-Jim




On Aug 20, 2008, at 3:23 PM, Eric Larkin wrote:


On 8/20/08 11:29 AM, John Caruso [EMAIL PROTECTED] wrote:
Whether or not that's so, the fact is that everyone on this list  
appeared
to share the same utterly natural assumption that ns_returnfile X  
really

will return file X


All, I've been on vacation or I would have chimed in earlier, but as  
John's
client and CTO of the company who found the problem (and is now  
faced with a
fairly extensive and difficult impact assessment to determine  
whether the
confidentiality and integrity of our customers' data has been  
compromised),

I find the suggestion that this is not a bug to be utterly baffling.
Perhaps if the procedure in question was called  
ns_returnfromcache, I
could see the arguments against the behavior being considered a bug,  
but the
name of the procedure is ns_returnfile, and it takes an argument  
which is

a filename.  Our objective in using the procedure was not to return a
dynamic file through the cache, it was to return a dynamically  
generated

file (which was produced by an exec of an OS-level command) from the
filesystem...and the documentation for the procedure certainly did not
suggest that its functionality did not support this usage.

Obviously we'll work around the problem in the future, but it is
disheartening to find a fairly subtle bug, report it with a  
reproducible
test case, and be challenged so aggressively on the whether it was a  
poor

decision to use ns_returnfile to...um...return a file.

Eric
__
Eric Larkin
Chief Technology Officer
Arena Solutions

[EMAIL PROTECTED]
4100 E. Third Ave.| Suite 300 | Foster City | CA 94404
tel: 650.513.3502 | fax: 650.513.3511


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Juan José del Río
Hello John,

I just think that you're introducing complexity to a common and widely
used infraestructure (with a hack), just to hide the flaws on your side.

I don't like this hack. It's not the correct thing to do, not even if
you make some parameters (like the timeout) changeable via the
configuration file. It just doesn't feel good to me.

But, of course, you are free to patch your own AOLServer. I can't (and
won't) complain about it.

Someone said that this was not an academic discussion... but I am afraid
that to some extent it is. We've been speaking that the cache can't know
if its contents are good or not, if the changes happen out if its scope,
and it's never notified of it.

But, since Linux version 2.6.13, inotify is into the kernel, and
aolserver can subscribe to a path, so know if that file has been
deleted, modified, or anything else. That's the way a cache can know if
the file has been altered in any way, and it should be marked as dirty.
As easy as that... but I know it is harder than that one-line-patch.

Now you can do whatever you want... as long as you don't ask me to patch
my code with the dirty hack you proposed. I don't like having
AOLServer's SVN version patched with this.

Best Regards,

  Juan José

-  
Juan José del Río|  
(+34) 616 512 340|  [EMAIL PROTECTED]


Simple Option S.L.
  Tel: (+34) 951 930 122
  Fax: (+34) 951 930 122
  http://www.simpleoption.com


On Wed, 2008-08-20 at 13:21 -0700, John Caruso wrote:
 On Wednesday 12:30 PM 8/20/2008, Jeff Rogers wrote:
 I can very easily come up with a scenario that breaks your patched 
 fastpath just as easily as the original, to which you can rightly say, 
 but why would you do it that way?.  And you would be right.
 
 Do it, then.  This is the simplest example I've given that exhibits this 
 bug:
 
  eval exec /some/external/program --output-file $tempfile
  ns_returnfile 200 text/plain $tempfile
  ns_unlink $tempfile
 
 This is ALL that's required.  No external meddling, no munging of file 
 modification times, nothing else.  Three lines of Tcl code.  By all means, 
 show me an example that defeats this patch that's anywhere near as simple 
 as that.
 
 And far, far more to the point: that's as NATURAL as that.  I'd assert 
 that that example code is perfectly intuitive--and right up to the point 
 where I pointed out this bug in the first place, it would have been 
 accepted without remark on this mailing list.  *Now*, of course, serving 
 anything but 5-year old files with ns_returnfile is proof that one should 
 give up computers for a living, and not recognizing the holy contract we 
 enter into with AOLserver when we invoke ns_returnfile that our files must 
 continue to exist for at least one more second is a valid reason to 
 contemplate suicide to end our worthless existence.
 
 Look: we discovered a serious bug in AOLserver's fastpath caching 
 mechanism that can cause both data corruption and information 
 leakage.  I've explained that bug carefully, in the face of confusion, 
 obfuscation, and a continual stream of utterly unnecessary abuse.  I've 
 offered a tested patch that provides a minimal, correct-by-inspection, 
 user-transparent, massive improvement over the current behavior, despite 
 my near certainty (based on previous experience) that it'd just lead to 
 another round of carping, finger-pointing, and refusal to accept that 
 there's even a problem here in the first place.  I've explained why the 
 patch makes sense (and in fact addresses exactly the limitation that 
 should have been considered when the fastpath caching mechanism was 
 initially designed).  I've responded to all serious concerns, respectfully 
 and without returning any of the bile that's been sent my way.
 
 For anyone who's serious about securing your installation, you have the 
 patch now, and I'd strongly suggest that you apply it to your AOLserver 
 sources.  I'll still respond to serious concerns that don't just rehash 
 the same excuses, but otherwise I'm done.
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread John Caruso

On Wednesday 01:45 PM 8/20/2008, Juan José del Río wrote:

But, since Linux version 2.6.13, inotify is into the kernel, and
aolserver can subscribe to a path, so know if that file has been
deleted, modified, or anything else. That's the way a cache can know if
the file has been altered in any way, and it should be marked as dirty.
As easy as that... but I know it is harder than that one-line-patch.


inotify isn't available as of Redhat Enterprise 
Linux 4, and I'm sure there are other major 
distributions that are missing it as well--so 
while it may be the best approach eventually it 
won't solve the problem right now.  Also, not all 
platforms use inotify, so adding filesystem 
monitoring support would require complex 
cross-platform code (which in some cases won't 
even be available).  So while I'd agree with you 
that the eventual (ideal) solution is a major 
rewrite of the fastpath caching code to use some 
sort of filesystem monitoring technique, that won't work now.


By comparison, the fix I offered will work just 
fine on all the platforms that AOLserver runs on, 
right now, and it fixes all non-pathological use 
cases (i.e., all but ones that are artifically 
designed to trip it up).  As to whether it's a 
hack, possibly, but so is the fastpath 
code--that's the whole problem.  The patch 
corrects a specific flaw in the original code: 
that fastpath never should have been caching 
files that were modified within the last second, 
since the underlying OS mechanism (mtime) doesn't 
provide the resolution necessary to distinguish 
reliably between two such files.  In other words, 
the patch is in the same spirit as the original 
code--so if you really don't want such ugliness 
in your code you should probably just strip out 
fastpath caching entirely. :-)


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Michael A. Cleverly
On Wed, Aug 20, 2008 at 12:22 PM, Jeff Rogers [EMAIL PROTECTED] wrote:
 Linux.  This tcl page:

 set fn /tmp/tmpfile[expr rand()]
 set f [open $fn w]
 puts $f [ns_queryget data]
 close $f
 after 2
 ns_returnfile 200 text/plain $fn
 ns_unlink $fn

[after 2] would wait 2 milliseconds.  [after 2000] or [ns_sleep 2]
might make a difference(?).


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Juan José del Río
Hello again John,

Oops, sorry. I thought RedHat Linux had everything  an Enterprise
needed ;-)
(Take the above line with a lot of salt. I don't want to discuss about
linux distros... yet ;-)

Then what about adding a -nocache parameter to the function? That way
you won't modify the original behaviour...

As far as the solution is not perfect, if we keep pushing up patches,
it'll end being a hell consisting on a pile of patches. In fact, I feel
that checking the filename instead of last modification time is a better
way to patch the if clause. It's safer to be sure that will not be
collisions by that way.

Anyways, as someone said, I am amazed on the fact of how ext3 reuses the
inode numbers... I didn't know it was so aggresive :-)

Regards, and hope you finally can serve your customers right (no matter
how you solve this problem),

  Juan José



-  
Juan José del Río|  
(+34) 616 512 340|  [EMAIL PROTECTED]


Simple Option S.L.
  Tel: (+34) 951 930 122
  Fax: (+34) 951 930 122
  http://www.simpleoption.com


On Wed, 2008-08-20 at 14:24 -0700, John Caruso wrote:
 On Wednesday 01:45 PM 8/20/2008, Juan José del Río wrote:
 But, since Linux version 2.6.13, inotify is into the kernel, and
 aolserver can subscribe to a path, so know if that file has been
 deleted, modified, or anything else. That's the way a cache can know if
 the file has been altered in any way, and it should be marked as dirty.
 As easy as that... but I know it is harder than that one-line-patch.
 
 inotify isn't available as of Redhat Enterprise 
 Linux 4, and I'm sure there are other major 
 distributions that are missing it as well--so 
 while it may be the best approach eventually it 
 won't solve the problem right now.  Also, not all 
 platforms use inotify, so adding filesystem 
 monitoring support would require complex 
 cross-platform code (which in some cases won't 
 even be available).  So while I'd agree with you 
 that the eventual (ideal) solution is a major 
 rewrite of the fastpath caching code to use some 
 sort of filesystem monitoring technique, that won't work now.
 
 By comparison, the fix I offered will work just 
 fine on all the platforms that AOLserver runs on, 
 right now, and it fixes all non-pathological use 
 cases (i.e., all but ones that are artifically 
 designed to trip it up).  As to whether it's a 
 hack, possibly, but so is the fastpath 
 code--that's the whole problem.  The patch 
 corrects a specific flaw in the original code: 
 that fastpath never should have been caching 
 files that were modified within the last second, 
 since the underlying OS mechanism (mtime) doesn't 
 provide the resolution necessary to distinguish 
 reliably between two such files.  In other words, 
 the patch is in the same spirit as the original 
 code--so if you really don't want such ugliness 
 in your code you should probably just strip out 
 fastpath caching entirely. :-)
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread Tom Jackson
John, 

Your last patch suggestion seems good, not caching something that looks
like a new file is fully in line with the intent of fastpath and
ns_returnfile. 

I'm not sure everyone is commenting on this new patch idea, maybe a
previous idea?

Anyway if the cache is for serving static (and likely older than a few
seconds) content, why cache it until it looks static? I think that is
the basic thrust of the patch. It would be impossible to poison the
cache by accident if you wait until a file ages a few seconds before you
stick it in. 

But the poison is really self-inflicted. The only data on the server
that can be unintentionally exposed is something that made it into the
cache in the first place. Stuff only makes its way into the cache by
filename. So if your application sends out secret files via
ns_returnfile, you could have a problem, but no long lived secret file
will ever fall into this trap by accident, it would have to cease to
exist (giving up its inode) after being place into cache, then a new
file would have to be created and served. All of this assumes that your
webserver process can read the secret file. 

BUT! Please pay attention to this: ns_returnfile is by definition not
safe in the context of most webserver APIs. It returns content that is
outside of pageroot. It exposes (to Tcl) an internal API which handles
returning files under pageroot. The reason it is exposed is so that
developers can easily create their own filehandlers and virtual servers.
The internal API handles a handful of annoyingly picky but important and
standard website and HTTP features. It also provides a number of hooks
for customization related to these features. 

tom jackson 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-20 Thread russell muetzelfeldt

On 21/08/2008, at 1:45 AM, Jim Davidson wrote:

I looked at the code a bit closer.   The ns_returnfile and  
ns_respond commands both call Ns_ConnReturnFile, the public API to  
the underlying FastPath.  It does more than just blast the content  
-- it handles:



...

-- caches, mmap's, or simply opens the fd and sends, chunk by chunk


not directly related to the issue at hand, but if this is being  
worked on shouldn't the file returning be handled by sendfile() on  
platforms that support it? that'd bypass fastpath, but on linux at  
least you'd not be wasting RAM by buffering in both the page cache  
and fastpath, and spitting the data back out the connection socket  
gets handled entirely in the kernel...


just a thought...


cheers

Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Tom Jackson
On Mon, 2008-08-18 at 19:20 -0700, John Caruso wrote:

 I'd say it's still better, because it requires explicit action on the 
 user's part to enable the flawed caching mechanism in that case.  And 
 actually I don't think fastpath in its default configuration would be of 
 much help in performance terms these days, given that the cache is only 
 5MB large and file data is typically cached by the OS anyway (and servers 
 generally have far more RAM than they did even five years ago).
 

fastpath is for small static content. You don't need to cache large
files, and that is why the cachemaxsize parameter gives you a cutoff on
the largest size to cache. 

AOLserver has great performance on small files, fastpath speeds it up
further, plus the overall scheme handles directory files, internal
redirects, etc. 

 I do think this should have been considered (and steps taken to address 
 it) when the fastpath caching mechanism was initially developed, since 
 it's a glaring flaw.  I've designed things that rely on shaky underlying 
 assumptions in the past, but only in controlled circumstances where those 
 assumptions were guaranteed to obtain.  I can think of situations in which 
 a caching mechanism with this type of design limitation wouldn't be an 
 issue, but in my opinion it has no place being a default-enabled mechanism 
 in an enterprise-grade web server.

Why not just write another API which strips out all the things you don't
like. I think you misjudge fastpath in every way, but whatever.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Andrew Piskorski
On Mon, Aug 18, 2008 at 06:06:23PM -0700, John Caruso wrote:

 That'd be an improvement over the current situation, but it's still the 
 case that AOLserver as currently shipped has a file cache mechanism built 
 into it which 1) may return incorrect data and 2) is enabled by 
 default.  Given the risk, I'd say fastpath caching should be disabled by 
 default rather than enabled.

Sounds right to me.  Either robustify Fastpath somehow against this
corner case, or don't have Fastpath turned on by default.

-- 
Andrew Piskorski [EMAIL PROTECTED]
http://www.piskorski.com/


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Titi Alailima
This would be a wonderful addition to the documentation.  As a matter of fact, 
I just added it:
http://panoptic.com/wiki/aolserver/Fastpath

For what it's worth, it seems to me that if it has a measurable benefit, it's 
worth leaving on by default, as long as developers are properly educated about 
design issues (flaws, bugs, tradeoffs, whatever) that they need to deal with.  
If it's off by default it may as well be removed entirely.  I say on by 
default, but well-documented so that developers are forced to have at least a 
cursory understanding of it when doing anything that might relate to it.

Titi Ala'ilima
Lead Architect
MedTouch LLC
1100 Massachusetts Avenue
Cambridge, MA 02138
617.621.8670 x309


 -Original Message-
 From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
 Behalf Of Tom Jackson
 Sent: Tuesday, August 19, 2008 1:18 AM
 To: AOLSERVER@LISTSERV.AOL.COM
 Subject: Re: [AOLSERVER] Data corruption with fastpath caching

 On Tue, 2008-08-19 at 12:24 +1000, russell muetzelfeldt wrote:
  On 19/08/2008, at 11:59 AM, Tom Jackson wrote:
   On Tue, 2008-08-19 at 11:37 +1000, russell muetzelfeldt wrote:
   On 19/08/2008, at 10:56 AM, Tom Jackson wrote:
  
   You want a transactional database but you are using a filesystem.
   Grow up.
  
   and
  
   If your application wasn't the responsible party which violated
 the
   expectation you state, I would agree (maybe).
  
   please go and re-read this thread, and get your parties straight.
  
   Sorry, I don't follow.
 
  ok, I'll spell it out.
 
  it's not my application that's violated the expectation I state. you
  haven't been paying attention to the From: headers, and seem to have
  mistaken me for the original poster of this thread.

 Ah, okay. I didn't mean to point to any particular application, by
 your I didn't mean any particular you or your.

  all I've been saying is that ns_returnfile filename returning the
  content of something other than filename, contrary to the
  documentation and common sense, is a bug. given that fastpath exists
  for a (good) reason, and that the behaviour which triggers the bug is
  marginal anyway, the correct response is the bug will not be fixed,
  here's why, and here's how to work around it.

 It is an interesting point. But it isn't a bug. The purpose of the API
 is to return a static file, not one which changes in under a second. It
 is not a bug to not support code which is guaranteed to be slower than
 common alternatives.

 Fastpath is designed to support return of smallish static content. It
 isn't some ancient way of speeding up stuff that was slow, it was for
 speeding up stuff that was already fast but was easy to make even
 faster.

 If you want to avoid use of fastpath, just set the configuration lower
 than your dynamic content:

 #
 # Fastpath
 #
 ns_section ns/server/${server}/fastpath
 ns_param cache[set cache 10] ;# max entries ??
 ns_param cachemaxsize [set cachemaxsize [expr 5 * 1024 * 1024]]
 ns_param cachemaxentry[expr round(floor($cachemaxsize/$cache))]


 Or, if the dynamic content is very small, or customized, don't write it
 to a file in the first place. In general you are probably doing
 something wrong if you write small content to a file and immediately
 delete it. You are also likely doing something wrong if you are caching
 large files.

 tom jackson


 --
 AOLserver - http://www.aolserver.com/

 To Remove yourself from this list, simply send an email to
 [EMAIL PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the
 Subject: field of your email blank.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Juan José del Río
I agree with Titi. The vast majority of times, having Fastpath on does
not harm at all.

Having it disabled by default would be like not using computers because
they fail sometimes. That would be too extreme, isn't it? ;-)

As long as it's well documented, and there are alternatives to avoid the
problems, i think it's ok to leave Fastpath activated by default.

Regards,

  Juan José


--
Juan José del Río
Chief of Commerce
Simple Option S.L.
Avda. Editor Angel Caffarena 11, B11, 1B
Málaga, 29010, Spain

+34 616 512 340 cell
+34 951 930 122 tel/fax




On Tue, 2008-08-19 at 06:18 -0700, Titi Alailima wrote:
 This would be a wonderful addition to the documentation.  As a matter of 
 fact, I just added it:
 http://panoptic.com/wiki/aolserver/Fastpath
 
 For what it's worth, it seems to me that if it has a measurable benefit, it's 
 worth leaving on by default, as long as developers are properly educated 
 about design issues (flaws, bugs, tradeoffs, whatever) that they need to deal 
 with.  If it's off by default it may as well be removed entirely.  I say on 
 by default, but well-documented so that developers are forced to have at 
 least a cursory understanding of it when doing anything that might relate to 
 it.
 
 Titi Ala'ilima
 Lead Architect
 MedTouch LLC
 1100 Massachusetts Avenue
 Cambridge, MA 02138
 617.621.8670 x309
 
 
  -Original Message-
  From: AOLserver Discussion [mailto:[EMAIL PROTECTED] On
  Behalf Of Tom Jackson
  Sent: Tuesday, August 19, 2008 1:18 AM
  To: AOLSERVER@LISTSERV.AOL.COM
  Subject: Re: [AOLSERVER] Data corruption with fastpath caching
 
  On Tue, 2008-08-19 at 12:24 +1000, russell muetzelfeldt wrote:
   On 19/08/2008, at 11:59 AM, Tom Jackson wrote:
On Tue, 2008-08-19 at 11:37 +1000, russell muetzelfeldt wrote:
On 19/08/2008, at 10:56 AM, Tom Jackson wrote:
   
You want a transactional database but you are using a filesystem.
Grow up.
   
and
   
If your application wasn't the responsible party which violated
  the
expectation you state, I would agree (maybe).
   
please go and re-read this thread, and get your parties straight.
   
Sorry, I don't follow.
  
   ok, I'll spell it out.
  
   it's not my application that's violated the expectation I state. you
   haven't been paying attention to the From: headers, and seem to have
   mistaken me for the original poster of this thread.
 
  Ah, okay. I didn't mean to point to any particular application, by
  your I didn't mean any particular you or your.
 
   all I've been saying is that ns_returnfile filename returning the
   content of something other than filename, contrary to the
   documentation and common sense, is a bug. given that fastpath exists
   for a (good) reason, and that the behaviour which triggers the bug is
   marginal anyway, the correct response is the bug will not be fixed,
   here's why, and here's how to work around it.
 
  It is an interesting point. But it isn't a bug. The purpose of the API
  is to return a static file, not one which changes in under a second. It
  is not a bug to not support code which is guaranteed to be slower than
  common alternatives.
 
  Fastpath is designed to support return of smallish static content. It
  isn't some ancient way of speeding up stuff that was slow, it was for
  speeding up stuff that was already fast but was easy to make even
  faster.
 
  If you want to avoid use of fastpath, just set the configuration lower
  than your dynamic content:
 
  #
  # Fastpath
  #
  ns_section ns/server/${server}/fastpath
  ns_param cache[set cache 10] ;# max entries ??
  ns_param cachemaxsize [set cachemaxsize [expr 5 * 1024 * 1024]]
  ns_param cachemaxentry[expr round(floor($cachemaxsize/$cache))]
 
 
  Or, if the dynamic content is very small, or customized, don't write it
  to a file in the first place. In general you are probably doing
  something wrong if you write small content to a file and immediately
  delete it. You are also likely doing something wrong if you are caching
  large files.
 
  tom jackson
 
 
  --
  AOLserver - http://www.aolserver.com/
 
  To Remove yourself from this list, simply send an email to
  [EMAIL PROTECTED] with the
  body of SIGNOFF AOLSERVER in the email message. You can leave the
  Subject: field of your email blank.
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Tom Jackson
Andrew,

This is not a corner case. The exact same thing could happen without
fastpath. 

What is that thing? That the contents of a file changes after a request
is made and before the file is returned. In fact, there is no guarantee
that it won't change mid-return. This is a fact of life with files on
any filesystem. 

In fact, with the HTTP caching mechanisms, you could fail to get
up-to-date contents of a file, since the If-Modified-Since mechanism
will also fail. 

The problem here is that the application is using this static file
handling API to serve dynamic content. Wondering why it doesn't work is
pointless.

Just to summarize again, this case requires that a file is created then
destroyed and another file created within the same second that has the
same size. Also, the original file must get into the cache, and the only
way that can happen is for the application to treat it as a long lived
static file. 

We have other means to cache dynamic data, and large chunks of dynamic
content saved as a file can avoid the fastpath cache by setting the
cachemaxsize parameter. Writing smaller content to disk doesn't make any
sense if your goal is speed...or security. 

It is probably even more important to tamp down these misconceptions
about how AOLserver works. Static and dynamic content are handled by
different API. The reason is that it has long been recognized by the
developers of AOLserver that different techniques are required to
maintain high performance based upon how the content is generated, its
expected lifespan, its size, and its potential for reuse.  

tom jackson

On Tue, 2008-08-19 at 03:00 -0400, Andrew Piskorski wrote:
 On Mon, Aug 18, 2008 at 06:06:23PM -0700, John Caruso wrote:
 
  That'd be an improvement over the current situation, but it's still the 
  case that AOLserver as currently shipped has a file cache mechanism built 
  into it which 1) may return incorrect data and 2) is enabled by 
  default.  Given the risk, I'd say fastpath caching should be disabled by 
  default rather than enabled.
 
 Sounds right to me.  Either robustify Fastpath somehow against this
 corner case, or don't have Fastpath turned on by default.
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Juan José del Río
What about using epoll (or equivalent) in Linux, and kqueue in FreeBSD
to tell the kernel to notify AOLServer in change a file has changed?

That'd be a pretty easy and efficient way to discard fastpath items in
case they have been deleted and/or modified.

Just my two cents ;-) 

-  
Juan José del Río|  
(+34) 616 512 340|  [EMAIL PROTECTED]


Simple Option S.L.
  Tel: (+34) 951 930 122
  Fax: (+34) 951 930 122
  http://www.simpleoption.com


On Tue, 2008-08-19 at 09:20 -0700, Tom Jackson wrote:
 Andrew,
 
 This is not a corner case. The exact same thing could happen without
 fastpath. 
 
 What is that thing? That the contents of a file changes after a request
 is made and before the file is returned. In fact, there is no guarantee
 that it won't change mid-return. This is a fact of life with files on
 any filesystem. 
 
 In fact, with the HTTP caching mechanisms, you could fail to get
 up-to-date contents of a file, since the If-Modified-Since mechanism
 will also fail. 
 
 The problem here is that the application is using this static file
 handling API to serve dynamic content. Wondering why it doesn't work is
 pointless.
 
 Just to summarize again, this case requires that a file is created then
 destroyed and another file created within the same second that has the
 same size. Also, the original file must get into the cache, and the only
 way that can happen is for the application to treat it as a long lived
 static file. 
 
 We have other means to cache dynamic data, and large chunks of dynamic
 content saved as a file can avoid the fastpath cache by setting the
 cachemaxsize parameter. Writing smaller content to disk doesn't make any
 sense if your goal is speed...or security. 
 
 It is probably even more important to tamp down these misconceptions
 about how AOLserver works. Static and dynamic content are handled by
 different API. The reason is that it has long been recognized by the
 developers of AOLserver that different techniques are required to
 maintain high performance based upon how the content is generated, its
 expected lifespan, its size, and its potential for reuse.  
 
 tom jackson
 
 On Tue, 2008-08-19 at 03:00 -0400, Andrew Piskorski wrote:
  On Mon, Aug 18, 2008 at 06:06:23PM -0700, John Caruso wrote:
  
   That'd be an improvement over the current situation, but it's still the 
   case that AOLserver as currently shipped has a file cache mechanism built 
   into it which 1) may return incorrect data and 2) is enabled by 
   default.  Given the risk, I'd say fastpath caching should be disabled by 
   default rather than enabled.
  
  Sounds right to me.  Either robustify Fastpath somehow against this
  corner case, or don't have Fastpath turned on by default.
  


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Jeff Rogers

Tom Jackson wrote:

If you want to avoid use of fastpath, just set the configuration lower
than your dynamic content:

#
# Fastpath
#
ns_section ns/server/${server}/fastpath
ns_param cache[set cache 10] ;# max entries ??
ns_param cachemaxsize [set cachemaxsize [expr 5 * 1024 * 1024]]
ns_param cachemaxentry[expr round(floor($cachemaxsize/$cache))]


The description of the parameters here is a little confusing.  Browsing 
the source, it appears that cache is a flag to enable or disable 
fastpath, cachemaxsize is the maximum size of the cache, and 
cachemaxentry is the largest size of a file that will get cached. 
There is no setting for the max number of entries, the use of $cache in 
the settings above (reflecting the server defaults) is really a minimum 
number of cache entries (i.e., the default cache will hold at least 10 
entries of the max 512k size, but it could also hold 1000 5k files).


I didn't dig deep enough to see how the cache flushing works, but on 
casual perusal it looks like the cache is pruned by removing the oldest 
entries (not largest, least hit, or least recently hit).


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Jim Davidson

Hi folks,

I wrote the code.   The explanation below is correct -- I chose inode/ 
dev combination to cache the same file even with multiple names which  
was the case at AOL -- hundreds of symlinks and hard links to the same  
file.  The same strategy is used for ADP templates.   I think the code  
uses just filenames on Windows because the inode/dev don't really  
exist in Win32 weirdness (or at least I didn't care enough to find the  
proper analog).


As for whether this is a bug or not, opinions vary.  I would suggest  
the code snippet of create temp file and use fastpath to return  
contents is not a use case I was solving for or recommend.   The  
suggestion to open a temp fd, unlink it, dump content into and send  
from the open fd seems the better approach for a few reason including  
proper cleanup after a crash.  In fact, there is an API for such cases  
-- Ns_GetTemp or something -- and it's used internally, for example,  
to spool large file uploads.  It re-uses open and unlinked fd's -- in  
practice file create is expensive and is avoided by just keeping a  
cache of open fd's around, truncating the content at the end of the  
connection.  I'm not sure if the docs are up to date and/or if there  
are useful Tcl commands but something could be added easily if needed.


Having said all that, a note in the docs that ns_returnfile is  
designed for truly static content... and comments on how the cache  
works and can be disabled would make sense.


-Jim





On Aug 18, 2008, at 7:37 PM, Tom Jackson wrote:


On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:

While I'd agree this is a bug in fastpath, the real problem is that
fastpath is being used at all in this case.


I don't think it is a bug in fastpath.

Think about the case where multiple logical files are actually the  
same

physical file. Using the name would result in caching the same object
under different names. This is a much more likely situation than  
this so

called bug.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread John Caruso

On Tuesday 10:40 AM 8/19/2008, Jim Davidson wrote:

I would suggest
the code snippet of create temp file and use fastpath to return
contents is not a use case I was solving for or recommend.


It's also not the use case in question--just a simple illustration of the 
problem.  Here's a more realistic template of a use case (which closely 
mirrors the actual code that led to the discovery of the bug):


eval exec /some/external/program --output-file $tempfile
ns_returnfile 200 text/plain $tempfile
ns_unlink -nocomplain $tempfile

In other words, run an external program that writes its output to 
$tempfile, return that file to the user, and delete the file.  This is a 
case in which ns_returnfile seems like the obvious and appropriate 
call--but if this procedure is run on behalf of users A and B within the 
same second (which is common on an active web server), and the results in 
$tempfile are the same length, B will get A's output.  Depending on what 
information the external program writes to $tempfile, this could easily 
represent a security breach.


That example involves timing between two different users, but something 
like the following will also trigger the bug:


foreach user $users {
eval exec /some/external/program --output-file $tempfile --user 
$user

ns_returnfile 200 text/plain $tempfile
}

Again, this code looks perfectly appropriate, but it's very likely to 
return incorrect data due to this bug.  Note that the ns_unlink isn't even 
required in this case.


Also, regarding use fastpath to return content: the developer in this 
case didn't know fastpath from a hole in the ground--after all, they were 
calling ns_returnfile, not fastpath.  fastpath is just the 
behind-the-scenes mechanism that was making ns_returnfile X return a 
file other than X.  And generally speaking, I'd say it's perfectly 
reasonable for a developer to believe that ns_returnfile X actually will 
return file X.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Tom Jackson
John,

It is not a bug in ns_returnfile. 

tom jackson

On Tue, 2008-08-19 at 11:52 -0700, John Caruso wrote:
 On Tuesday 10:40 AM 8/19/2008, Jim Davidson wrote:
 I would suggest
 the code snippet of create temp file and use fastpath to return
 contents is not a use case I was solving for or recommend.
 
 It's also not the use case in question--just a simple illustration of the 
 problem.  Here's a more realistic template of a use case (which closely 
 mirrors the actual code that led to the discovery of the bug):
 
  eval exec /some/external/program --output-file $tempfile
  ns_returnfile 200 text/plain $tempfile
  ns_unlink -nocomplain $tempfile
 
 In other words, run an external program that writes its output to 
 $tempfile, return that file to the user, and delete the file.  This is a 
 case in which ns_returnfile seems like the obvious and appropriate 
 call--but if this procedure is run on behalf of users A and B within the 
 same second (which is common on an active web server), and the results in 
 $tempfile are the same length, B will get A's output.  Depending on what 
 information the external program writes to $tempfile, this could easily 
 represent a security breach.
 
 That example involves timing between two different users, but something 
 like the following will also trigger the bug:
 
  foreach user $users {
  eval exec /some/external/program --output-file $tempfile --user 
 $user
  ns_returnfile 200 text/plain $tempfile
  }
 
 Again, this code looks perfectly appropriate, but it's very likely to 
 return incorrect data due to this bug.  Note that the ns_unlink isn't even 
 required in this case.
 
 Also, regarding use fastpath to return content: the developer in this 
 case didn't know fastpath from a hole in the ground--after all, they were 
 calling ns_returnfile, not fastpath.  fastpath is just the 
 behind-the-scenes mechanism that was making ns_returnfile X return a 
 file other than X.  And generally speaking, I'd say it's perfectly 
 reasonable for a developer to believe that ns_returnfile X actually will 
 return file X.
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread John Caruso

On Monday 05:53 PM 8/18/2008, Jeff Rogers wrote:

russell muetzelfeldt wrote:
fastpath is making assumptions about what means something is the same 
file, and those assumptions are not consistent with unix filesystem 
semantics - how is this not a bug?


It's not a bug because no one ever said that it *was* strictly following 
unix filesystem semantics, which isn't even a single thing (ufs is 
slightly different than nfs, is slightly different than ext2 -noatime, is 
slightly different than afs, etc.)  It is following a particular 
definition: if the file still exists and has the same 
dev/inode/mtime/size as it did when you last checked, then it is the same 
file.   This of it as a if-modified-since or if-none-match conditional 
GET.


Actually that's not analogous, for the same reason that the analogies to 
caching of attributes in NFS, rsync or tar not noticing content changes if 
attributes stay the same, etc, don't apply: because this bug can happen 
*even with two files that have completely different names or 
paths*.  Again, in this example...:


   set file [open /var/tmp/myfile w]
   puts $file ABC123
   close $file
   ns_returnfile 200 text/plain /var/tmp/myfile
   ns_unlink -nocomplain /var/tmp/myfile

   set file [open /var/tmp/myotherfile w]
   puts $file XYZ987
   close $file
   ns_returnfile 200 text/plain /var/tmp/myotherfile
   ns_unlink -nocomplain /var/tmp/myotherfile

...AOLserver will almost always return the contents of /var/tmp/myfile 
rather than /var/tmp/myotherfile in response to the second ns_returnfile.


I think the analogies to other systems aren't really germane 
anyway--AOLserver's behavior has to be judged on its own merits.  But 
adopting that standard, I can't think of any other program that would 
confuse /var/tmp/myfile with /var/tmp/myotherfile.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Jeff Rogers

John Caruso wrote:

 Think of it as a if-modified-since or if-none-match 
conditional GET.


Actually that's not analogous, ...


I didn't mean to say it was exactly the same, just similar in that given 
a particular system that makes particular assumptions it is possible to 
construct a situation where the results are unexpected or incorrect in a 
particular way.


I think by now everyone reading this understands the problem.  What's 
not clear is what you are expecting to happen now.


Documentation has been updated to reflect awareness of this problem and 
caution against using ns_returnfile in this situation and suggesting 
alternate solutions in the client code.


Some code fixes have been proposed, which for various reasons are 
undesirable or simply won't fix the problem.


A default configuration change was suggested which seems generally 
viewed as undesirable.


What more are you looking for?

-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread John Caruso

On Tuesday 02:10 PM 8/19/2008, Jeff Rogers wrote:
A default configuration change was suggested which seems generally viewed 
as undesirable.


My impression was that support was split about evenly, actually.  I take 
it that means you're against changing the default?  I'm a bit surprised, 
since you started out agreeing that it's a bug.  Personally I can't 
imagine any persuasive argument that a caching mechanism that can easily 
confuse /usr/local/private/var/rootpass and 
/var/tmp/verisign/certs/webcert.txt should be enabled by default in a web 
server.


For anyone thinking, well, you're the only one who's ever seen this bug, 
I'd say no, we're just the first ones to discover this bug.  It's quite 
possible that other people have run into it without knowing it, since 
AOLserver will just silently serve the wrong data.


As for what I want, as I said, I was mainly bringing this up to shine a 
light on the issue and see what other people's thoughts were.  That's been 
helpful in particular because I hadn't considered the security 
implications, which are quite serious; I may raise this issue on security 
forums as well so that people using ns_returnfile are aware of the danger 
of silent data corruption and/or information leaks and can review their 
code accordingly.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Tom Jackson
John,

This isn't a democracy. You have to demonstrate some understanding of
how things work. 

The only real security issue is your misuse/abuse of ns_returnfile to
serve dynamic data. 

Nobody is going to guarantee that you can't shoot yourself in the foot
due to your lack of understanding of writing robust code, or how to
configure and maintain a secure internet application, or take advice on
how to do so. 

But please, go tell the security police about our insecure file
commands. 

tom jackson



On Tue, 2008-08-19 at 15:33 -0700, John Caruso wrote:
 On Tuesday 02:10 PM 8/19/2008, Jeff Rogers wrote:
 A default configuration change was suggested which seems generally viewed 
 as undesirable.
 
 My impression was that support was split about evenly, actually.  I take 
 it that means you're against changing the default?  I'm a bit surprised, 
 since you started out agreeing that it's a bug.  Personally I can't 
 imagine any persuasive argument that a caching mechanism that can easily 
 confuse /usr/local/private/var/rootpass and 
 /var/tmp/verisign/certs/webcert.txt should be enabled by default in a web 
 server.
 
 For anyone thinking, well, you're the only one who's ever seen this bug, 
 I'd say no, we're just the first ones to discover this bug.  It's quite 
 possible that other people have run into it without knowing it, since 
 AOLserver will just silently serve the wrong data.
 
 As for what I want, as I said, I was mainly bringing this up to shine a 
 light on the issue and see what other people's thoughts were.  That's been 
 helpful in particular because I hadn't considered the security 
 implications, which are quite serious; I may raise this issue on security 
 forums as well so that people using ns_returnfile are aware of the danger 
 of silent data corruption and/or information leaks and can review their 
 code accordingly.
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Rusty Brooks

  Personally I can't
imagine any persuasive argument that a caching mechanism that can easily 
confuse /usr/local/private/var/rootpass and 
/var/tmp/verisign/certs/webcert.txt should be enabled by default in a 
web server.


Oh, come on.  Only if you're rapidly creating and deleting these files.

I think it's interesting, at least, that this topic has created more 
traffic than the list usually sees in a whole year.


Rusty


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread John Caruso

On Tuesday 04:57 PM 8/19/2008, Rusty Brooks wrote:

  Personally I can't
imagine any persuasive argument that a caching mechanism that can easily 
confuse /usr/local/private/var/rootpass and 
/var/tmp/verisign/certs/webcert.txt should be enabled by default in a 
web server.


Oh, come on.  Only if you're rapidly creating and deleting these files.


Yes, I've explained the conditions several times.  The point was that the 
files can be in completely different locations in the filesystem with 
completely different names, and may have secure contents.


Again: this is not an academic point.  This is an actual bug encountered 
in actual code, resulting in data corruption (effectively) and possible 
information leakage--and all because ns_returnfile X may not actually 
return file X.  I don't doubt that there are other people who are also at 
risk due to this behavior of ns_returnfile/fastpath.


If it's no big deal for you, great, but the security implications are 
nonetheless serious.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Jim Davidson

Hi,

I haven't looked at a directory change notification type scheme in a  
long time but that could be very clever.  Aside from addressing issues  
discussed here, the key benefit would be to avoid the repeated stat  
syscalls.  Those stat calls always bothered me conceptually but the  
performance of the underlying systems always improved faster than my  
irritation would grow to do something about it.  However, we were  
always careful to run websites against local filesystems - I would be  
more concerned with the overhead if we were using NFS or some other  
shared filesystem thing.


Somewhat related, the dci module (a series of AOL extensions we open  
sourced awhile back) includes some content fetch/caching features  
called sob.  That had the model you described -- things stayed in  
the cache until either space was needed or the server received an  
explicit flush message on a publish event.  That approach worked well  
and scaled well but it wasn't entirely general nor naive, i.e., it was  
key that we understood how it worked under the covers and to make sure  
the flush message links were reliable to avoid stale content problems.


Anyway, I've been pondering this whole discussion some more and agree  
with Tom -- the fastpath isn't broken.  It just does a certain thing  
-- serves static files with a reasonable balance of performance and  
stability -- and shouldn't be modified except to add notes about how  
it works in the docs.  I'm having trouble thinking through how it  
could be modified to plug all possible race conditions.  I'd suggest  
the code snippets using fastpath for dynamic content should be  
modified, perhaps some new Tcl commands could be added to make it  
convenient, but otherwise it seems a mismatch between capabilities and  
requirements.


-Jim





On Aug 19, 2008, at 1:03 PM, Juan José del Río wrote:


What about using epoll (or equivalent) in Linux, and kqueue in FreeBSD
to tell the kernel to notify AOLServer in change a file has changed?

That'd be a pretty easy and efficient way to discard fastpath items in
case they have been deleted and/or modified.

Just my two cents ;-)

-
Juan José del Río|
(+34) 616 512 340|  [EMAIL PROTECTED]


Simple Option S.L.
 Tel: (+34) 951 930 122
 Fax: (+34) 951 930 122
 http://www.simpleoption.com


On Tue, 2008-08-19 at 09:20 -0700, Tom Jackson wrote:

Andrew,

This is not a corner case. The exact same thing could happen without
fastpath.

What is that thing? That the contents of a file changes after a  
request
is made and before the file is returned. In fact, there is no  
guarantee

that it won't change mid-return. This is a fact of life with files on
any filesystem.

In fact, with the HTTP caching mechanisms, you could fail to get
up-to-date contents of a file, since the If-Modified-Since mechanism
will also fail.

The problem here is that the application is using this static file
handling API to serve dynamic content. Wondering why it doesn't  
work is

pointless.

Just to summarize again, this case requires that a file is created  
then
destroyed and another file created within the same second that has  
the
same size. Also, the original file must get into the cache, and the  
only
way that can happen is for the application to treat it as a long  
lived

static file.

We have other means to cache dynamic data, and large chunks of  
dynamic

content saved as a file can avoid the fastpath cache by setting the
cachemaxsize parameter. Writing smaller content to disk doesn't  
make any

sense if your goal is speed...or security.

It is probably even more important to tamp down these misconceptions
about how AOLserver works. Static and dynamic content are handled by
different API. The reason is that it has long been recognized by the
developers of AOLserver that different techniques are required to
maintain high performance based upon how the content is generated,  
its

expected lifespan, its size, and its potential for reuse.

tom jackson

On Tue, 2008-08-19 at 03:00 -0400, Andrew Piskorski wrote:

On Mon, Aug 18, 2008 at 06:06:23PM -0700, John Caruso wrote:

That'd be an improvement over the current situation, but it's  
still the
case that AOLserver as currently shipped has a file cache  
mechanism built

into it which 1) may return incorrect data and 2) is enabled by
default.  Given the risk, I'd say fastpath caching should be  
disabled by

default rather than enabled.


Sounds right to me.  Either robustify Fastpath somehow against this
corner case, or don't have Fastpath turned on by default.




--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can 

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread russell muetzelfeldt

On 20/08/2008, at 9:57 AM, Rusty Brooks wrote:

I think it's interesting, at least, that this topic has created  
more traffic than the list usually sees in a whole year.


most of which isn't actually about the issue at hand, but rather  
whether John is an idiot for expecting ns_returnfile to behave as  
documented.



cheers

Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread John Caruso

On Tuesday 05:39 PM 8/19/2008, Jim Davidson wrote:
Your right, the code snippet below could trip 
over a race condition as you've described.


It's not a race condition, actually; the code in 
that example was serialized, so there's no race involved.



...fastpath isn't broken.


It's designed in such a way that it can return 
incorrect results (and ones that are wildly 
outside of reasonable expectations).  Whether or 
not that's broken is a judgment call, and there 
we apparently differ, though I find that 
surprising--just because bad behavior can be 
documented and avoided doesn't mean it's not bad behavior.



Anyway, for your app, it might be easiest to not change your code but
instead write a new ns_returnfile to override the builtin -- maybe
just with open and ns_returnfp.


Yep, that was essentially my original suggestion 
to the developers.  I can guarantee you that all 
uses of ns_returnfile will be receiving close scrutiny. :-)


On Tuesday 05:59 PM 8/19/2008, Juan José del Río wrote:

If you don't want to deactivate it, and have some C skills, I would
recommend you to make the needed changes to fastpath code to enable it
to use the kernel facilities of the operating system (in case you're
using linux, then that'll be epoll system call; in FreeBSD case it's
kqueue; etc.).


This is an interesting suggestion, but from a 
quick scan of the epoll man page it doesn't look 
like it would work in this case since it acts on 
an open file descriptor, but fastpath associates 
file data with a (dev, inode, mtime, size) tuple 
without keeping an open file descriptor (and it'd 
be pretty wonky for AOLserver to keep open file 
descriptors for all files currently in the fastpath cache).


No matter, though, we've got plenty of 
workarounds, and we'll probably just disable 
fastpath entirely since the benefits are likely vanishingly small anyway.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-19 Thread Jim Davidson
One other idea could be to modify the code so that the Win32 behavior  
(cache by filename) could be made a configurable option (perhaps  
default on) instead of compile-time.  Would be easy to fiddle with  
code snippets like this to make that happen:


#ifdef _WIN32
key = file;
#else
ukey.dev = stPtr-st_dev;
ukey.ino = stPtr-st_ino;
key = (char *) ukey;
#endif


Note similar code is in ADP.  The downside to filename-based cache is  
the keys are bigger strings instead of small fixed sized structures  
(tiny) and double caching for files which are actually the same via  
symlinks (could be large or nothing if you're using symlinks).   Both  
cases result in more memory but perhaps safer results.  One positive  
is the ns_cache_keys, size, etc. commands would show results for  
fastpath objects by filename (I think they ignore non-string based  
cache keys now).


In fact, I suppose you could just add this to the top of fastpath.c  
and recompile to try it out:


#define _WIN32  1   /* Get Win32 filename-based cache keys */

As long as your filenames are unique names, this may give you the  
results you're looking for (although I still think a dynamic app using  
files should use open/cached fd's).


-Jim




On Aug 19, 2008, at 9:25 PM, John Caruso wrote:


On Tuesday 05:39 PM 8/19/2008, Jim Davidson wrote:
Your right, the code snippet below could trip over a race condition  
as you've described.


It's not a race condition, actually; the code in that example was  
serialized, so there's no race involved.



...fastpath isn't broken.


It's designed in such a way that it can return incorrect results  
(and ones that are wildly outside of reasonable expectations).   
Whether or not that's broken is a judgment call, and there we  
apparently differ, though I find that surprising--just because bad  
behavior can be documented and avoided doesn't mean it's not bad  
behavior.



Anyway, for your app, it might be easiest to not change your code but
instead write a new ns_returnfile to override the builtin -- maybe
just with open and ns_returnfp.


Yep, that was essentially my original suggestion to the developers.   
I can guarantee you that all uses of ns_returnfile will be receiving  
close scrutiny. :-)


On Tuesday 05:59 PM 8/19/2008, Juan José del Río wrote:

If you don't want to deactivate it, and have some C skills, I would
recommend you to make the needed changes to fastpath code to enable  
it

to use the kernel facilities of the operating system (in case you're
using linux, then that'll be epoll system call; in FreeBSD case  
it's

kqueue; etc.).


This is an interesting suggestion, but from a quick scan of the  
epoll man page it doesn't look like it would work in this case since  
it acts on an open file descriptor, but fastpath associates file  
data with a (dev, inode, mtime, size) tuple without keeping an open  
file descriptor (and it'd be pretty wonky for AOLserver to keep open  
file descriptors for all files currently in the fastpath cache).


No matter, though, we've got plenty of workarounds, and we'll  
probably just disable fastpath entirely since the benefits are  
likely vanishingly small anyway.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
 with the
body of SIGNOFF AOLSERVER in the email message. You can leave the  
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


[AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

Consider the following pseudocode snippet:

...generate file $myfile in some way...
ns_returnfile 200 text/plain $myfile
ns_unlink $myfile

If this snippet is executed in a tight loop on a Linux system, the chances 
of returning the wrong results are very high due to AOLserver's fastpath 
caching, which requires the following four attributes to be identical to 
consider a new file to be a cache hit (as per the FastReturn function in 
fastpath.c):


1) Same device number
2) Same inode number
3) Same modification time (within one second)
4) Same size

Assuming $myfile is always on the same filesystem, number 1 is taken care 
of, and Linux reuses inode numbers, so the creation and deletion of 
$myfile will typically result in a file with the same inode.  So in this 
example, files created within a given second that contains the same amount 
of data as a preceding file created within that same second will be 
considered identical, and will be erroneously served from cache.


This isn't just a hypothetical, BTW; a client of mine ran into this issue 
and spent many weeks trying to figure out what was happening before 
tracing it back to AOLserver's fastpath caching.  And the issue had 
existed for many years without being detected.


I'm mainly bringing this up to shine a light on the issue and see what 
other people's views are.  It's potentially a very serious issue given 
that it may silently corrupt data, and the fact that fastpath caching is 
enabled by default means that people may run into it without even knowing 
they're exposed to the danger.  The best workaround I can think of (short 
of a checksum, which would defeat the purpose of caching in the first 
place) would be to check that the mtime or ctime of the file is some 
threshold number of seconds (e.g. 1 or 2) less than the current time, and 
not serve the file from cache if it's not.  In other words, a file would 
have to be at least X seconds old (which could be a configurable value) 
before it could be served from the cache rather than from disk.


Thoughts?

- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
There is probably someone here that can directly address a better way to
do what you want, with ns_cache or some other scheme, but it looks like
your basic problem is saving rapidly changing data to disk or serving it
from cache. Why do this? If you data is changing faster than once per
second, don't keep a copy of it. It's not be a data corruption issue
because you are choosing to overwrite the old data with new data using
the exact same file name. If the data is important, don't overwrite it,
thus no corruption. 

But in general it is not a good idea to do things the way you are, which
is reading and writing the same file at the same time, which has nothing
to do with fastpath. You should use a cond/mutex to serialize access.

tom jackson

On Mon, 2008-08-18 at 12:33 -0700, John Caruso wrote:
 Consider the following pseudocode snippet:
 
  ...generate file $myfile in some way...
  ns_returnfile 200 text/plain $myfile
  ns_unlink $myfile
 
 If this snippet is executed in a tight loop on a Linux system, the chances 
 of returning the wrong results are very high due to AOLserver's fastpath 
 caching, which requires the following four attributes to be identical to 
 consider a new file to be a cache hit (as per the FastReturn function in 
 fastpath.c):
 
 1) Same device number
 2) Same inode number
 3) Same modification time (within one second)
 4) Same size
 
 Assuming $myfile is always on the same filesystem, number 1 is taken care 
 of, and Linux reuses inode numbers, so the creation and deletion of 
 $myfile will typically result in a file with the same inode.  So in this 
 example, files created within a given second that contains the same amount 
 of data as a preceding file created within that same second will be 
 considered identical, and will be erroneously served from cache.
 
 This isn't just a hypothetical, BTW; a client of mine ran into this issue 
 and spent many weeks trying to figure out what was happening before 
 tracing it back to AOLserver's fastpath caching.  And the issue had 
 existed for many years without being detected.
 
 I'm mainly bringing this up to shine a light on the issue and see what 
 other people's views are.  It's potentially a very serious issue given 
 that it may silently corrupt data, and the fact that fastpath caching is 
 enabled by default means that people may run into it without even knowing 
 they're exposed to the danger.  The best workaround I can think of (short 
 of a checksum, which would defeat the purpose of caching in the first 
 place) would be to check that the mtime or ctime of the file is some 
 threshold number of seconds (e.g. 1 or 2) less than the current time, and 
 not serve the file from cache if it's not.  In other words, a file would 
 have to be at least X seconds old (which could be a configurable value) 
 before it could be served from the cache rather than from disk.
 
 Thoughts?
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

On Monday 01:33 PM 8/18/2008, Tom Jackson wrote:

It's not be a data corruption issue
because you are choosing to overwrite the old data with new data using
the exact same file name. If the data is important, don't overwrite it,
thus no corruption.


No, you've misunderstood the scenario.  The file name needn't be the same 
to trigger this issue, and the corruption doesn't come from serving data 
out of a file that's changing, but rather because fastpath caching 
mistakenly identifies a new file as being identical to a previously-cached 
file (for the reasons I outlined) and erroneously serves the 
previously-cached data to the user.


This is a design limitation and arguably a bug in the fastpath caching 
implementation, which is potentially quite serious since it silently 
serves the wrong data to the user.  If you want a more straightforward 
(albeit contrived) demonstration of the problem, here you go:


   set file [open /var/tmp/myfile w]
   puts $file ABC123
   close $file
   ns_returnfile 200 text/plain /var/tmp/myfile
   ns_unlink -nocomplain /var/tmp/myfile

   set file [open /var/tmp/myotherfile w]
   puts $file XYZ987
   close $file
   ns_returnfile 200 text/plain /var/tmp/myotherfile
   ns_unlink -nocomplain /var/tmp/myotherfile

Assuming that /var/tmp/myfile and /var/tmp/myotherfile are created within 
the same second, the fastpath caching algorithm will misidentify them as 
the same file, and ns_returnfile will therefore erroneously return the 
(previously cached) contents of /var/tmp/myfile when it should be 
returning the (uncached) contents of /var/tmp/myotherfile.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
John,

Just to be clear: fastpath is for serving static content. This is not
what you are doing here, you are creating a temporary file to store
dynamic content. For your bug to work you must delete the old file and
create a new one within the same second, etc. 

Also, your code sequence below will leave temporary files around in the
case of a crash. If you want to safely serve the content from this
temporary storage, you should unlink after you finish creating it (no
other thread or process will be able to access the content, or you can
unlink before you write the content and even local users will not be
able to see the file. 

Then just send out the contents directly using the fd not the file
name.  

(maybe something like:

ns_return 200 [ns_guesstype $myfile] [read $fd]

Then you can close the fd, although AOLserver does that automatically at
the end of each request.

Now: why are you writing the content to disk? Can't you use a temp
variable.

tom jackson


On Mon, 2008-08-18 at 14:13 -0700, John Caruso wrote:
 On Monday 01:33 PM 8/18/2008, Tom Jackson wrote:
 It's not be a data corruption issue
 because you are choosing to overwrite the old data with new data using
 the exact same file name. If the data is important, don't overwrite it,
 thus no corruption.
 
 No, you've misunderstood the scenario.  The file name needn't be the same 
 to trigger this issue, and the corruption doesn't come from serving data 
 out of a file that's changing, but rather because fastpath caching 
 mistakenly identifies a new file as being identical to a previously-cached 
 file (for the reasons I outlined) and erroneously serves the 
 previously-cached data to the user.
 
 This is a design limitation and arguably a bug in the fastpath caching 
 implementation, which is potentially quite serious since it silently 
 serves the wrong data to the user.  If you want a more straightforward 
 (albeit contrived) demonstration of the problem, here you go:
 
 set file [open /var/tmp/myfile w]
 puts $file ABC123
 close $file
 ns_returnfile 200 text/plain /var/tmp/myfile
 ns_unlink -nocomplain /var/tmp/myfile
 
 set file [open /var/tmp/myotherfile w]
 puts $file XYZ987
 close $file
 ns_returnfile 200 text/plain /var/tmp/myotherfile
 ns_unlink -nocomplain /var/tmp/myotherfile
 
 Assuming that /var/tmp/myfile and /var/tmp/myotherfile are created within 
 the same second, the fastpath caching algorithm will misidentify them as 
 the same file, and ns_returnfile will therefore erroneously return the 
 (previously cached) contents of /var/tmp/myfile when it should be 
 returning the (uncached) contents of /var/tmp/myotherfile.
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jeff Rogers
While I'd agree this is a bug in fastpath, the real problem is that 
fastpath is being used at all in this case.  The intent of fastpath is 
to avoid reading a seldom-changed file from disk.  It happens to be used 
in ns_returnfile since that is the normal use case.  On unix the 
fastpath cache is keyed off the dev/inode probably to keep the hash key 
shorter.  Windows doesn't have device and inode numbers so it uses the 
filename as the hashkey, so it wouldn't run into this problem.


From the server side, this could be fixed by:
- adding in the filename to the hash key or checking that it is the same
- making ns_unlink flush the entry from the fastpath cache
- restricting what fastpath will cache - e.g., don't cache anything in 
/var/tmp or /tmp or a configuration-specified directory.

- adding a -nocache flag to ns_returnfile

All of these have pros and cons.

I don't think your suggestion of waiting for cache entries to age a 
second or two would work well, it just moves the race condition around 
and adds a whole lot of disk activity when a busy server is warming up - 
static files might be read a few dozen times instead of once.


Fixing it from the application side is much easier.  Just use 
ns_returnfp instead of ns_returnfile, on the open handle if you 
generated the file from tcl code and it's convenient to get the handle, 
otherwise by opening the file right there:


...generate file $myfile in some way...
set fp [open $myfile]
ns_returnfp 200 text/plain $fp
close $fp
ns_unlink $myfile

You'd probably lose some efficiency by not mmap-ing the file, but that's 
likely to be noise compared to generating the file in the first place.


-J

John Caruso wrote:

On Monday 01:33 PM 8/18/2008, Tom Jackson wrote:

It's not be a data corruption issue
because you are choosing to overwrite the old data with new data using
the exact same file name. If the data is important, don't overwrite it,
thus no corruption.


No, you've misunderstood the scenario.  The file name needn't be the 
same to trigger this issue, and the corruption doesn't come from 
serving data out of a file that's changing, but rather because fastpath 
caching mistakenly identifies a new file as being identical to a 
previously-cached file (for the reasons I outlined) and erroneously 
serves the previously-cached data to the user.


This is a design limitation and arguably a bug in the fastpath caching 
implementation, which is potentially quite serious since it silently 
serves the wrong data to the user.  If you want a more straightforward 
(albeit contrived) demonstration of the problem, here you go:


   set file [open /var/tmp/myfile w]
   puts $file ABC123
   close $file
   ns_returnfile 200 text/plain /var/tmp/myfile
   ns_unlink -nocomplain /var/tmp/myfile

   set file [open /var/tmp/myotherfile w]
   puts $file XYZ987
   close $file
   ns_returnfile 200 text/plain /var/tmp/myotherfile
   ns_unlink -nocomplain /var/tmp/myotherfile

Assuming that /var/tmp/myfile and /var/tmp/myotherfile are created 
within the same second, the fastpath caching algorithm will misidentify 
them as the same file, and ns_returnfile will therefore erroneously 
return the (previously cached) contents of /var/tmp/myfile when it 
should be returning the (uncached) contents of /var/tmp/myotherfile.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to 
[EMAIL PROTECTED] with the
body of SIGNOFF AOLSERVER in the email message. You can leave the 
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jade Rubick
I would call that a security issue then. Leaking the wrong data to the wrong
connection is pretty serious.

Jade

Jade Rubick
Director of Development
Truist
120 Wall Street, 4th Floor
New York, NY USA
[EMAIL PROTECTED]
+1 503 285 4963
+1 707 671 1333 fax



The information contained in this email/document is confidential and may be
legally privileged. Access to this mail/document by anyone other than the
intended recipient(s) is unauthorized. If you are not an intended recipient,
any disclosure, copying, distribution, or any action taken or omitted to be
taken in reliance to it, is prohibited.


On Mon, Aug 18, 2008 at 2:13 PM, John Caruso [EMAIL PROTECTED]wrote:

 On Monday 01:33 PM 8/18/2008, Tom Jackson wrote:

 It's not be a data corruption issue
 because you are choosing to overwrite the old data with new data using
 the exact same file name. If the data is important, don't overwrite it,
 thus no corruption.


 No, you've misunderstood the scenario.  The file name needn't be the same
 to trigger this issue, and the corruption doesn't come from serving data
 out of a file that's changing, but rather because fastpath caching
 mistakenly identifies a new file as being identical to a previously-cached
 file (for the reasons I outlined) and erroneously serves the
 previously-cached data to the user.

 This is a design limitation and arguably a bug in the fastpath caching
 implementation, which is potentially quite serious since it silently serves
 the wrong data to the user.  If you want a more straightforward (albeit
 contrived) demonstration of the problem, here you go:

   set file [open /var/tmp/myfile w]
   puts $file ABC123
   close $file
   ns_returnfile 200 text/plain /var/tmp/myfile
   ns_unlink -nocomplain /var/tmp/myfile

   set file [open /var/tmp/myotherfile w]
   puts $file XYZ987
   close $file
   ns_returnfile 200 text/plain /var/tmp/myotherfile
   ns_unlink -nocomplain /var/tmp/myotherfile

 Assuming that /var/tmp/myfile and /var/tmp/myotherfile are created within
 the same second, the fastpath caching algorithm will misidentify them as the
 same file, and ns_returnfile will therefore erroneously return the
 (previously cached) contents of /var/tmp/myfile when it should be returning
 the (uncached) contents of /var/tmp/myotherfile.


 - John


 --
 AOLserver - http://www.aolserver.com/

 To Remove yourself from this list, simply send an email to 
 [EMAIL PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the
 Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
Jade,

It is a security issue mostly because the code sequence is incorrect.
(which also means that ns_returnfile should not be used for temp file
return)

The safe way to do this is to open the temp file, then immediately
unlink it! Then write to the fd. 

BTW, this same bug exists in the ns_form/ns_conn files code which should
use fd's instead of files. We need a little code cleanup here.

tom jackson


On Mon, 2008-08-18 at 15:30 -0700, Jade Rubick wrote:
 I would call that a security issue then. Leaking the wrong data to the
 wrong connection is pretty serious.
 
 Jade
 
 Jade Rubick
 Director of Development
 Truist
 120 Wall Street, 4th Floor
 New York, NY USA
 [EMAIL PROTECTED]
 +1 503 285 4963
 +1 707 671 1333 fax
 
 
 
 The information contained in this email/document is confidential and
 may be legally privileged. Access to this mail/document by anyone
 other than the intended recipient(s) is unauthorized. If you are not
 an intended recipient, any disclosure, copying, distribution, or any
 action taken or omitted to be taken in reliance to it, is prohibited.
 
 
 On Mon, Aug 18, 2008 at 2:13 PM, John Caruso
 [EMAIL PROTECTED] wrote:
 On Monday 01:33 PM 8/18/2008, Tom Jackson wrote:
 It's not be a data corruption issue
 because you are choosing to overwrite the old data
 with new data using
 the exact same file name. If the data is important,
 don't overwrite it,
 thus no corruption.
 
 
 No, you've misunderstood the scenario.  The file name needn't
 be the same to trigger this issue, and the corruption
 doesn't come from serving data out of a file that's changing,
 but rather because fastpath caching mistakenly identifies a
 new file as being identical to a previously-cached file (for
 the reasons I outlined) and erroneously serves the
 previously-cached data to the user.
 
 This is a design limitation and arguably a bug in the fastpath
 caching implementation, which is potentially quite serious
 since it silently serves the wrong data to the user.  If you
 want a more straightforward (albeit contrived) demonstration
 of the problem, here you go:
 
   set file [open /var/tmp/myfile w]
   puts $file ABC123
   close $file
   ns_returnfile 200 text/plain /var/tmp/myfile
   ns_unlink -nocomplain /var/tmp/myfile
 
   set file [open /var/tmp/myotherfile w]
   puts $file XYZ987
   close $file
   ns_returnfile 200 text/plain /var/tmp/myotherfile
   ns_unlink -nocomplain /var/tmp/myotherfile
 
 Assuming that /var/tmp/myfile and /var/tmp/myotherfile are
 created within the same second, the fastpath caching algorithm
 will misidentify them as the same file, and ns_returnfile will
 therefore erroneously return the (previously cached) contents
 of /var/tmp/myfile when it should be returning the (uncached)
 contents of /var/tmp/myotherfile.
 
 
 
 - John
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 To Remove yourself from this list, simply send an email to
 [EMAIL PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can
 leave the Subject: field of your email blank.
 
 
 
 
 --
 AOLserver - http://www.aolserver.com/
 
 
 
 To Remove yourself from this list, simply send an email to [EMAIL 
 PROTECTED] with the
 body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
 field of your email blank.
 
 


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

On Monday 03:38 PM 8/18/2008, Jeff Rogers wrote:
While I'd agree this is a bug in fastpath, the real problem is that 
fastpath is being used at all in this case.  The intent of fastpath is to 
avoid reading a seldom-changed file from disk.


I'd agree that that's the intent, but the caching is hidden within 
ns_returnfile and it's not clear at all from the user's perspective that 
this alligator is lurking in the swamp.  Using ns_returnfile in this way 
may not be the best approach in any particular situation, but it's 
nonetheless a completely valid usage and isn't contraindicated in any 
AOLserver docs I've seen.


It's not difficult to come up with examples where it might happen, 
BTW...say, a web service that returns the result of an operating system 
command to a user.


I think Jade makes a good point that this is not only a bug but 
potentially a security issue.


It happens to be used in ns_returnfile since that is the normal use 
case.  On unix the fastpath cache is keyed off the dev/inode probably to 
keep the hash key shorter.  Windows doesn't have device and inode numbers 
so it uses the filename as the hashkey, so it wouldn't run into this 
problem.


No, it can still easily run into this problem--it's just that the file 
name needs to be the same in both cases (which actually did apply in my 
client's case, and caused confusion in the early debugging of the problem, 
since the assumption was that using the same file name and/or path name 
was the source of the problem).



From the server side, this could be fixed by:
- adding in the filename to the hash key or checking that it is the same


No go, as observed above.


- making ns_unlink flush the entry from the fastpath cache


Nope, since the file can be removed via (e.g.) exec rm.

- restricting what fastpath will cache - e.g., don't cache anything in 
/var/tmp or /tmp or a configuration-specified directory.

- adding a -nocache flag to ns_returnfile


This last is the one I'd considered as well, but the problem is that it 
puts the onus on the user to know that they should use the flag, and 
that's unlikely to be clear to them.


I don't think your suggestion of waiting for cache entries to age a 
second or two would work well, it just moves the race condition around 
and adds a whole lot of disk activity when a busy server is warming up - 
static files might be read a few dozen times instead of once.


Nope, not at all.  The only files that would get read more than once would 
be those that were served within one second of being generated--which 
wouldn't apply to any content that fits the definition of static.


So this is actually a fairly non-intrusive fix.  The main limitation is 
that it relies on the file timestamps and the server timestamps being 
synchronized, which may not always be true.  But I can't think of a better 
solution.  Simply put, fastpath caching is inherently broken because it's 
not possible to guarantee that the file in question really should be 
served from cache (again, short of a cache-defeating checksum).


Fixing it from the application side is much easier.  Just use ns_returnfp 
instead of ns_returnfile, on the open handle if you generated the file 
from tcl code and it's convenient to get the handle, otherwise by opening 
the file right there:


Yep, and that's more or less the workaround I'd suggested to my 
client.  But my point here wasn't to ask about potential workarounds but 
to highlight the issue itself, since I haven't seen it mentioned before.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:
 While I'd agree this is a bug in fastpath, the real problem is that 
 fastpath is being used at all in this case.  

I don't think it is a bug in fastpath. 

Think about the case where multiple logical files are actually the same
physical file. Using the name would result in caching the same object
under different names. This is a much more likely situation than this so
called bug.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Mon, 2008-08-18 at 16:20 -0700, John Caruso wrote:
 It's not difficult to come up with examples where it might happen, 
 BTW...say, a web service that returns the result of an operating system 
 command to a user.

The command is named ns_returnfile.

The expectation is that you are returning a file, not a web service
resource. 

The expectation is that the file will be around for longer than one
second before being deleted and replaced.

The fact that the documentation doesn't say this is unimportant. Inodes
are reused, this is part of how the filesystem works. You could run into
the same problem with an archive program. A file of the same inode,
name, size and age is created replacing the old file. Most archive
programs would not understand that the file contents had changed. Is it
a bug? No. It is called a practical limitation.

Anyway: no bug, just how it works. The only bug is how ns_returnfile is
being used in the example. 

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jade Rubick
Consider this use case:

   - You use git or another version control system to store for a bunch of
   static html files you serve with Aolserver.
   - You check out all of your static html files. Because they're all
   checked out at the same time, many of them have identical timestamps.

Could the user get the wrong version of an html file they're being served?

What about this scenario:

   - You have a web application that allows administrators on various sites
   hosted on your application to download a list of user names and passwords
   (this is a slightly contrived example). They can download it to CSV.
   - Admin #1 generates this file. You create a unique filename for their
   site_id, because you want a unique filename to return back to the user:
   site-1234-passwords.csv. You return this file to the admin.
   - Admin #2 generates their file. You create a unique filename for their
   site_id, because you want a unique filename to return back to the user:
   site-5000-passwords.csv. You attempt to return this file to the admin.
   Because their request was in the same second, however, they get
   site-1234-passwords.csv?

Do I understand the problem correctly? I think both of these scenarios are
pretty common examples of the way people use Aolserver currently, but I'm
not sure if I'm understanding correctly the bug.

Jade

Jade Rubick
Director of Development
Truist
120 Wall Street, 4th Floor
New York, NY USA
[EMAIL PROTECTED]
+1 503 285 4963
+1 707 671 1333 fax



The information contained in this email/document is confidential and may be
legally privileged. Access to this mail/document by anyone other than the
intended recipient(s) is unauthorized. If you are not an intended recipient,
any disclosure, copying, distribution, or any action taken or omitted to be
taken in reliance to it, is prohibited.


On Mon, Aug 18, 2008 at 4:20 PM, John Caruso [EMAIL PROTECTED]wrote:

 On Monday 03:38 PM 8/18/2008, Jeff Rogers wrote:

 While I'd agree this is a bug in fastpath, the real problem is that
 fastpath is being used at all in this case.  The intent of fastpath is to
 avoid reading a seldom-changed file from disk.


 I'd agree that that's the intent, but the caching is hidden within
 ns_returnfile and it's not clear at all from the user's perspective that
 this alligator is lurking in the swamp.  Using ns_returnfile in this way may
 not be the best approach in any particular situation, but it's nonetheless a
 completely valid usage and isn't contraindicated in any AOLserver docs I've
 seen.

 It's not difficult to come up with examples where it might happen,
 BTW...say, a web service that returns the result of an operating system
 command to a user.

 I think Jade makes a good point that this is not only a bug but potentially
 a security issue.

  It happens to be used in ns_returnfile since that is the normal use case.
  On unix the fastpath cache is keyed off the dev/inode probably to keep the
 hash key shorter.  Windows doesn't have device and inode numbers so it uses
 the filename as the hashkey, so it wouldn't run into this problem.


 No, it can still easily run into this problem--it's just that the file name
 needs to be the same in both cases (which actually did apply in my client's
 case, and caused confusion in the early debugging of the problem, since the
 assumption was that using the same file name and/or path name was the source
 of the problem).

  From the server side, this could be fixed by:
 - adding in the filename to the hash key or checking that it is the same


 No go, as observed above.

  - making ns_unlink flush the entry from the fastpath cache


 Nope, since the file can be removed via (e.g.) exec rm.

  - restricting what fastpath will cache - e.g., don't cache anything in
 /var/tmp or /tmp or a configuration-specified directory.
 - adding a -nocache flag to ns_returnfile


 This last is the one I'd considered as well, but the problem is that it
 puts the onus on the user to know that they should use the flag, and that's
 unlikely to be clear to them.

  I don't think your suggestion of waiting for cache entries to age a second
 or two would work well, it just moves the race condition around and adds a
 whole lot of disk activity when a busy server is warming up - static files
 might be read a few dozen times instead of once.


 Nope, not at all.  The only files that would get read more than once would
 be those that were served within one second of being generated--which
 wouldn't apply to any content that fits the definition of static.

 So this is actually a fairly non-intrusive fix.  The main limitation is
 that it relies on the file timestamps and the server timestamps being
 synchronized, which may not always be true.  But I can't think of a better
 solution.  Simply put, fastpath caching is inherently broken because it's
 not possible to guarantee that the file in question really should be served
 from cache (again, short of a cache-defeating checksum).

  Fixing 

Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jeff Rogers

John Caruso wrote:

I'd agree that that's the intent, but the caching is hidden within 
ns_returnfile and it's not clear at all from the user's perspective that 
this alligator is lurking in the swamp.  Using ns_returnfile in this way 
may not be the best approach in any particular situation, but it's 
nonetheless a completely valid usage and isn't contraindicated in any 
AOLserver docs I've seen.


This then is the real fix: mention it in the docs.  I put a blurb on the 
appropriate wiki pages; feel free to suggest something better :)

The docs in the distribution should be updated too.

It happens to be used in ns_returnfile since that is the normal use 
case.  On unix the fastpath cache is keyed off the dev/inode probably 
to keep the hash key shorter.  Windows doesn't have device and inode 
numbers so it uses the filename as the hashkey, so it wouldn't run 
into this problem.


No, it can still easily run into this problem--it's just that the file 
name needs to be the same in both cases (which actually did apply in my 
client's case, and caused confusion in the early debugging of the 
problem, since the assumption was that using the same file name and/or 
path name was the source of the problem).


The system needs to be free to do some things to improve performance 
with the understanding that the user needs to be aware of those things 
or risk bad behaviour.  I wouldn't call it an unreasonable assumption 
that a file with the same name (and same modtime etc) is the same file.
You can run into a very similar problem with NFS (i.e., attribute 
caching causing a modified file to appear not so) and people have 
learned to deal with that.



- making ns_unlink flush the entry from the fastpath cache


Nope, since the file can be removed via (e.g.) exec rm.


True, but I'd still put this in the system needs to be able to ... 
category above.  The system does some things and the developer should be 
aware of those things.


I don't think your suggestion of waiting for cache entries to age a 
second or two would work well, it just moves the race condition around 
and adds a whole lot of disk activity when a busy server is warming up 
- static files might be read a few dozen times instead of once.


Nope, not at all.  The only files that would get read more than once 
would be those that were served within one second of being 
generated--which wouldn't apply to any content that fits the definition 
of static.


It would work in your exact case, where the file is always removed 
immediately after being served and generated.  But if not, it would 
still come up with the wrong answer.


13:50:21 - create file
13:50:21 - serve file (gets cached)
13:50:21 - delete file
13:50:21 - create file again (reuses inode)
... time passes ...
13:55:11 - serve file

In this case the file modtime is more than a few seconds old, but the 
cached mtime, inode, etc. are still matching the file on disk, so the 
stale cache entry would get delivered.


There is also at least one clever optimization where static content 
does get served within a second of being created, where the 404 page is 
used to generate something like an image from something like a database 
and writes it to a file where it is subsequently served by fastpath.


So this is actually a fairly non-intrusive fix.  The main limitation is 
that it relies on the file timestamps and the server timestamps being 
synchronized, which may not always be true.  But I can't think of a 
better solution.  Simply put, fastpath caching is inherently broken 
because it's not possible to guarantee that the file in question really 
should be served from cache (again, short of a cache-defeating checksum).


The same can be said about nearly any caching system: it is unable to 
handle changes in the data that happen outside of the cache's control or 
knowledge.  This is just the bargain you make when you use a cache.


But my point here wasn't to ask about potential workarounds but to 
highlight the issue itself, since I haven't seen it mentioned before.


I think you highlighting it is most of the fix.  From there, get the 
caveat inserted into the documentation and the knowledge into the 
community so that the next person who runs into this problem will have 
an easier, or at least less frustrating time solving it.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jeff Rogers

Tom Jackson wrote:

Think about the case where multiple logical files are actually the same
physical file. Using the name would result in caching the same object
under different names. This is a much more likely situation than this so
called bug.


Huh, hard links - I sometimes forget about those.  It's a much more 
believable reason (than my previous suggestion of shortening the key) 
for why the inode was used instead of the filename for the hash key.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 9:37 AM, Tom Jackson wrote:

On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:

While I'd agree this is a bug in fastpath, the real problem is that
fastpath is being used at all in this case.


I don't think it is a bug in fastpath.


fastpath is making assumptions about what means something is the  
same file, and those assumptions are not consistent with unix  
filesystem semantics - how is this not a bug?


sure, the original use case that triggered this seems non-optimal,  
and could be done in other ways that don't trigger the bug, but that  
doesn't mean fastpath is behaving correctly...



Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Mon, 2008-08-18 at 16:56 -0700, Jade Rubick wrote:
 Consider this use case:
   * You use git or another version control system to store for a
 bunch of static html files you serve with Aolserver. 
   * You check out all of your static html files. Because they're
 all checked out at the same time, many of them have identical
 timestamps.
 Could the user get the wrong version of an html file they're being
 served?
 

No, because each file has a different inode. The bug requires that you
create and destroy one file and create another one within one second (so
they have the same timestamp) also required that the same inode is used
and that the file is the same exact size. 

But beyond that, hopefully your git checkout will maintain the original
timestamp with the file.


 What about this scenario:
   * You have a web application that allows administrators on
 various sites hosted on your application to download a list of
 user names and passwords (this is a slightly contrived
 example). They can download it to CSV.
   * Admin #1 generates this file. You create a unique filename for
 their site_id, because you want a unique filename to return
 back to the user: site-1234-passwords.csv. You return this
 file to the admin.
   * Admin #2 generates their file. You create a unique filename
 for their site_id, because you want a unique filename to
 return back to the user: site-5000-passwords.csv. You attempt
 to return this file to the admin. Because their request was in
 the same second, however, they get site-1234-passwords.csv?
 Do I understand the problem correctly? I think both of these scenarios
 are pretty common examples of the way people use Aolserver currently,
 but I'm not sure if I'm understanding correctly the bug.
 

The filename doesn't matter, neither does the source of the information.
Two different requests could create files. The requirement is that the
first is created and destroyed and the second is created within the same
second as the first, reuses the inode, has the exact same size. 

This is why you should not use linked files (with path names) as
temporary storage. Instead, open the file then unlink it (delete it from
the filesystem), then use it via the fd.

In short: there is no bug.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

On Monday 04:56 PM 8/18/2008, Jade Rubick wrote:

Consider this use case:
   * You use git or another version control system to store for a bunch 
of static html files you serve with Aolserver.
   * You check out all of your static html files. Because they're all 
checked out at the same time, many of them have identical timestamps.
Could the user get the wrong version of an html file they're being 
served?


Nope, because in this case the inodes for the files would be different, so 
fastpath caching would distinguish them.



What about this scenario:
   * You have a web application that allows administrators on various 
sites hosted on your application to download a list of user names and 
passwords (this is a slightly contrived example). They can download it 
to CSV.
   * Admin #1 generates this file. You create a unique filename for 
their site_id, because you want a unique filename to return back to the 
user: site-1234-passwords.csv. You return this file to the admin.
   * Admin #2 generates their file. You create a unique filename for 
their site_id, because you want a unique filename to return back to the 
user: site-5000-passwords.csv. You attempt to return this file to the 
admin. Because their request was in the same second, however, they get 
site-1234-passwords.csv?
Yep, it could happen in this case, assuming the files are deleted after 
they're returned to the user via ns_returnfile.


As I mentioned, this bug wasn't discovered through code review or any 
theoretical process--it was causing problems in live code, and it was 
extremely difficult to track down.  And the damage assessment is still 
underway.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 10:13 AM, Jeff Rogers wrote:

John Caruso wrote:

The system needs to be free to do some things to improve  
performance with the understanding that the user needs to be aware  
of those things or risk bad behaviour.  I wouldn't call it an  
unreasonable assumption that a file with the same name (and same  
modtime etc) is the same file.
You can run into a very similar problem with NFS (i.e., attribute  
caching causing a modified file to appear not so) and people have  
learned to deal with that.


the problem is that this can occur even if the filename is changed,  
and I'd argue that pretty convincingly violates the principle of  
least surprise.


yes, of course the system needs to make some assumptions about what  
it can optimise, but if the contents of /tmp/userinfo-71562 might get  
served back when I've asked for /tmp/userinfo-61453 then there's  
something wrong.



Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Tue, 2008-08-19 at 10:01 +1000, russell muetzelfeldt wrote:
 On 19/08/2008, at 9:37 AM, Tom Jackson wrote:
  On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:
  While I'd agree this is a bug in fastpath, the real problem is that
  fastpath is being used at all in this case.
 
  I don't think it is a bug in fastpath.
 
 fastpath is making assumptions about what means something is the  
 same file, and those assumptions are not consistent with unix  
 filesystem semantics - how is this not a bug?
 

No, fastpath is making the exact same assumptions that any archive
program would make, which is to record certain attributes at the time
something is cached and then compare them with the same attributes at a
later time. Unless you do a checksum or some other comparison, the cache
system doesn't work very well for the intended purpose. 

 sure, the original use case that triggered this seems non-optimal,  
 and could be done in other ways that don't trigger the bug, but that  
 doesn't mean fastpath is behaving correctly...

The use case is a bug. You can't violate the essential granularity of
the support system and call it a bug. The granularity is: inode, size,
timestamp. Now, if we could just slow down AOLserver so that this never
happens, that would be a great fix. 

This is like claiming that a checksum collision is a bug. No, it is
expected. We don't use things like checksums, or inode,size,time as a
key as a guarantee of anything. They are a compromise, in other words,
engineering. 

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jeff Rogers

russell muetzelfeldt wrote:

On 19/08/2008, at 9:37 AM, Tom Jackson wrote:

On Mon, 2008-08-18 at 15:38 -0700, Jeff Rogers wrote:

While I'd agree this is a bug in fastpath, the real problem is that
fastpath is being used at all in this case.


I don't think it is a bug in fastpath.


fastpath is making assumptions about what means something is the same 
file, and those assumptions are not consistent with unix filesystem 
semantics - how is this not a bug?


It's not a bug because no one ever said that it *was* strictly following 
unix filesystem semantics, which isn't even a single thing (ufs is 
slightly different than nfs, is slightly different than ext2 -noatime, 
is slightly different than afs, etc.)  It is following a particular 
definition: if the file still exists and has the same 
dev/inode/mtime/size as it did when you last checked, then it is the 
same file.   This of it as a if-modified-since or if-none-match 
conditional GET.


It is a bug in that it's not what you expect.  However in that case, the 
location of the bug is subject to debate.


-J



sure, the original use case that triggered this seems non-optimal, and 
could be done in other ways that don't trigger the bug, but that doesn't 
mean fastpath is behaving correctly...



Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to 
[EMAIL PROTECTED] with the
body of SIGNOFF AOLSERVER in the email message. You can leave the 
Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Tue, 2008-08-19 at 10:39 +1000, russell muetzelfeldt wrote:
 On 19/08/2008, at 10:13 AM, Jeff Rogers wrote:
  John Caruso wrote:
 
  The system needs to be free to do some things to improve  
  performance with the understanding that the user needs to be aware  
  of those things or risk bad behaviour.  I wouldn't call it an  
  unreasonable assumption that a file with the same name (and same  
  modtime etc) is the same file.
  You can run into a very similar problem with NFS (i.e., attribute  
  caching causing a modified file to appear not so) and people have  
  learned to deal with that.
 
 the problem is that this can occur even if the filename is changed,  
 and I'd argue that pretty convincingly violates the principle of  
 least surprise.
 
 yes, of course the system needs to make some assumptions about what  
 it can optimise, but if the contents of /tmp/userinfo-71562 might get  
 served back when I've asked for /tmp/userinfo-61453 then there's  
 something wrong.

If it were not for the fact that the same system is entirely responsible
for the situation, then I would agree. 

What you are really hoping for here is an idiot proof system. The big
hole in the reasoning here is that the important thing is the file name
with path, and that somehow this name is immutably linked to some
content. This is delusion. You want a transactional database but you are
using a filesystem. Grow up. 

BTW, fastpath has configuration parameters. Maybe bone up on those
first.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Bas Scheffers

On 19/08/2008, at 10:14 AM, Tom Jackson wrote:

No, fastpath is making the exact same assumptions that any archive
program would make, which is to record certain attributes at the time
something is cached and then compare them with the same attributes  
at a
Could the file name (just the name, not even the full path) not be  
added to the mix? Then using a random string as filename would make  
the problem go away, would it not?


Also, would it be possible to tell ns_returnfile to not use fastpath,  
if it is for one time use?


The alternative in this scenario would of course be to simply read the  
file and just ns_return it.


Bas.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 10:44 AM, Tom Jackson wrote:

On Tue, 2008-08-19 at 10:01 +1000, russell muetzelfeldt wrote:


sure, the original use case that triggered this seems non-optimal,
and could be done in other ways that don't trigger the bug, but that
doesn't mean fastpath is behaving correctly...


The use case is a bug. You can't violate the essential  
granularity of

the support system and call it a bug. The granularity is: inode, size,
timestamp. Now, if we could just slow down AOLserver so that this  
never

happens, that would be a great fix.


yes, that's exactly what I said - fastpath should be removed.


snark aside, if I say ns_returnfile /tmp/foo-abcd but nsd sends the  
contents of the now-deleted /tmp/bar-wxyz to the client then it's not  
doing what I've explicitly asked, and it's a bug.


just because the correct (imo) response is tag WONTFIX, document as  
a gotcha, document workaround doesn't mean that the behaviour is  
correct.



cheers

Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Jeff Rogers

Tom Jackson wrote:

No, because each file has a different inode. The bug requires that you
create and destroy one file and create another one within one second (so
they have the same timestamp) also required that the same inode is used
and that the file is the same exact size. 


But beyond that, hopefully your git checkout will maintain the original
timestamp with the file.


The bug conditions are actually slightly looser than this, because 
fastpath checks mtime and not ctime.  So a malicious user (or your 
version control system, if it makes the local files have the same 
timestamps as those in the repo) could overwrite a file at any point in 
the future, utime() it back to the same time and fastpath would still 
consider it the same.  So would any number of unix utilities, like 
rsync, tar, zip, etc.


Going back to my previous solutions, the only one on the server side 
that I still think is reasonable (names break hardlinks, cache flushing 
on unlink wasn't good in the first place, -nocache - why bother?) is to 
add a configuration option to exclude particular paths from fastpath. 
Actually not even a configuration option; that would involve a bit too 
much overhead for a marginal case; maybe a patch to fix this problem for 
users for whom it is a problem.


Using an unlinked file as a temporary is the right thing to do most of 
the time, but I imagine ti could be difficult to do when you need to 
pass the filename around to uncooperative external programs.


-J


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

On Monday 05:13 PM 8/18/2008, Jeff Rogers wrote:
Simply put, fastpath caching is inherently broken because it's not 
possible to guarantee that the file in question really should be served 
from cache (again, short of a cache-defeating checksum).


The same can be said about nearly any caching system: it is unable to 
handle changes in the data that happen outside of the cache's control or 
knowledge.  This is just the bargain you make when you use a cache.


I'd say nearly any is going too far, and in fact I'd say that for most 
caching systems to fail to return the correct data is a serious bug.  The 
NFS example you bring up isn't really analogous since it's only about 
attributes, which are frequently not a concern; were NFS to return 
incorrect *data* for a file, that would be a serious bug.  And in this 
case we're talking about a web server that may silently return data that's 
completely incorrect, which I'd say is very serious misbehavior.


The core problem here is that AOLserver is attempting to use the tuple of 
(dev, inode, mtime, size) as a unique determiner of a file's identity, and 
that's an inherently broken assumption--particularly so since the 
granularity of mtime is one second and inodes are reused on many 
filesystems (e.g. very common ones like ext3 and ufs).


I think you highlighting it is most of the fix.  From there, get the 
caveat inserted into the documentation and the knowledge into the 
community so that the next person who runs into this problem will have an 
easier, or at least less frustrating time solving it.


That'd be an improvement over the current situation, but it's still the 
case that AOLserver as currently shipped has a file cache mechanism built 
into it which 1) may return incorrect data and 2) is enabled by 
default.  Given the risk, I'd say fastpath caching should be disabled by 
default rather than enabled.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 11:06 AM, John Caruso wrote:

That'd be an improvement over the current situation, but it's still  
the case that AOLserver as currently shipped has a file cache  
mechanism built into it which 1) may return incorrect data and 2)  
is enabled by default.  Given the risk, I'd say fastpath caching  
should be disabled by default rather than enabled.


if someone's application is at risk of triggering this behaviour,  
that'd just delays any problem until their load is high enough that  
they need to turn on fastpath - and surely that's an even worse  
scenario.



cheers

Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Tue, 2008-08-19 at 11:04 +1000, russell muetzelfeldt wrote:
 snark aside, if I say ns_returnfile /tmp/foo-abcd but nsd sends the  
 contents of the now-deleted /tmp/bar-wxyz to the client then it's not  
 doing what I've explicitly asked, and it's a bug.
 
 just because the correct (imo) response is tag WONTFIX, document as  
 a gotcha, document workaround doesn't mean that the behaviour is  
 correct.

If your application wasn't the responsible party which violated the
expectation you state, I would agree (maybe).

The problem is that you think that the contents of a file remains
unchanged as long as the filename itself remains unchanged. 

Actually the problem is that someone is using a file to store volatile
data and then feeding this file through a cache. 

You really need to think about this insanity. Because it is insanity.

1. You waste time writing data to a file. 

2. You use ns_returnfile to send this data (reading from disk).

3. Fastpath puts this information into memory (taking space).

4. ns_returnfile uses the memory copy on later requests (but none
valid).

5. meanwhile the file is deleted, cache still exists taking up memory.

The above are ideal conditions. 

The bug is not in ns_returnfile.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 10:56 AM, Tom Jackson wrote:

You want a transactional database but you are using a filesystem.  
Grow up.


and

If your application wasn't the responsible party which violated the  
expectation you state, I would agree (maybe).




please go and re-read this thread, and get your parties straight.


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread Tom Jackson
On Tue, 2008-08-19 at 11:37 +1000, russell muetzelfeldt wrote:
 On 19/08/2008, at 10:56 AM, Tom Jackson wrote:
 
  You want a transactional database but you are using a filesystem.  
  Grow up.
 
 and
 
  If your application wasn't the responsible party which violated the  
  expectation you state, I would agree (maybe).
 
 
 
 please go and re-read this thread, and get your parties straight.

Sorry, I don't follow. 

Until someone explains to me why we need to be able to create and delete
a file (then return it via fastpath), then create another file in the
same second, I'll maintain that there is no bug in fastpath. 

The whole thing is a waste of time and space. We don't need to fix
ns_returnfile so that it is easier to waste time or space. 

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread russell muetzelfeldt

On 19/08/2008, at 11:59 AM, Tom Jackson wrote:

On Tue, 2008-08-19 at 11:37 +1000, russell muetzelfeldt wrote:

On 19/08/2008, at 10:56 AM, Tom Jackson wrote:


You want a transactional database but you are using a filesystem.
Grow up.


and


If your application wasn't the responsible party which violated the
expectation you state, I would agree (maybe).


please go and re-read this thread, and get your parties straight.


Sorry, I don't follow.


ok, I'll spell it out.

it's not my application that's violated the expectation I state. you  
haven't been paying attention to the From: headers, and seem to have  
mistaken me for the original poster of this thread.


all I've been saying is that ns_returnfile filename returning the  
content of something other than filename, contrary to the  
documentation and common sense, is a bug. given that fastpath exists  
for a (good) reason, and that the behaviour which triggers the bug is  
marginal anyway, the correct response is the bug will not be fixed,  
here's why, and here's how to work around it.



cheers

Russell


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.


Re: [AOLSERVER] Data corruption with fastpath caching

2008-08-18 Thread John Caruso

On Monday 06:21 PM 8/18/2008, russell muetzelfeldt wrote:

On 19/08/2008, at 11:06 AM, John Caruso wrote:

That'd be an improvement over the current situation, but it's still
the case that AOLserver as currently shipped has a file cache
mechanism built into it which 1) may return incorrect data and 2)
is enabled by default.  Given the risk, I'd say fastpath caching
should be disabled by default rather than enabled.


if someone's application is at risk of triggering this behaviour,
that'd just delays any problem until their load is high enough that
they need to turn on fastpath - and surely that's an even worse
scenario.


I'd say it's still better, because it requires explicit action on the 
user's part to enable the flawed caching mechanism in that case.  And 
actually I don't think fastpath in its default configuration would be of 
much help in performance terms these days, given that the cache is only 
5MB large and file data is typically cached by the OS anyway (and servers 
generally have far more RAM than they did even five years ago).


I do think this should have been considered (and steps taken to address 
it) when the fastpath caching mechanism was initially developed, since 
it's a glaring flaw.  I've designed things that rely on shaky underlying 
assumptions in the past, but only in controlled circumstances where those 
assumptions were guaranteed to obtain.  I can think of situations in which 
a caching mechanism with this type of design limitation wouldn't be an 
issue, but in my opinion it has no place being a default-enabled mechanism 
in an enterprise-grade web server.


- John


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED] 
with the
body of SIGNOFF AOLSERVER in the email message. You can leave the Subject: 
field of your email blank.