Re: [AOLSERVER] missing access log entries

Jim Davidson Fri, 03 Apr 2009 15:53:12 -0700

HI,

Sorry -- I wrote the original and goofy code leading to thisconfusion...

Based on this discussion I think it's NOT working as intended. Ifthere was a time I thought ns_adp_abort should mean "return NS_ERRORfor connection status and thus no-traces..." I've long sinceforgotten the rationale. What I remember is ns_adp_abort means"somebody did something weird but valid, e.g., with ns_returnXXX, sotry to stop Tcl execution and don't send the buffer content (ifany)...". In contrast, ns_adp_break means "ADP buffer is good butwe're done so try to stop executing Tcl code and send what'sthere...". The code is messy because the logical "break" is done byraising a Tcl error which is why the ADP exception is storedelsewhere. Also, I say "try to stop..." because I think one couldpossibly use "catch" to catch the attempt to unwind the Tcl stack andkeep executing code. I'm not aware of any draconian logical "jump"out of a script -- Tcl maintains state on the stack and it has to beunwound in the end.

Related, on the trace/filter/cleanup discussion, the code was writtenover successive versions and was never very smart or satisfying:

Traces: Stupid callbacks with no result code, they just all run ifthe connection is "ok" run in FIFO order

Filters: Slightly less stupid callbacks that could signal "stoprunning filters" or "don't bother with connection" (in case of pre-auth) in FIFO order

Conn cleanup: More specific and perhaps not stupid: Callbacks rununconditionally in LIFO order (suitable for cleanups). Evidently theuser could probe the connection context to see if things were bad orgood.

I think over the years I figured out that just skipping some callbacksbecause the connection was "bad" (itself a poorly defined state) wasdumb so the conn-cleanup stuff was introduced.

And, just as old as the dumb trace code is the accesslog module -- itwas written when only the trace stuff was around and has stayed thatway. I'm not sure if switching accesslog to use a conn-cleanup wouldwork -- it may be accessing fields in the connection which may not beinitialized on a request cancelled very early on. Anyway, it seemsfor the benefit of other filters (are there really any??), the ADPcode should be fixed so an "ns_adp_abort" is not an NS_ERROR.

BTW: I've wondered if accesslog is useful anyway -- seems like GoogleAnalytics gets you the insight you need on one end and a more specificlogging thing tied to the driver thread + cleanup would be useful forthe admin type folks. For example, I'm not sure you ever get a logentry for a invalid connection request or aborted socket connection --that's knowledge the driver thread would have (and would beinteresting to admin folks) but would never generate an accesslogbecause there would never have been any connection context to providefor logging.



-Jim









On Apr 3, 2009, at 3:11 PM, Tom Jackson wrote:

After reviewing the code from version 4.0 and up, it looks like the

intended behavior is to not log an access log entry if an erroroccurs.


In fact, the request was not serviced, so an entry in the access log

would be incorrect, the correct log entry would be a 500 internalserver

error.

I just found another bug, and the fact that errors are not returningan

internal server error message indicates that there are further bugs in
the way the code works.

My guess is that if the internal server handler runs that the logentry

will show up, we'll see.

Anyway, here is another test adp which exposes another bug:

<%

ns_adp_puts whatsup
ns_log Notice "Before include adp exception = [ns_adp_exception]"
ns_adp_include test.adp
ns_log Notice "adp exception = [ns_adp_exception]"
ns_log Notice "I shouldn't be running"

%>

The include file test.adp:

<%

ns_return 200 text/plain hi

ns_adp_abort

ns_log Notice "After abort, I shouldn't be running"

%>

<%

ns_log Notice "in separate adp block"

%>

With the bug the log notice after ns_adp_abort, inside the test.adpfile

does not execute (this is expected), but the statements after
ns_adp_include do run, this is bad. Execution should stop with the
include statement since test.adp returned ADP_ABORT.

Here is a fix for the bug from
line 1010 of nsd/adpeval.c
   /*
    * Clear the return exception and reset result.
    */
   switch (itPtr->adp.exception) {
   case ADP_OK:
        break;
   case ADP_RETURN:
        itPtr->adp.exception = ADP_OK;
        result = TCL_OK;
        break;
   case ADP_ABORT:
       result = TCL_ERROR;
       break;
   default:
        result = TCL_OK;
        break;
   }

But the code still has the bug which you identified, it may have to do
with an actual error not clearing the adp.exception setting.

tom jackson

On Fri, 2009-04-03 at 14:54 -0500, Andrew Steets wrote:

My original concern was with the access logging proc, which happensto

be run as a trace filter.  I think that the access log entries should
be generated regardless of whether or not ns_adp_abort is called.  I
don't care too much about anything else that was installed as a trace
filter.

Do you agree that access log entries should be generated if
ns_adp_abort is called?

-Andrew

On Fri, Apr 3, 2009 at 10:33 AM, Tom Jackson <[email protected]> wrote:
       Andrew,

       I wasted a little more time looking at the actual code. My
       impression is
       that everything is working as expected. If there is an error
       in a
       postauth filter or in adp processing (registered proc), trace
       filters
       are skipped. Until about 4.5, errors during preauth also
       skipped trace
       filters. Not sure why this change was made.

       The only think that matters is what happens in Ns_AdpRequest.
       If there
       were no errors, the request will be logged. In order to get
       ns_adp_abort
       to work correctly, the tcl result must be set to TCL_ERROR
       until code
       returns to Ns_AdpRequest. This is why an additional structure
       is
       maintained for the adp exception, which is independent of the
       tcl
       exception. In this case, adp.exception indicates what actually
       happened
       during adp processing.

       So things seem to be working as intended, and they have been
       working the
       same way for a long time. It might be possible that you are
       misusing
       ns_adp_abort, or something else is messing up.

       Could you provide a simple test case, probably a few nested
       adp
       includes, which repeats the issue? Without a test case of what
       you think
       should work differently, it is hard to give any more advice.

       In general, when an error occurs during a request, the
       response is by
       definition an error response, so the original request might
       get
       transformed into an internal redirect to your error handling
       page. An
       error in this page, or a missing error page could cause
       further
       problems.

       Bottom line: no reason to believe that this is a bug.

       tom jackson

       p.s. this case seems to validate my belief that the hardest
       bug to find
       and fix is one that doesn't actually exist.

       On Thu, 2009-04-02 at 18:03 -0500, Andrew Steets wrote:

What was the original purpose of "trace" filters?  At the C

       API level

there is a distinction between between a trace filter and a

       cleanup

callback, but it doesn't look like you can register a

       cleanup proc

from TCL.  Maybe this was mistakenly omitted?

The cleanup procs run unconditionally.  It seems like that

       is the most

appropriate place to handle "cleanup of resources."

        Alternatively we

could change the trace filters to run regardless of the
Ns_ConnRunRequest() return status, but then that would make

       them

basically the same as the cleanups.

I looked a little deeper into the source.  The confusion

       seems to

arise in NsAdpFlush() which is run at the end of all ADP

       processing.

The code there is smart enough to recognize when an abort

       exception

has been signalled; it sets the TCL result to "adp flush

       disabled: adp

aborted", but it still returns TCL_ERROR.  That is

       essentially where

the TCL exception gets turned into a full blown connection

       processing

error.  We could change NsAdpFlush() to return success when

it

recognizes the abort exception, or just not run NsAdpFlush()

       for abort

exceptions.

There would still be cases where trace filters would not run

       though.

For instance if you called ns_returnxxx without calling

       ns_adp_abort.

I'm not sure if that is a bad thing.

It would be nice to hear from anyone who knows about the

       original

motivation for the trace and cleanup filters.

-Andrew

On Thu, Apr 2, 2009 at 3:53 PM, Tom Jackson

       <[email protected]> wrote:

       Gustaf,

       You may be "using" traces but not realize it, it

       sounds like

       ns_adp_abort isn't don't what was originally

       intended.


       I wouldn't worry about an runtime error caused

       during running

       traces, it
       would be an error to even use ns_adp_abort in a

       trace filter

       because the
       connection is already finished. This is analogous to

       calling

       [break]
       outside of a loop.

       It seems important to consider ns_adp_abort,

       ns_adp_return and

       ns_adp_break as a unit. They add necessary loop type

       controls

       so that
       developers can create deeply nested code and still

       get out of

       it without
       the need to use [catch]. But, like a lot of

       AOLserver specific

       procedures, there is no hand-holding in their use.

       They can be

       misued.

       In this particular case, it looks like somewhere

       along the

       way,
       ns_adp_abort was modified to not work as expected.

       The desired effect is exactly what you would get by

       returning

       filter_return from a preauth or postauth filter.

       This effect

       is to skip
       to trace filters, not past them.

       Skipping trace filters even on an aborted connection

       would be

       a disaster
       for any application which relies on cleanup of

       resources.


       tom jackson


       On Thu, 2009-04-02 at 11:12 +0200, Gustaf Neumann

       wrote:

Andrew Steets schrieb:

The patch I sent earlier seems to fulfill these

       needs, but

       I am

worried about corner cases where LogTrace (from

       the nslog

       module)

could blow up.  Nothing about the state of the

       Conn *

       seems to be

guaranteed when the ConnCleanup callbacks are

       called.

Dear Andrew,

i think most (all?) of the repondents seems to

       agree that

       writing in the

about case to
the access log file. For me there are still two

       quesions

       open:


a) is it possoble to call ns_adp_abort at some

       time, where

       the server

might crash
  (in normal operations, everthing looks fine to

me,

       problems might

occur in
  when called from some traces; other calls are

       likely to

       have similar

problems)

b) the patch replaces the call to the regular

       server trace

       by a

connection cleanup call.
  this means, at least in 4.5.*, ns_adp_abort

       seems to

       cancel all

traces (also
  these registered with ns_register_trace). Is

       this

       desired?


  From Tom's website:

       http://rmadilo.com/files/nsapi/ns_adp_abort.html

  the doc of ns_adp_abort says

  ... Every ns_returnxxx call in an ADP should be

       followed

       with a call

   to ns_adp_abort....

  With this recommendation, cancelling traces

       seem wrong to

       me; or at

least,
  this should be documented.

We don't use traces, all of OpenACS does not use

       it, so this

       is no

current issue for us.

-gustaf neumann


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an

       email to

       <[email protected]> with the

body of "SIGNOFF AOLSERVER" in the email message.

       You can

       leave the Subject: field of your email blank.



       --
       AOLserver - http://www.aolserver.com/

       To Remove yourself from this list, simply send an

       email to

       <[email protected]> with the
       body of "SIGNOFF AOLSERVER" in the email message.

       You can

       leave the Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/



To Remove yourself from this list, simply send an email to

       <[email protected]> with the

body of "SIGNOFF AOLSERVER" in the email message. You can

       leave the Subject: field of your email blank.



       --
       AOLserver - http://www.aolserver.com/

       To Remove yourself from this list, simply send an email to
       <[email protected]> with the
       body of "SIGNOFF AOLSERVER" in the email message. You can
       leave the Subject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <[email protected]> with thebody of "SIGNOFF AOLSERVER" in the email message. You can leave theSubject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <[email protected]> with thebody of "SIGNOFF AOLSERVER" in the email message. You can leave theSubject: field of your email blank.



--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to 
<[email protected]> with the
body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: 
field of your email blank.

Re: [AOLSERVER] missing access log entries

Reply via email to