Perhaps this is a controversial subject, like asserts in code that
people expect to run their businesses on (what, trace?  just use strace
and tcpdump, you idiot :) ).

We continue to have problems in the network filters (at least
core_input) under certain network interactions.  We continue to need to
reproduce some problems in more detail than we can pick up from the
access log.  Greg Ames and I have shot a number of bugs on apache.org
using the following patch, which saves up to the first 1024 bytes of
each of the first 20 reads on a socket.  The theory is that if I get a
segfault I want to see what data we read from the network and in what
fragments.

Index: srclib/apr/include/arch/unix/networkio.h
===================================================================
RCS file: /home/cvs/apr/include/arch/unix/networkio.h,v
retrieving revision 1.48
diff -u -d -b -r1.48 networkio.h
--- srclib/apr/include/arch/unix/networkio.h    2001/07/16 20:36:59     1.48
+++ srclib/apr/include/arch/unix/networkio.h    2001/07/18 21:59:43
@@ -55,6 +55,8 @@
 #ifndef NETWORK_IO_H
 #define NETWORK_IO_H
 
+#define APR_DEBUG_NET
+
 #include "apr.h"
 #include "apr_private.h"
 #include "apr_network_io.h"
@@ -122,6 +124,15 @@
 #define POLLNVAL 32
 #endif
 
+#ifdef APR_DEBUG_NET
+struct apr_debug_net_buffer {
+    struct apr_debug_net_buffer *next;
+    apr_int32_t saved_len;
+    apr_int32_t actual_len;
+    char data[1]; /* actual data starts here */
+};
+#endif
+
 struct apr_socket_t {
     apr_pool_t *cntxt;
     int socketdes;
@@ -135,6 +146,10 @@
     int local_port_unknown;
     int local_interface_unknown;
     apr_int32_t netmask;
     apr_int32_t inherit;
+#ifdef APR_DEBUG_NET
+    struct apr_debug_net_buffer *head;
+    int num_saved_buffers;
+#endif
 };
 
--- /home/gregames/sendrecv.c.2_0_23virgin      Sat Aug 11 10:21:39 2001
+++ srclib/apr/network_io/unix/sendrecv.c       Sat Aug 11 11:05:26 2001
@@ -54,9 +54,14 @@
 
 #include "networkio.h"
 
+#ifdef APR_DEBUG_NET
+#include <stddef.h>
+#endif
+
 #if APR_HAS_SENDFILE
 /* This file is needed to allow us access to the apr_file_t internals. */
 #include "fileio.h"
+#include <assert.h>
 #endif /* APR_HAS_SENDFILE */
 
 apr_status_t apr_wait_for_io_or_timeout(apr_socket_t *sock, int for_read)
@@ -158,6 +163,20 @@
        sock->netmask |= APR_INCOMPLETE_READ;
     }
     (*len) = rv;
+#ifdef APR_DEBUG_NET
+    if (sock->num_saved_buffers < 20)
+    {
+        apr_size_t bytes_to_save = (*len > 1024) ? 1024 : *len;
+        struct apr_debug_net_buffer *new = apr_palloc(sock->cntxt,
+                                                      offsetof(struct apr_debug_net_buffer, data) + bytes_to_save);
+        memcpy(new->data, buf, bytes_to_save);
+        new->saved_len = bytes_to_save;
+        new->actual_len = *len;
+        new->next = sock->head;
+        sock->head = new;
+        ++sock->num_saved_buffers;
+    }
+#endif
     if (rv == 0) {
         return APR_EOF;
     }

This has been extremely useful, though it doesn't show what the core
input filter did with the data.  But it is ugly: it has no way to pick
up hints from the Apache configuration file, it can't make use of the
Apache trace, etc.
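
For illustration, here is a minimal sketch of how the saved fragments
might be printed in arrival order.  The patch pushes each read onto the
head of the list, so the list is newest-first and has to be reversed
before it reads naturally.  Everything here is an assumption rather
than part of the patch: the struct is a stand-in that uses int32_t and
a C99 flexible array member instead of apr_int32_t and data[1], it
allocates with malloc() instead of apr_palloc(), and push_fragment(),
reverse_fragments() and dump_fragments() are hypothetical helper names.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SAVED 1024  /* same per-read cap the patch uses */

/* Stand-in for struct apr_debug_net_buffer (assumption: not the APR type). */
struct debug_net_buffer {
    struct debug_net_buffer *next;
    int32_t saved_len;   /* bytes actually copied into data */
    int32_t actual_len;  /* bytes the read really returned */
    char data[];         /* saved bytes start here */
};

/* Hypothetical helper: record one read the way the patch does (push on head). */
static struct debug_net_buffer *push_fragment(struct debug_net_buffer *head,
                                              const char *buf, size_t len)
{
    size_t to_save = len > MAX_SAVED ? MAX_SAVED : len;
    struct debug_net_buffer *b =
        malloc(offsetof(struct debug_net_buffer, data) + to_save);
    if (b == NULL) {
        return head;     /* tracing is best effort; drop the fragment */
    }
    memcpy(b->data, buf, to_save);
    b->saved_len = (int32_t)to_save;
    b->actual_len = (int32_t)len;
    b->next = head;
    return b;
}

/* Hypothetical helper: reverse the newest-first list so it is oldest-first. */
static struct debug_net_buffer *reverse_fragments(struct debug_net_buffer *head)
{
    struct debug_net_buffer *prev = NULL;
    while (head != NULL) {
        struct debug_net_buffer *next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

/* Hypothetical helper: one line per fragment, non-printable bytes escaped.
 * Returns the number of fragments printed. */
static int dump_fragments(FILE *out, const struct debug_net_buffer *b)
{
    int n = 0;
    for (; b != NULL; b = b->next, ++n) {
        fprintf(out, "read %d: %d bytes (%d saved): ",
                n, (int)b->actual_len, (int)b->saved_len);
        for (int32_t i = 0; i < b->saved_len; ++i) {
            unsigned char c = (unsigned char)b->data[i];
            if (isprint(c)) {
                fputc(c, out);
            } else {
                fprintf(out, "\\x%02x", c);
            }
        }
        fputc('\n', out);
    }
    return n;
}
```

In a debugger the same walk can be done by hand from sock->head; the
helper only matters if the dump is produced by code, such as a crash
handler.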

The question is how to put this in an Apache module, as well as what
to do about the output side.  (My suspicion is that the current CPU
spikes on daedalus are due to faulty I/O interactions with clients on
bad connections, but I don't know how to verify that.  It would be
great to have a module that could count the total write-style calls on
a connection, count the failed write-style calls (EAGAIN), and make
both available to the logging facility.  Comparing that to the CPU
ticks Greg is already logging would be cool.)

For putting this in a module, the first thing I thought of was
replacing the code that creates a socket bucket so that it instead
creates some tracing-socket bucket registered by my module.  But that
code is in core_input.  It would be crazy to try to replace
core_input.

I think Ryan's work on making the I/O replaceable (which conceivably
would be fine for a module that wanted to trace; it would just invoke
the real I/O under the covers) requires that core_input and
core_output be replaced.  But those functions are part of what I need
to work around, because by putting some sort of trace function on top
I want to see what they did with the data they received.  And
replacing core_input and core_output is a showstopper anyway.  What we
have in the core code is kind of fragile.  To think of having to
maintain that in a module is pretty scary to me.

Thoughts?  Help!!!

(I don't think there is a problem tracing what goes on between core
in/out and higher-layer filters.  I once had a module that traced the
buckets (or if you chose, just the types and lengths) that passed
between filters.  It was registered as a content filter, though with
some trickery I suspect I can get it added anywhere I choose.)
-- 
Jeff Trawick | [EMAIL PROTECTED] | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...
