Hello.

I want to add the page title to the Squid log so that a user's browsing history can be reviewed.
Thanks to Henrik Nordstrom for his 2006 reply on this topic :-) http://www2.tr.squid-cache.org/mail-archive/squid-dev/200603/0009.html

Following his idea, I parse the web page content in the sendMoreData function of the client-side routines (client_side_reply.cc). When I find the page title, I log it to access.log using a new logformat token (for example "<tp").

But I have a problem:
The page title is not always logged.
For example, when I visit www.godaddy.com, I see its page title in the log.
When I visit www.nasa.gov, I don't see the title in the log :(
What did I do wrong? Maybe not all pages are delivered to the client through the client_side_reply::sendMoreData function?

Thanks for any ideas.

I made the following changes (Squid 3.1.10, FreeBSD 8.2-STABLE, amd64):

AccessLogEntry.h:
+ added char *title; to the AccessLogEntry class definition (public section, line 54);

access_log.cc:
+ added LFT_REPLY_PAGE_TITLE to the end of the enum logformat_bcode_t definition
+ added an element "<tp" for LFT_REPLY_PAGE_TITLE to struct logformat_token_table
+ added a new case to function accessLogCustom():
     case LFT_REPLY_PAGE_TITLE:
       if (al->title) {
          out = al->title;
          quote = 1;
          dofree = 1;
       }
       break;

client_side_reply.cc:
  In function sendMoreData() (line 2078) I added a block for parsing the buffer:
  if (http->al.title == NULL) {
    // search TITLE tag
    const char *tag1 = "<title>";
    const char *tag2 = "</title>";
    char *ans1 = strstr(buf, (char *)tag1, result.length-7); // search open tag in buf (length in result.length minus length of tag)
    if (ans1) {
      char *ans2 = strstr(ans1+7, (char *)tag2, result.length - (ans1-buf)-7); // search close tag in rest of buffer
      if (ans2) {
         int titlelen = ans2 - ans1 - 7;  // title length
         http->al.title = (char *)xcalloc(titlelen + 1,1);
         xstrncpy(http->al.title, &ans1[7], titlelen + 1);  // xstrncpy copies at most n-1 bytes, then adds the terminating NUL
      }
    }
  }
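
  One limitation of the block above: each call sees only one buffer, so a title that is split across two sendMoreData() calls would be missed. Whether that explains the www.nasa.gov case is not confirmed. The following is only a minimal sketch of searching across buffers; TitleScanner, feed() and the 64 KB limit are hypothetical names and values, not part of Squid, and the sketch does not handle a <title> tag that carries attributes.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Squid): collects body bytes across calls
// until a complete <title>...</title> pair has been seen.
class TitleScanner
{
public:
    explicit TitleScanner(std::size_t maxBytes = 65536) : limit(maxBytes), done(false) {}

    // Feed one body chunk; returns true once the title has been found.
    bool feed(const char *buf, std::size_t len) {
        if (done || seen.size() >= limit)
            return done;
        seen.append(buf, std::min(len, limit - seen.size()));

        // Lower-cased copy so that tag matching ignores case.
        std::string lower(seen);
        for (std::size_t i = 0; i < lower.size(); ++i)
            lower[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(lower[i])));

        const std::string openTag("<title>"), closeTag("</title>");
        const std::string::size_type start = lower.find(openTag);
        if (start == std::string::npos)
            return false;
        const std::string::size_type textStart = start + openTag.size();
        const std::string::size_type end = lower.find(closeTag, textStart);
        if (end == std::string::npos)
            return false;
        titleText = seen.substr(textStart, end - textStart);
        done = true;
        return true;
    }

    const std::string &title() const { return titleText; }

private:
    std::size_t limit;      // stop buffering after this many bytes
    bool done;              // true once a complete title was extracted
    std::string seen;       // bytes collected so far
    std::string titleText;  // extracted title, original case preserved
};
```

  In this sketch each buffer would be passed to feed() until it returns true, and title() would then be copied into the log entry. The whole accumulated text is rescanned on every call, which keeps the code short and is bounded by the byte limit.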

  Implementation of the strstr function:
  char * strstr (char *haystack, char *needle, int strlen)
  {
    char *start;
    int tmplen = 0;
    for (start=haystack; tmplen<strlen; start++,tmplen++) {
      char *p = needle;
      char *q = start;
      while ( *p != '\0' && *p == tolower(*q) ) {
        p++;
        q++;
      }
      if ( *p == '\0' )
        return start; // reached end of needle without mismatch
    }
    return NULL;
  }
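
  The inner loop above compares the needle without checking how many bytes of the buffer remain, so near the end of the buffer it can read past the data. A minimal sketch of a length-checked variant follows; findNoCase is a hypothetical name (chosen so that it does not overload the standard strstr) and is not part of Squid.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of Squid): case-insensitive search for a
// NUL-terminated needle inside a buffer of exactly hayLen bytes.
// Returns a pointer to the first match, or NULL if there is none.
const char *findNoCase(const char *hay, std::size_t hayLen, const char *needle)
{
    const std::size_t needleLen = std::strlen(needle);
    if (needleLen == 0 || needleLen > hayLen)
        return NULL;

    // Last start position that still leaves room for the whole needle.
    const std::size_t lastStart = hayLen - needleLen;
    for (std::size_t i = 0; i <= lastStart; ++i) {
        std::size_t j = 0;
        while (j < needleLen &&
               std::tolower(static_cast<unsigned char>(hay[i + j])) ==
               std::tolower(static_cast<unsigned char>(needle[j])))
            ++j;
        if (j == needleLen)
            return hay + i; // whole needle matched inside the buffer
    }
    return NULL;
}
```

  The caller passes the number of valid bytes in the buffer, not that number minus the tag length, because the function subtracts the needle length itself.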

---
Regards,
Sergey
