Hi Andrew,

On Tue, Apr 07, 2015 at 04:52:38PM -0500, Andrew Hayworth wrote:
> It's often undesirable to log query params - and in some cases, it can
> create legal compliance problems. This commit adds a new log format
> variable that logs the HTTP verb and the path requested sans query
> string (and additionally ommitting the protocol). For example, the
> following HTTP request line:
> 
>   GET /foo?bar=baz HTTP/1.1
> 
> becomes:
> 
>   GET /foo
> 
> with this log format variable.

Just a question, did you find any benefit in doing it with a new tag
compared to %[path] ? It may just be a matter of convenience, I'm just
wondering.

Other comments on the code below :

> +char* strip_uri_params(char *str) {
> +     int spaces = 0;
> +     int end = strlen(str);
> +
> +     int i;
> +     int path_end = end;
> +     for (i = 0; i < end; i++) {
> +             if (str[i] == ' ' && spaces == 0) {
> +                     spaces++;
> +             } else if (str[i] == '?' || (str[i] == ' ' && spaces > 0)) {
> +                     path_end = i;
> +                     break;
> +             }
> +     }

What's the purpose of counting spaces to stop at the second one ? You
cannot not have any in the path, so I'm a bit puzzled.

> +     char *temp = malloc(path_end + 1);

Please avoid inline declarations, not only they're prone to many bugs,
but they don't build on some older compilers.

Also please avoid mallocs as much as possible. In your case this function
returns a string for immediate use, there's no need for allocating, you
can copy the string directly into the trash.

>  /* return the name of the directive used in the current proxy for which we're
>   * currently parsing a header, when it is known.
>   */
> @@ -1539,6 +1564,27 @@ int build_logline(struct session *s, char *dst, size_t 
> maxsize, struct list *lis
>                               last_isspace = 0;
>                               break;
>  
> +                     case LOG_FMT_PATH: // %p
> +                             if (tmp->options & LOG_OPT_QUOTE)
> +                                     LOGCHAR('"');
> +                             uri = txn->uri ? txn->uri : "<BADREQ>";
> +                             ret = encode_string(tmplog, dst + maxsize,
> +                                                    '#', url_encode_map, 
> uri);
> +                             if (ret == NULL || *ret != '\0')
> +                                     goto out;
> +
> +                             char *sanitized = strip_uri_params(tmplog);
> +                             if (sanitized == NULL)
> +                                     goto out;
> +
> +                             tmplog += strlen(sanitized);
> +                             free(sanitized);

Here I don't understand, you seem to be doing malloc+strncpy+strlen+free
just to get an integer which is the position of the question mark in the
chain that is determined very earély in your strip function, is that it ?
If so, that's totally overkill and not even logical!

In this case I'd simply do something like this which looks much much
cleaner and more efficient :

                        case LOG_FMT_PATH: // %p
                                if (tmp->options & LOG_OPT_QUOTE)
                                        LOGCHAR('"');
                                uri = txn->uri ? txn->uri : "<BADREQ>";
                                ret = encode_string(tmplog, dst + maxsize,
                                                       '#', url_encode_map, 
uri);
                                if (ret == NULL || *ret != '\0')
                                        goto out;

+                               uri = strchr(tmplog, '?');
+                               tmplog = uri ? uri : ret;

                                if (tmp->options & LOG_OPT_QUOTE)
                                        LOGCHAR('"');
                                last_isspace = 0;
                                break;

Or am I missing something ?

Thanks,
Willy


Reply via email to