On Feb 26, 2008, at 8:10 PM, Jeremy Wadsack wrote:
The format for these parameters is not typical for a web log. Usually
the query string is URL-escaped. In that case quote characters are
converted to their hex equivalent. In your example I would expect
something more like this:
F%2E+Scott+Fitzgerald%27s+evolving+American+dream:+the+
%22pursuit+of+happiness%22+in+Gatsby%2C+Tender+is+the+night
In this case Analog can parse the file just fine. For your files you
will probably need to pre-process the lines to convert them to
something Analog can support.
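
For reference, here is a minimal sketch of the URL-escaping described
above, using Python's standard urllib.parse module. The title string is
reconstructed from the example and is only an illustration. Note that
quote_plus leaves "." unescaped and escapes ":", so its output differs in
small details from the example above, though the quote characters are
converted the same way.

```python
from urllib.parse import quote_plus, unquote_plus

# Illustrative title, reconstructed from the escaped example above.
title = ('F. Scott Fitzgerald\'s evolving American dream: the '
         '"pursuit of happiness" in Gatsby, Tender is the night')

# Spaces become "+", and quote characters become %27 and %22.
escaped = quote_plus(title)
print(escaped)

# Decoding restores the original text.
print(unquote_plus(escaped) == title)
```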
From the Apache 2.2 documentation:
Some Notes
For security reasons, starting with version 2.0.46, non-printable and
other special characters in %r, %i and %o are escaped using \xhh
sequences, where hh stands for the hexadecimal representation of the
raw byte. Exceptions from this rule are " and \, which are escaped by
prepending a backslash, and all whitespace characters, which are
written in their C-style notation (\n, \t, etc.). In versions prior to
2.0.46, no escaping was performed on these strings so you had to be
quite careful when dealing with raw log files.
So, for Apache, at least, what I'm seeing is expected behavior.
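
If the lines do need pre-processing, the escaping rules quoted above can
be undone and the result re-written in percent-escaped form. The
following is only a rough sketch under a few assumptions: the function
names apache_unescape and to_url_escaped are made up for illustration,
the set of C-style escapes handled is a guess at what Apache emits, and
the list of characters left unescaped (including "%", so that existing
escapes are not doubled) is a choice you may want to change.

```python
import re
from urllib.parse import quote_from_bytes

# C-style escapes to decode (assumed set; the docs say "\n, \t, etc").
_C_ESCAPES = {"n": b"\n", "t": b"\t", "r": b"\r",
              "f": b"\f", "v": b"\v", "b": b"\b"}

# A backslash followed by either xhh or any single character.
_ESCAPE_RE = re.compile(r"\\(x[0-9A-Fa-f]{2}|.)")


def apache_unescape(field: str) -> bytes:
    """Undo Apache's log escaping and return the raw bytes."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE_RE.finditer(field):
        # Copy the literal text before this escape sequence.
        out += field[pos:m.start()].encode("utf-8")
        token = m.group(1)
        if len(token) == 3:
            # \xhh: hexadecimal representation of one raw byte.
            out.append(int(token[1:], 16))
        elif token in _C_ESCAPES:
            out += _C_ESCAPES[token]
        else:
            # \" and \\ (or an unknown escape): keep the character itself.
            out += token.encode("utf-8")
        pos = m.end()
    out += field[pos:].encode("utf-8")
    return bytes(out)


def to_url_escaped(field: str) -> str:
    """Re-encode an Apache-escaped request field with percent escapes."""
    # The safe list is an assumption; "%" is kept so that escapes already
    # present in the request are not encoded a second time.
    return quote_from_bytes(apache_unescape(field), safe="/?&=:+%")


# Hypothetical request fragment as it might appear in an Apache log.
sample = r'/search?q=the+\"pursuit+of+happiness\"+caf\xc3\xa9'
print(to_url_escaped(sample))
```

You would still need to pick out the request field from each log line
before applying this, which depends on your LogFormat.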