I'm looking for additional information with respect to parsing the user-agent
field of the web log (IIS for now) in order to identify:
        (a) the operating system (and flavors)
        (b) the browser type (and version)
        (c) likely robots or agents

For (a) and (b), I was wondering if the logic used by Analog is available.
(Should I just look in the Analog source code?)  Furthermore, is there
some sort of standard format that different OSes and browsers follow in
filling the user-agent field?  If so, would someone kindly give me a
pointer?

I'm actually more interested in being able to identify robots and agents
that access the sites I'm analysing, so that they can be excluded, at least
from some reports.  The interpretation of the traffic statistics is often
heavily influenced by bot/agent requests.  E.g., on one site I'm monitoring,
roughly 25% of page requests come from Keynote agents, identifiable by
user-agent="Mozilla/4.0+(compatible;+Keynote-Perspective+4.0)".
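To make the idea concrete, here is a minimal sketch of the kind of check I
have in mind. It assumes the user-agent value has already been pulled out of
the log line and that, as in the example above, spaces appear as '+'. The
marker list and the function names are hypothetical, purely for illustration;
this is not how Analog does it, and a real '+' inside a user-agent string
would be turned into a space by this naive approach.

```python
import re

# Hypothetical substrings that suggest a robot or monitoring agent.
# "keynote" comes from the example above; the rest are guesses.
ROBOT_MARKERS = ("keynote", "crawler", "spider", "robot")


def normalize_user_agent(raw):
    """Turn the '+' separators seen in the logged field back into spaces."""
    return raw.replace("+", " ")


def is_probable_robot(raw_user_agent):
    """Return True if any known marker appears in the user-agent string."""
    ua = normalize_user_agent(raw_user_agent).lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def comment_tokens(raw_user_agent):
    """Return the ';'-separated tokens inside the first parenthesised part."""
    ua = normalize_user_agent(raw_user_agent)
    match = re.search(r"\(([^)]*)\)", ua)
    if not match:
        return []
    return [tok.strip() for tok in match.group(1).split(";") if tok.strip()]


if __name__ == "__main__":
    sample = "Mozilla/4.0+(compatible;+Keynote-Perspective+4.0)"
    print(is_probable_robot(sample))
    print(comment_tokens(sample))
```

A substring match like this only catches agents that announce themselves, so
it would need a maintained list of markers to be useful in practice.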

I've looked into the information on robot exclusion proposals (see
http://info.webcrawler.com/mak/projects/robots/robots.html and
www.kollar.com/robots) but these have weaknesses in practice, as compliance
is voluntary.  I also don't know how current or complete the Web Robots
Database on the webcrawler site is, were I to use it.

Has anyone made much progress in detecting robots in conjunction w/ Analog
reports?  Also, are other Analog users interested in adding some sort of
robot/agent reporting capability, or at least a Robot/Agent category in
Analog's browser reports?

thanks,

Colin Cunningham

Intel Online Services


------------------------------------------------------------------------
This is the analog-help mailing list. To unsubscribe from this
mailing list, send mail to [EMAIL PROTECTED]
with "unsubscribe" in the main BODY OF THE MESSAGE.
List archived at http://www.mail-archive.com/[email protected]/
------------------------------------------------------------------------
