I'm trying to analyze web log entries like this one:
2004-03-28 09:32:52 81.192.2.222 - W3SVC1 DB db.jhuccp.org GET 
/dbtw-wpd/exec/dbtwpcgi.exe 
XC=%2Fdbtw-wpd%2Fexec%2Fdbtwpcgi.exe&BU=http%3A%2F%2Fdb..jhuccp.org%2Fcds%2Findex.htm&QB0=AND&QF0=TITRE+%7C+AUTEUR+%7C+EDITION+%7C+COLLATION+%7C+NOTES+%7C+DESCRIPTEURS+%7C+COTE&QI0=&QB1=AND&QF1=DESCRIPTEURS&QI1=&QB2=AND&QF2=TITRE&QI2=la+r%E9forme+hospitali%E9re&QB3=AND&QF3=AUTEUR&QI3=MSP&MR=20&TN=CDS-INAS&RF=web&DF=web&NP=3&AC=QBE_QUERY&MF=
 200 0 742 712 109 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+98;+Menara) 
- http://db.jhuccp.org/cds/index.htm 

I'm picking out the 10th space-delimited field, which starts with "XC=%2Fdbtw-" and 
ends with "QBE_QUERY&MF=", extracting the variables named "QIn" (QI0, QI1, ...), and 
printing them out. The one that's giving me trouble is 'QI2' in this example: 
"QI2=la+r%E9forme+hospitali%E9re". The web input form, which I have no control over, 
has inserted what I think are called HTML Entities into the string, like "%E9", for 
troublesome characters. Most often these seem to stand for ordinary punctuation, 
like '%29' for ')', but in the "%E9" example it looks like an accented character.
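
For illustration, the extraction step described above might look roughly like this 
in Perl. This is a minimal sketch, not the actual program: the sub name qi_fields 
and the shortened sample line are made up for the example, and the values are 
printed exactly as logged, with the "%E9" codes left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: return the QIn variables (QI0, QI1, ...) found in
# the 10th space-delimited field of one log line, values left as logged.
sub qi_fields {
    my ($line) = @_;
    my @fields = split ' ', $line;          # split on runs of whitespace
    return () unless @fields >= 10;
    my %qi;
    for my $pair (split /&/, $fields[9]) {  # 10th field is index 9
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && $name =~ /^QI\d+$/;
        $qi{$name} = defined $value ? $value : '';
    }
    return %qi;
}

# Shortened sample line (query string trimmed for readability).
my $sample = '2004-03-28 09:32:52 81.192.2.222 - W3SVC1 DB db.jhuccp.org GET '
           . '/dbtw-wpd/exec/dbtwpcgi.exe '
           . 'QB2=AND&QF2=TITRE&QI2=la+r%E9forme+hospitali%E9re&QI3=MSP&MR=20 '
           . '200 0 742 712 109 80 HTTP/1.1';

my %qi = qi_fields($sample);
print "$_=$qi{$_}\n" for sort keys %qi;
```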

Can anyone tell me exactly what's going on here? Right now, my program just uses a 
bunch of substitution commands to replace each code with its character, but I don't 
want to write out every special character this way. Is there a Perl tool or 
module which would do this for me? I tried using HTML::Entities, but that didn't work. 
If I just knew the proper term or keywords for what's going on, I could search for 
it. I tried searching the archives of this list for "ASCII" but didn't get anything 
pertaining to this problem.

Thanks for all your help and suggestions.

-Kevin Zembower

-----
E. Kevin Zembower
Unix Administrator
Johns Hopkins University/Center for Communications Programs
111 Market Place, Suite 310
Baltimore, MD  21202
410-659-6139

