If I create an HTML file named æøå.html and access it through Apache, the access log says "GET /%C3%A6%C3%B8%C3%A5.html". It seems to URL-encode the name. If I then delete the HTML file and try to access it again, the error log says "File does not exist: /var/www/\xc3\xa6\xc3\xb8\xc3\xa5.html". That is not readable either, so maybe it's Apache's fault.

I don't know how to capture the raw e-mails that RT sends out for dashboard subscriptions. RT sends them through Postfix on the local server. Please tell me if you know how.
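For reference, both log forms look like the same UTF-8 bytes for "æøå", just escaped in two different ways: percent-encoding in the access log and `\xNN` escapes in the error log. Below is a minimal Python sketch for turning either form back into readable text. It assumes the logged bytes really are UTF-8, and the function names are made up for illustration.

```python
import re
from urllib.parse import unquote


def decode_access_log_path(path: str) -> str:
    """Undo percent-encoding, treating the escaped bytes as UTF-8 (assumption)."""
    return unquote(path, encoding="utf-8")


def decode_error_log_path(path: str) -> str:
    """Turn literal \\xNN escapes back into bytes, then decode them as UTF-8."""
    raw = re.sub(
        rb"\\x([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        path.encode("ascii"),
    )
    return raw.decode("utf-8")


if __name__ == "__main__":
    # The two strings below are copied from the log lines quoted above.
    print(decode_access_log_path("/%C3%A6%C3%B8%C3%A5.html"))
    print(decode_error_log_path(r"/var/www/\xc3\xa6\xc3\xb8\xc3\xa5.html"))
```

If both calls print the original file name, the bytes in the logs are intact and only their display is escaped.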
> Date: Tue, 31 Jul 2012 12:09:57 -0400
> From: [email protected]
> To: [email protected]
> Subject: Re: [rt-users] Charset for logs
>
> On Mon, Jul 23, 2012 at 09:36:27AM +0200, Ole Jon Bjørkum wrote:
> > RT is installed from the Ubuntu repository, and the installation seems to log to
> > /var/log/syslog and /var/log/apache2/error.log. However, I just discovered that it is only the
> > Apache log that has charset problems. The syslog shows all characters correctly. Also, the
> > Apache log logs in GMT while the syslog logs in the correct timezone, but I guess that is how
> > it's supposed to be.
>
> RT prints logs in GMT, when those pass through syslog, syslog will add
> an additional timestamp. Apache however keeps the RT timestamps.
> Is it just RT's messages in the apache logs that are corrupt, or is
> something as simple as a request to /Test/latin1pagename.html
> corrupted in the access/error log? RT should be pushing out UTF-8 but
> I'm not sure if RT is doing something wrong or if apache is corrupting
> it.
>
> > I'm not quite sure what you mean by raw subject line.
> > This is what shows up in Outlooks internet headers: Alle nye og ?pne saker
> > This is how the subject line looks in Outlook: Alle nye og **pne saker jeg eier
> > The question mark should be the character "å", so the word should be "åpne"
> > The message body uses the correct charset (I can see that UTF-8 is specified in the HTML).
>
> I mean the raw on-disk header. Subject: lines are encoded if they
> contain UTF-8, so something like this:
> Subject: =?UTF-8?B?4pyIVEhSRUUgQ29vbCBEZWFscyBGcm9tIEFtZXJpY2FuIEFpcmxpbmVz?=
> If you have an email that is consistently corrupted when passing
> through RT, if you can capture a raw version of the email (so not the
> .msg file from Outlook, but something caught further upstream, before
> it gets to rt-mailgate preferably) please zip it up and send it into
> the RT bug tracker, along with your System Configuration page which
> contains a ton of information such as perl module versions, some of
> which are known-bad.
>
> -kevin
>
> > > Date: Fri, 20 Jul 2012 08:17:45 -0700
> > > From: [email protected]
> > > To: [email protected]
> > > Subject: Re: [rt-users] Charset for logs
> > >
> > > On Fri, Jul 20, 2012 at 09:24:22AM +0200, Ole Jon Bjørkum wrote:
> > > > Ever since we started to use RT (before 3.8.7, now 4.0.4), it doesn't seem to use the correct
> > > > charset for logging. All norwegian characters (æøå) becomes: **. I can see this because we
> > > > have scrips that contain norwegian characters, and every time a scrip is launched, it is
> > > > logged to the Apache log.
> > >
> > > How are you logging, Syslog, Screen, File? RT has several different ways
> > > to log and it's impossible to test without knowing.
> > >
> > > > Today I also noticed that if I subscribe to a dashboard with
> > > > norwegian characters in its name, the subject of the email sent out also have this problem (**
> > > > instead of æ, ø or å). The email body however, has the correct charset. There is no charset
> > > > problems in the web UI. How can this be fixed?
> > >
> > > Please provide a raw Subject: line so we can see what's going on.
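
As a side note on the encoded Subject: example quoted above: a raw header of the form `=?UTF-8?B?...?=` can be checked with Python's standard `email.header` module. This is only a sketch for inspecting a captured header by hand, not how RT itself handles headers, and `decode_subject` is a made-up helper name.

```python
from email.header import decode_header, make_header


def decode_subject(raw_value: str) -> str:
    """Decode an encoded-word header value (=?charset?B?...?=) to readable text."""
    return str(make_header(decode_header(raw_value)))


if __name__ == "__main__":
    # Encoded value copied from the example Subject: line in the quoted message.
    encoded = "=?UTF-8?B?4pyIVEhSRUUgQ29vbCBEZWFscyBGcm9tIEFtZXJpY2FuIEFpcmxpbmVz?="
    print(decode_subject(encoded))
```

If a captured dashboard e-mail has a Subject: in this form and it decodes to the right Norwegian characters, the corruption is probably happening on the receiving side. If the raw header already contains a bare "?" instead, the character was lost before the message was sent.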
