"Sort of" fixed in commit cb0d0e <https://github.com/weewx/weewx/commit/cb0d0e50362e84dd64a2b6755af40ee6ff8c9662>.

The problem is that browsers default to cp1252 encoding (aka Windows-1252) for text files, *not* UTF-8. So a single character (such as a degree sign) that takes two bytes to encode in UTF-8 (0xC2 0xB0) is instead interpreted as two characters encoded in cp1252: in this case, the character 'Â' and the character '°'.

We could change the encoding of the NOAA files to cp1252 to match the browser's expectations, but then station locations that cannot be encoded in cp1252 would cause an exception. This is unacceptable.

In the end, I decided to go with a "normalized ascii" encoding as the default for NOAA reports. This encodes accented characters as their unaccented equivalents. For example, 'ö' will be replaced with 'o'. Unfortunately, that means no degree sign.

If you know that your station location will encode properly in cp1252, then you can change the encoding to that (available with version 4.4.1). In that case, the degree sign will display properly.

The fun world of character encodings!

-tk

On Fri, Feb 5, 2021 at 5:12 AM Tom Keffer <[email protected]> wrote:

> Now I'm getting the same effect as you: the two-byte utf-8 encoding of the
> degree sign is being interpreted as two separate characters encoded in
> Latin-1, that is, the character 'Â' and the character '°'.
>
> This is if using a webserver. If you simply display the page as a file, it
> displays correctly.
>
> Clearly a simple change of the encoding is not going to work. Let me work
> on this a bit.
>
> Issue #646 <https://github.com/weewx/weewx/issues/646>.
>
> -tk
>
> On Fri, Feb 5, 2021 at 4:33 AM [email protected] <[email protected]> wrote:
>
>> http://grattans.org/wx/NOAA/NOAA-2021-02.txt
>>
>> On Thursday, February 4, 2021 at 6:55:52 PM UTC-5 [email protected] wrote:
>>
>>> Is the page on the web? If not, can you post it?
>>>
>>> On Thu, Feb 4, 2021 at 3:26 PM [email protected] <[email protected]> wrote:
>>>
>>>> Tom,
>>>> It shows up in Edge, Chrome, and Firefox.
>>>> All I changed was the "encoding = utf8" for the NOAA reports
>>>> (CheetahGenerator in skin.conf) as was discussed in this thread.
>>>> I don't have any special characters in use but now see the degree
>>>> symbol along with the other odd addition (Â°F).
>>>> Thanks.
>>>> BG
>>>> See http://grattans.org/wx under the month of Feb. It's not showing
>>>> in Jan as that report was created before my change to utf8.
>>>>
>>>> On Thursday, February 4, 2021 at 5:37:23 PM UTC-5 [email protected]
>>>> wrote:
>>>>
>>>>> Are you seeing this in a browser? If so, this is generally caused by
>>>>> the browser using the wrong encoding. For example, it's probably
>>>>> expecting Latin-1 (or Windows-1252) instead of the actual utf-8.
>>>>>
>>>>> As for why, did you add anything else to your NOAA page? Such as an
>>>>> encoding directive?
>>>>>
>>>>> If no encoding directive, then which browser are you using? Chrome
>>>>> autodetects the encoding, and generally gets it right. For others,
>>>>> you must set the encoding manually.
>>>>>
>>>>> -tk
>>>>>
>>>>> On Thu, Feb 4, 2021 at 2:15 PM [email protected] <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Gary,
>>>>>> My upgrade to 4.4.0 was not a new install, so I need to change
>>>>>> manually to utf8. I made changes in skin.conf but must have missed
>>>>>> something. My NOAA daily now shows:
>>>>>> TEMPERATURE (Â°F), RAIN (in), WIND SPEED (mph)
>>>>>>
>>>>>> The old form with html_entities encoding was:
>>>>>> TEMPERATURE (F), RAIN (in), WIND SPEED (mph)
>>>>>>
>>>>>> Is there something I need to delete and rebuild?
>>>>>>
>>>>>> Thanks
>>>>>> BG
>>>>>>
>>>>>> On Wednesday, February 3, 2021 at 1:53:47 AM UTC-5 gjr80 wrote:
>>>>>>
>>>>>>> Just by way of background, up until v4.3.0 WeeWX shipped with
>>>>>>> strict_ascii encoding for NOAA reports. v4.4.0 changed that to utf8,
>>>>>>> but for new installs only; upgrades of earlier versions need to be
>>>>>>> changed manually.
>>>>>>>
>>>>>>> Gary
>>>>>>>
>>>>>>> On Wednesday, 3 February 2021 at 16:27:31 UTC+10 [email protected]
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Gary,
>>>>>>>>
>>>>>>>> Thanks again. Works now.
>>>>>>>> [image: DustPM10.png]
>>>>>>>>
>>>>>>>> On Wednesday, February 3, 2021 at 7:18:43 AM UTC+1 Calo Geyer wrote:
>>>>>>>>
>>>>>>>>> Hi Gary,
>>>>>>>>>
>>>>>>>>> Thanks. Yes, you are right. I have this in my skin.conf, and it is
>>>>>>>>> even written in the explanation. Will change and report back.
>>>>>>>>>
>>>>>>>>> [CheetahGenerator]
>>>>>>>>>
>>>>>>>>>     # Possible encodings are 'html_entities', 'utf8', or 'strict_ascii'
>>>>>>>>>     encoding = html_entities
>>>>>>>>>
>>>>>>>>>     [[SummaryByMonth]]
>>>>>>>>>         # Reports that summarize "by month"
>>>>>>>>>         [[[NOAA_month]]]
>>>>>>>>>             encoding = strict_ascii
>>>>>>>>>             template = NOAA/NOAA-%Y-%m.txt.tmpl
>>>>>>>>>
>>>>>>>>>     [[SummaryByYear]]
>>>>>>>>>         # Reports that summarize "by year"
>>>>>>>>>         [[[NOAA_year]]]
>>>>>>>>>             encoding = strict_ascii
>>>>>>>>>             template = NOAA/NOAA-%Y.txt.tmpl
>>>>>>>>>
>>>>>>>>> On Wednesday, February 3, 2021 at 7:11:51 AM UTC+1 gjr80 wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> What encoding are you using for the NOAA format reports in your
>>>>>>>>>> skin.conf? If you have the old encoding = strict_ascii then no, you
>>>>>>>>>> won't see the mu character. You should see it, though, if you have
>>>>>>>>>> encoding = utf8.
>>>>>>>>>>
>>>>>>>>>> Gary
>>>>>>>>>>
>>>>>>>>>> On Wednesday, 3 February 2021 at 16:00:13 UTC+10
>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, I added my SDS011 data (which I read in using file-pile);
>>>>>>>>>>> however, the unit in the header is not shown as microgram per cubic
>>>>>>>>>>> meter.
>>>>>>>>>>> I changed the unit in units.py to see if it is also adapted in the
>>>>>>>>>>> report, and that worked; however, using the default pm10_0 unit
>>>>>>>>>>> shows as g/m
>>>>>>>>>>> This is my entry in .tmpl
>>>>>>>>>>> DUST PM10 ($unit.label.SDS011_PM10.strip())
>>>>>>>>>>>
>>>>>>>>>>> [image: NOAA Dust PM10.png]
>>>>>>>>>>>
>>>>>>>>>>> Any idea?
>>>>>>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "weewx-user" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/weewx-user/e8c77f07-a140-4cd0-a5c4-23d80dc182f0n%40googlegroups.com
>>>>>> .
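
P.S. For anyone who wants to see the effect described at the top of this message without setting up a web server, here is a minimal Python sketch using only the standard library. The function names (misread_as_cp1252, can_encode, normalize_to_ascii) are made up for illustration and are not WeeWX functions. The NFD-plus-ASCII approach is just one plausible way to get a "normalized ascii" result; I am assuming it for the sake of the example, not claiming it is what WeeWX does internally.

```python
import unicodedata

DEGREE = "\u00b0"  # the degree sign


def misread_as_cp1252(text):
    """Encode text as UTF-8, then decode the bytes as cp1252.

    This mimics a browser that assumes Windows-1252 for a file that is
    really UTF-8. Bytes that are undefined in cp1252 become U+FFFD.
    """
    return text.encode("utf-8").decode("cp1252", errors="replace")


def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def normalize_to_ascii(text):
    """Hypothetical 'normalized ascii': strip accents, drop what is left over.

    NFD splits an accented letter into base letter + combining mark; the
    ASCII encoder with errors='ignore' then discards every non-ASCII
    code point, including the combining marks and the degree sign.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # The degree sign is two bytes in UTF-8 ...
    print(DEGREE.encode("utf-8"))
    # ... which cp1252 reads as two separate characters.
    print(misread_as_cp1252("TEMPERATURE (\u00b0F)"))
    # The degree sign itself exists in cp1252, but some place names do not.
    print(can_encode(DEGREE, "cp1252"))
    print(can_encode("\u0141\u00f3d\u017a", "cp1252"))  # a Polish city name
    # Accents are stripped; the degree sign disappears entirely.
    print(normalize_to_ascii("K\u00f6ln"))
    print(normalize_to_ascii("TEMPERATURE (\u00b0F)"))
```

The second print is the mangled "(Â°F)" header reported earlier in the thread, and the last one is the plain "(F)" you get once the degree sign is dropped.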
