Thanks to everyone for drawing our attention to this issue.

A couple of days ago the ticTOCs service moved to a new server where the data 
is stored as UTF-8 (which it wasn't before). We'd forgotten to remove the UTF-8 
conversion in text.php, so we were serving double-encoded content (UTF-8 encoded 
as UTF-8) until our developer fixed it partway through the discussion on 
this list (which started at 5pm our time!).
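
For anyone curious what double encoding looks like in practice, here is a
minimal Python sketch. It is only an illustration, not the code in text.php:
it assumes the stray conversion treated its input as Latin-1 (ISO-8859-1),
and the function names and the sample title are hypothetical choices for the
example.

```python
def double_encode(text: str) -> bytes:
    """Simulate a UTF-8 conversion applied to data that is already UTF-8.

    Assumption: the extra conversion step reads its input as Latin-1.
    """
    utf8_bytes = text.encode("utf-8")        # data as stored on the new server
    misread = utf8_bytes.decode("latin-1")   # each byte taken as one character
    return misread.encode("utf-8")           # ...and converted to UTF-8 again


def repair_double_encoding(data: bytes) -> str:
    """Undo one layer of the (assumed Latin-1) double encoding."""
    once = data.decode("utf-8")
    return once.encode("latin-1").decode("utf-8")


title = "Acta Ortop\u00e9dica Brasileira"    # \u00e9 is e with an acute accent
broken = double_encode(title)

print(broken.decode("utf-8"))                # the accented e shows up as two characters
print(repair_double_encoding(broken) == title)
```

The single accented character becomes four bytes instead of two, which a
client that honours the charset=utf-8 header displays as two stray characters.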

You should find the problem is fixed now.


Terry


Terry Bucknell
Electronic Resources Manager
Sydney Jones Library
University of Liverpool
Chatham St, PO Box 123
Liverpool, L69 3DA, UK
Tel: +44 (0)151 794 2692
Fax: +44 (0)151 794 2681



-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Glen 
Newton
Sent: 21 December 2009 17:52
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Character problems with tictoc

[I realise there was a related 'Character-sets for dummies'[1]
discussion recently] 

I am using tictocs[2] list of journal RSS feeds, and I am getting
gibberish in places for diacritics. Below is an example:

in emacs:
 221    Acta Ortop  dica Brasileira     
http://www.scielo.br/rss.php?pid=1413-7852&lang=en      1413-7852       
in Firefox:
 221    Acta Ortop  dica Brasileira     
http://www.scielo.br/rss.php?pid=1413-7852&lang=en      1413-7852

Note that the emacs view is the same both for a file saved from Firefox
and for a direct download using 'wget'.

Is this something on my end, or are the tictocs people not serving
proper UTF-8? 

The HTTP header from wget claims UTF-8:
> wget -S http://www.tictocs.ac.uk/text.php
> --2009-12-21 12:47:59--  http://www.tictocs.ac.uk/text.php
> Resolving www.tictocs.ac.uk... 130.88.101.131
> Connecting to www.tictocs.ac.uk|130.88.101.131|:80... connected.
> HTTP request sent, awaiting response... 
>   HTTP/1.1 200 OK
>   Date: Mon, 21 Dec 2009 17:42:05 GMT
>   Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.8k PHP/5.3.0 DAV/2
>   X-Powered-By: PHP/5.3.0
>   Content-Type: text/plain; charset=utf-8
>   Connection: close
> Length: unspecified [text/plain]
><....stuff removed>
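
A rough way to check a response like the one above is to compare the declared
charset with what the bytes actually contain. The following Python sketch
makes some assumptions: the helper names are hypothetical, and the heuristic
only catches the case where UTF-8 bytes were misread as Latin-1 and encoded
again.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def looks_double_encoded(body: bytes, charset: str) -> bool:
    """Heuristic: True if the body decodes cleanly, but its text is itself
    UTF-8 bytes that were read as Latin-1 (an assumed failure mode)."""
    try:
        text = body.decode(charset)
    except UnicodeDecodeError:
        return False          # not valid in the declared charset: a different problem
    if text.isascii():
        return False          # pure ASCII cannot show the problem
    try:
        text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False          # ordinary non-ASCII text fails this round trip
    return True


charset = declared_charset("text/plain; charset=utf-8")
good = "Ortop\u00e9dica".encode("utf-8")
bad = good.decode("latin-1").encode("utf-8")   # simulated double encoding

print(charset)
print(looks_double_encoded(good, charset), looks_double_encoded(bad, charset))
```

Both sample bodies are valid UTF-8, so the header alone cannot distinguish
them; only the round-trip test does.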

Can someone confirm whether they are also experiencing this issue?

Thanks,
Glen

[1]https://listserv.nd.edu/cgi-bin/wa?S2=CODE4LIB&q=&s=character-sets+for+dummies&f=&a=&b=
[2]http://www.tictocs.ac.uk/text.php

-- 
Glen Newton | glen.new...@nrc-cnrc.gc.ca
Researcher, Information Science, CISTI Research
& NRC W3C Advisory Committee Representative
http://tinyurl.com/yvchmu
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
http://www.nrc-cnrc.gc.ca/
Institut canadien de l'information scientifique et technique (ICIST) 
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6  
Government of Canada | Gouvernement du Canada   
--
