Re: Links with Accents

2007-04-24 Thread Thomas Fernandez
Hello MFPA,

On Mon, 23 Apr 2007 23:13:02 +0100 GMT (24/04/2007, 05:13 +0700 GMT),
MFPA wrote:

M> Dunno. I was not aware accented characters were allowed in URLs.

They are now, since sometime last year.

M> I note that if I cut and paste your link into Firefox, when the
M> page loads the í in the URL in the address bar becomes %C3%AD

Firefox does it correctly; TB doesn't interpret the non-ASCII
characters correctly, if I understand RFC 3986 correctly:

> 2.4.  When to Encode or Decode
>
> [...]
>
> When a URI is dereferenced, the components and subcomponents
> significant to the scheme-specific dereferencing process (if any)
> must be parsed and separated before the percent-encoded octets
> within those components can be safely decoded, as otherwise the
> data may be mistaken for component delimiters. The only exception
> is for percent-encoded octets corresponding to characters in the
> unreserved set, which can be decoded at any time. For example,
> the octet corresponding to the tilde (~) character is often
> encoded as %7E by older URI processing implementations; the %7E
> can be replaced by ~ without changing its interpretation.
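
A minimal sketch of what that RFC paragraph means in practice, using
Python's standard urllib.parse. The example.com URL and the two helper
names are made up for illustration; they are not from the thread or
from any mail client.

```python
from urllib.parse import urlsplit, unquote

def path_segments(url):
    """Split the path on '/' first, then percent-decode each segment."""
    raw_path = urlsplit(url).path  # urlsplit leaves %XX sequences untouched
    return [unquote(seg) for seg in raw_path.split("/") if seg]

def path_segments_decoded_first(url):
    """Decode before splitting: an encoded '/' is mistaken for a delimiter."""
    decoded_path = unquote(urlsplit(url).path)
    return [seg for seg in decoded_path.split("/") if seg]

url = "http://example.com/a%2Fb/c"  # hypothetical URL, %2F is an encoded '/'
print(path_segments(url))                # ['a/b', 'c']
print(path_segments_decoded_first(url))  # ['a', 'b', 'c']

# The tilde is in the unreserved set, so %7E can be decoded at any time.
print(unquote("/%7Euser"))               # /~user
```

The second helper shows the failure the RFC warns about: once %2F has
been decoded, it can no longer be told apart from a real path separator.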

So a URL with umlauts or accents should not be cut off, as TB does now,
but highlighted in full. When you click on it, TB should apply the
corresponding percent-encoding and send the result to the browser. If
the browser can handle the unencoded URL itself (i.e. encode it
correctly), TB has nothing more to do than highlight the whole URL so
that we can send it to the browser. I would still prefer TB to do the
encoding, but failing at the first step, highlighting the whole URL, is
a bug IMHO.
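
To illustrate the highlighting step, here is a rough Python sketch of
two URL matchers. Both patterns are assumptions made up for this
example; how TB actually detects URLs is not known here. The first
accepts only the ASCII characters that RFC 3986 allows, the second
accepts anything up to whitespace or a few common delimiters.

```python
import re

# Hypothetical detection rules, not The Bat!'s real matcher.
ASCII_ONLY = re.compile(r"https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+")
NON_ASCII_OK = re.compile(r'https?://[^\s<>"]+')

def find_url(pattern, text):
    """Return the first URL the pattern finds in text, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None

line = "See http://es.wikipedia.org/wiki/Día_Internacional_del_Libro today"
print(find_url(ASCII_ONLY, line))    # http://es.wikipedia.org/wiki/D
print(find_url(NON_ASCII_OK, line))  # http://es.wikipedia.org/wiki/Día_Internacional_del_Libro
```

The ASCII-only pattern stops right before the í, which matches the
behaviour reported in this thread; the looser pattern keeps the whole
link so that it can be passed on.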

I call it a bug, but had a discussion with Vitalie Vrabie on TBBETA
last year, who disagreed. He said that TB is behaving correctly by not
recognising such URLs. That's why there is no BT entry.

I'm copying back to TBBETA for further discussion.

-- 

Cheers,
Thomas.

Breathe more consciously: inhale, hold your breath, and exhale for ten
minutes each. That tires you out quickly. *
http://thomas.fernandez.hat-gar-keine-homepage.de/

Message reply created with The Bat! 3.99.1
under Windows XP 5.1 Build 2600 Service Pack 2






Current version is 3.98.04 | 'Using TBUDL' information:
http://www.silverstones.com/thebat/TBUDLInfo.html


Links with Accents

2007-04-23 Thread Chris W.

The following link stops at the 'í' in my version:
http://es.wikipedia.org/wiki/Día_Internacional_del_Libro

Can anyone else confirm?
A bug?

-- 
Chris

Using The Bat! v3.98.4 on Windows Vista 6.0 Build 6000.
Accessing a POP3 mailbox.

Today's Oxymoron: Peace force


Re: Links with Accents

2007-04-23 Thread Urban
Monday, April 23, 2007, Chris W. wrote:

> The following link stops at the 'í' in my version:
> http://es.wikipedia.org/wiki/Día_Internacional_del_Libro

> Can anyone else confirm?

Yes

> A bug?

Maybe it's more a case of TB following the standard too strictly.

Some characters in URLs need to be encoded (there is a very nice list
at
http://users.easystreet.com/ovid/cgi_course/appendices/appendix2.html).
But of course, humans should never have to be bothered with such
trivialities. The program should do the conversion automagically.

-- 
Urban

The moon is a planet just like the earth, only it is even deader.







Re: Links with Accents

2007-04-23 Thread MFPA
Hi

On Monday 23 April 2007 at 10:01:50 PM, in
mid:[EMAIL PROTECTED], Chris W. wrote:

> The following link stops at the 'í' in my version:
> http://es.wikipedia.org/wiki/Día_Internacional_del_Libro

> Can anyone else confirm?

Here, it stops after the D, immediately before the í. Is that
the same as you?

> A bug?

Dunno. I was not aware accented characters were allowed in URLs.

I note that if I cut and paste your link into Firefox, when the
page loads the í in the URL in the address bar becomes %C3%AD
- so the URL reads as
http://es.wikipedia.org/wiki/D%C3%ADa_Internacional_del_Libro
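
The %C3%AD is simply the two UTF-8 bytes of í written in percent
notation. A small Python sketch that reproduces the conversion with the
standard urllib.parse module; the helper name percent_encode_path is
made up for this example, and it assumes the path does not already
contain percent-encoded octets (a literal % would be encoded again).

```python
from urllib.parse import quote, urlsplit, urlunsplit

def percent_encode_path(url):
    """Percent-encode the non-ASCII characters in the path of a URL."""
    parts = urlsplit(url)
    # quote() encodes as UTF-8 by default and leaves '/' untouched.
    return urlunsplit(parts._replace(path=quote(parts.path)))

print("í".encode("utf-8"))  # b'\xc3\xad'
print(percent_encode_path(
    "http://es.wikipedia.org/wiki/Día_Internacional_del_Libro"))
# http://es.wikipedia.org/wiki/D%C3%ADa_Internacional_del_Libro
```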

-- 
Best regards,
 
MFPA

Consistency is the last refuge of the unimaginative

Using The Bat! v3.80.06 on Windows XP 5.1 Build 2600 Service Pack 1 





Re: Links with Accents

2007-04-23 Thread Jernej Simončič
On Monday, April 23, 2007, 23:49:55, Urban wrote:

> Some characters in URLs need to be encoded

You can't really encode URLs like http://čšž.ena.si/ (well, you can
write the link as http://xn--bea2o7c.ena.si/, but that makes it
completely unreadable).
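
For the host name part, the conversion is Punycode rather than
percent-encoding. A minimal Python sketch using the built-in "idna"
codec, which implements the older IDNA 2003 rules (RFC 3490); a
program could keep the readable form for display and use the ASCII
form only when contacting the server.

```python
host = "čšž.ena.si"

# Each non-ASCII label is converted to an "xn--" Punycode label.
ascii_host = host.encode("idna").decode("ascii")
print(ascii_host)  # xn--bea2o7c.ena.si

# The conversion is reversible, so the readable form can be shown again.
print(ascii_host.encode("ascii").decode("idna"))  # čšž.ena.si
```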

-- 
 Jernej Simončič  http://deepthought.ena.si/ 

If you assign N persons to write a compiler you'll get a N-1 pass compiler.
   -- Conway's Law #1


