The URL encoding of a character consists of a "%" symbol followed by the
two-digit hexadecimal representation (case-insensitive) of the character's
ISO-Latin code point.

Vertical Bar/Pipe ("|") = decimal code point 124 in the ISO-Latin
set.

124 decimal = 7C in hexadecimal = Unicode code point U+007C

The vertical bar/pipe ("|") code point in hex is 7C, so its URL-encoded
representation is "%7C".
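The rule above can be sketched in a few lines of Python. This is only an illustration: percent_encode_char is a hypothetical helper name, while quote and unquote are the standard library functions from urllib.parse.

```python
from urllib.parse import quote, unquote

def percent_encode_char(ch: str) -> str:
    """Return "%" plus the two-digit hex code point of a single-byte character."""
    code_point = ord(ch)
    if code_point > 0xFF:
        # Assumption: this sketch only covers the single-byte ISO-Latin range.
        raise ValueError("only single-byte (ISO-Latin) characters are handled")
    return "%{:02X}".format(code_point)

print(ord("|"))                   # 124
print(percent_encode_char("|"))   # %7C
print(quote("|", safe=""))        # %7C (standard library equivalent)
print(unquote("%7c"))             # |   (hex digits are case-insensitive)
```

Decoding "%7C" or "%7c" gives back the same "|" character, so both forms name the same URL.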



Unsafe Characters:
Significant sequences of spaces (especially multiple spaces) may be lost
in some uses:
Space code point (hex) is 20

The following characters are often used to delimit URLs in plain text:
Quotation mark ('"') code point (hex) is 22
'Less Than' symbol ("<") code point (hex) is 3C
'Greater Than' symbol (">") code point (hex) is 3E

This character is used in URLs to indicate where a fragment identifier
(a bookmark/anchor in HTML) begins:
'Pound' character ("#") code point (hex) is 23

This character is used to URL-encode/escape other characters, so it
must itself be encoded:
Percent character ("%") code point (hex) is 25

Some systems may modify these characters:
Vertical Bar/Pipe ("|") code point (hex) is 7C
Left Curly Brace ("{") code point (hex) is 7B
Right Curly Brace ("}") code point (hex) is 7D
Backslash ("\") code point (hex) is 5C
Caret ("^") code point (hex) is 5E
Tilde ("~") code point (hex) is 7E
Left Square Bracket ("[") code point (hex) is 5B
Right Square Bracket ("]") code point (hex) is 5D
Grave Accent ("`") code point (hex) is 60
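As a rough sketch, the list above can be turned into a small Python routine that escapes only these unsafe characters. The names UNSAFE and encode_unsafe are hypothetical, and the sample path is the fragment from the URL quoted further down in this thread.

```python
# The unsafe characters listed above: space, quote, <, >, #, %, {, }, |, \, ^, ~, [, ], `
UNSAFE = ' "<>#%{}|\\^~[]`'

def encode_unsafe(text: str) -> str:
    """Replace each unsafe character with "%" plus its two-digit hex code point."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in UNSAFE else ch
        for ch in text
    )

# Print the same character/hex pairs as the list above.
for ch in UNSAFE:
    print(repr(ch), "%{:02X}".format(ord(ch)))

print(encode_unsafe("c_ISRO||INDIAN-SPACE-RESEARCH-ORGANISATION"))
# c_ISRO%7C%7CINDIAN-SPACE-RESEARCH-ORGANISATION
```

Because "%" is itself in the list, running this on text that is already encoded would encode it a second time ("%7C" becomes "%257C"), so it should be applied only to raw, unencoded text.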


Where did this crazy idea start?

The specification for URLs (RFC 1738, Dec. '94) is the source of the
problem: it restricts the characters allowed in URLs to a limited
subset of the US-ASCII character set.

RFC 1738, Dec. '94 says ( http://www.rfc-editor.org/rfc/rfc1738.txt ):

Characters can be unsafe for a number of reasons.  The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems.  The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it.  The character "%" is unsafe because it is used for
encodings of other characters.  Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".

All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.



According to RFC 3987 ( http://www.ietf.org/rfc/rfc3987.txt ), non-ASCII
characters in a URL must be converted to the UTF-8 character encoding
before they are percent-encoded.
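One way to picture the UTF-8 conversion is the following Python sketch. The function name utf8_percent_encode and the sample word are assumptions for illustration only; quote and unquote are the standard library functions, which use UTF-8 by default.

```python
from urllib.parse import quote, unquote

def utf8_percent_encode(text: str) -> str:
    """Percent-encode every non-ASCII character as its UTF-8 byte sequence."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII characters are left untouched in this sketch
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)

word = "caf\u00e9"  # "cafe" with an acute accent on the final letter (U+00E9)
print(utf8_percent_encode(word))   # caf%C3%A9
print(quote(word))                 # caf%C3%A9
print(unquote("caf%C3%A9") == word)  # True
```

Note that the single character U+00E9 becomes two escapes, "%C3%A9", because UTF-8 represents it with two bytes.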

Other resources and some code:
http://www.w3.org/International/O-URL-code.html

On Jun 15, 8:39 am, Ronak <[email protected]> wrote:
> Thanks for the reply,
>
> I don't have any problem using the URL; the problem is only how the
> URL is shown in the browser's address bar.
> Also, I have checked this in Google search, and it has indexed the URL
> with %7c, which is very unusual.
>
> The keywords for the Google search are
> ahmedabad.citizensindia +isro
>
> So please guide me on what I have to do.
>
> Thanks again.
>
> On Jun 15, 9:10 am, ravindra kumar <[email protected]> wrote:
>
> > You should use a URL decode function to get the exact URL.
>
> > On Mon, Jun 14, 2010 at 7:30 PM, Ronak <[email protected]> wrote:
> > > Hi,
>
> > > I am facing a critical problem with how special characters are shown
> > > in Internet Explorer.
>
> > > Let me explain with an example.
>
> > > My URL is
>
> > > http://ahmedabad.citizensindia.com/c_ISRO||INDIAN-SPACE-RESEARCH-ORGANISATION||SPACE-APPLICATION-CE
>
> > > In this URL I have replaced "/" with "||", because I am using URL
> > > rewriting.
>
> > > When I test this URL in Firefox it works perfectly, but when I try it
> > > in Internet Explorer it shows
>
> > > http://ahmedabad.citizensindia.com/c_ISRO%7C%7CINDIAN-SPACE-RESEARCH-...
>
> > > Why did it replace the special characters with encoded characters?
>
> > > I don't understand.
>
> > > I want IE to show users the exact same URL as Firefox.
>
> > > Please tell me the solution as soon as possible.
>
> > > Thanks.
>
> > --
> > Ravindra kumar
> > delhi
