You're right about URL encoding and forms. However, binary transmissions
have nothing to do with HTML or XML. Binary has to be transmitted via HTTP
under specific MIME headers, so the web server/application won't try to
parse it as text. My rule of thumb is: don't put protected characters in
text content unless you're SURE they won't get parsed incorrectly. One place
where they won't get messed up is inside a <TEXTAREA> block, for example.
Sure, you can use ">" and "<" anywhere you want, provided you don't break
the cardinal rules.
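
To illustrate the point about MIME headers, here is a minimal Python sketch
of a binary body going out under a Content-Type header. The build_response
helper and the payload bytes are made up for this example and are not taken
from any real server. The body is never encoded or escaped; the header is
what tells the receiver not to treat it as text.

```python
def build_response(body: bytes, content_type: str) -> bytes:
    """Assemble a minimal HTTP/1.1 response that carries `body` unmodified.

    Hypothetical helper for illustration only; a real server also sends
    Date, Server and other headers.
    """
    head = [
        "HTTP/1.1 200 OK",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(head).encode("ascii") + body


# Made-up payload: JPEG magic bytes followed by raw "<" and ">" bytes.
payload = b"\xff\xd8\xff\xe0<not markup>\x00\x01"
response = build_response(payload, "image/jpeg")

# Split the header section from the body at the first blank line.
head, _, sent_body = response.partition(b"\r\n\r\n")
```

Here sent_body is byte-for-byte identical to payload, "<" and ">" included.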

Try this on for size:

<HTML>
<blah/>
</HTML>

Then try:

<HTML>
< blah />
</HTML>

and finally:

<HTML>
&lt;blah/&gt;
</HTML>

 According to what I've been reading, all 3 of the above should display the
string "<blah/>". However, option 1 breaks the rules by not having white
space or "!" after "<" to identify that it's not an HTML element. Since
there is a "/" before the ">", the parser thinks it's a self-closing tag. In
this case, you will have to encode the string with either hex values or "&"
representations such as "&lt;" and "&gt;", as in option 3.
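
The three variants above can be checked with a short Python sketch. It
assumes the standard library's html.parser, which is more lenient than a
browser and may differ from one in the details. The Collector class and
parse function are names invented for this example.

```python
from html import escape
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Record which names were parsed as tags and which characters as text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <blah/>.
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def parse(doc):
    """Return (tag names seen, visible text with outer whitespace stripped)."""
    parser = Collector()
    parser.feed(doc)
    parser.close()
    return parser.tags, "".join(parser.text).strip()


option1 = parse("<HTML>\n<blah/>\n</HTML>")
option2 = parse("<HTML>\n< blah />\n</HTML>")
option3 = parse("<HTML>\n&lt;blah/&gt;\n</HTML>")

# Producing the safe form of option 3 from a raw string.
encoded = escape("<blah/>")
```

With this parser, option 1 yields a "blah" tag and no text, option 2 keeps
"< blah />" as text, and option 3 yields the text "<blah/>". The encoded
value is "&lt;blah/&gt;".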

 On a side note, my XML subroutine will only extract tags and content
between matching pairs. "<>" will be included in the content of any
wrapping element, unless there is a matching "</>". Self-closing tags are
handled differently, due to their syntax difference. Is it 100% foolproof?
No, it's not a commercially developed application. It does a good job
though.

<TAG><></TAG> will return <> as the element content for "TAG".
<TAG><></></TAG> will return "TAG" as null and delete the "<>" tag pair.
<TAG value="1"/> will return TAG = 1.
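
The three cases above can be mimicked with a rough Python sketch. This is
not the actual subroutine, only a regex-based guess at the behaviour
described. The extract_tag name is hypothetical, and two things are
assumptions: that "null" means an empty string, and that deleting the "<>"
pair also drops whatever sits between "<>" and "</>".

```python
import re


def extract_tag(doc, tag):
    """Return the content of `tag` in `doc`, or None if the tag is absent.

    Hypothetical sketch: handles only <TAG value="..."/> and <TAG>...</TAG>.
    """
    name = re.escape(tag)

    # Self-closing form: the "value" attribute is treated as the content.
    match = re.search(rf'<{name}\s+value="([^"]*)"\s*/>', doc)
    if match:
        return match.group(1)

    # Paired form: take everything between the matching open and close tags.
    match = re.search(rf"<{name}>(.*?)</{name}>", doc, flags=re.S)
    if match is None:
        return None
    content = match.group(1)

    # Assumption: a complete "<>...</>" pair is deleted along with its body.
    # A lone "<>" has no matching "</>" and so stays in the content.
    return re.sub(r"<>.*?</>", "", content, flags=re.S)


lone = extract_tag("<TAG><></TAG>", "TAG")
paired = extract_tag("<TAG><></></TAG>", "TAG")
closed = extract_tag('<TAG value="1"/>', "TAG")
```

Under those assumptions lone is "<>", paired is the empty string, and closed
is "1".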

Glen
http://picksource.com

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Craig Bennett
> Sent: Thursday, September 16, 2004 7:34 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [U2] [UV] Processing a string
>
>
> Glen,
>
> > Per HTTP 1.0-1.2 specifications, ">" and "<" are not exempt from
> > content encoding requirements. They are protected characters and must
> > be treated as such when sending content. Light bulb going off yet?
> Surely you don't mean the HTTP specifications? (Which the W3 have
> officially closed at HTTP/1.1).
>
> > If you must use a ">" or "<" character as a non-elemental string, in
> > ANY media, transferred through an HTTP 1.0 to 1.2 compliant application
> > then you MUST URL-encode them as &lt;, &gt; or their equiv. charset hex
> > values as %XX;. Comments are an exception to this rule, but you can
> > still have problems with general parsing if you put protected
> > characters in the comments. I always url-encode my non-alpha-numeric
> > strings.
>
> You do not have to URL encode these characters at all, otherwise you
> could never send XML or indeed binary data over HTTP (image/jpeg).
>
> If you are sending a body with a specific content then encoding rules
> will apply, but these are defined by other standards. Perhaps you are
> thinking of the HTML standards for POSTING data using the
> application/x-www-form-urlencoded content type?
>
>
> Craig
> -------
> u2-users mailing list
> [EMAIL PROTECTED]
> To unsubscribe please visit http://listserver.u2ug.org/