[PHP] Re: stripping quotes from urls and images

2002-07-31 Thread Joel Boonstra

> hi guys i now have a problem with urls i need to remove the quotes from both
> href= and src=
>
> so
> <a href="blah"> <img src=""> needs to be <a href=> <img src=> and i can't
> remove quotes from all string matches :|

Someone mentioned this already, but you should really know that if you
remove quotes from your HTML attributes, your HTML will no longer be
forward-compatible with XHTML, which you really should strive for.  What
reason would you have for removing the quotes?  There must be a better
workaround than removing them.

Regarding your question, it really depends on how you can operate on
your HTML.  Where is your HTML?  Is it in a database, string, etc?  Is
it all in one chunk?  That is, do you have stuff like this:

  <a href="some/url"><img src="some/image">This link has "quotes"</img></a>

that can't just have all quotes removed entirely from it?  If you really
need to remove quotes from attributes (and I don't think you do), and
all of your HTML is in one big string, you're looking at a regular
expression.  Read up on them here:

  http://www.php.net/manual/en/pcre.pattern.syntax.php

Yep, that's a lot of reading.  But you should learn about regular
expressions; they'll be useful all over the place.  If you happen to be
on a Linux/Unix/BSD/whatever machine that has man pages and perl
installed, check out `man perlre`.  Or read on-line here:

  http://www.perldoc.com/perl5.6.1/pod/perlre.html

Some things to watch out for: even if you do want to remove quotes
from your attributes, I'm 100% sure you don't want to remove them from
*all* attributes.  Take this, for example:

  <img src="/some/image/" title="this is a pop-up description" />

Removing the quotes from the title attribute will likely break at least
some browsers, if not all.  So your regular expression needs to be able
to handle that gracefully.
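
For what it's worth, here is a minimal sketch of the kind of expression
that could do this, assuming PHP's PCRE functions and HTML held in one
string.  The function name strip_url_attr_quotes() and the set of
characters treated as unsafe are assumptions made for illustration; they
are not from the original question.  It only touches href and src, and it
leaves a value quoted whenever it contains whitespace or another
character that would be misread without quotes:

```php
<?php
// Hypothetical helper: strips the quotes around href and src values only.
// A value stays quoted if it is empty or contains whitespace, a quote,
// '=', '<', '>' or a backtick (an illustrative "unsafe" set, chosen here).
function strip_url_attr_quotes(string $html): string
{
    // $1 = attribute name, $2 = opening quote, $3 = value;
    // \2 requires the closing quote to match the opening one.
    $pattern = '/\b(href|src)\s*=\s*(["\'])([^\s"\'=<>`]+)\2/i';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, '$1=$3', $html) ?? $html;
}

$in = '<img src="/some/image/" title="this is a pop-up description" />';
echo strip_url_attr_quotes($in), "\n";
// prints: <img src=/some/image/ title="this is a pop-up description" />
```

Because title is not in the list, its quotes survive.  Treat this as a
starting point only: the pattern also matches href="..." in ordinary text
outside a tag, and a value ending in / directly before /> would run into
the end of the tag.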

Good luck!

-- 
[ joel boonstra | [EMAIL PROTECTED] ]


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




[PHP] RE: stripping quotes from urls and images

2002-07-31 Thread electroteque

I sort of need a preg example, as I'm not very good at it.  It's for a
WYSIWYG DHTML editor: it reformats those tags if the quotes are there when
I load the content, and stuffs up the code.





Re: [PHP] RE: stripping quotes from urls and images

2002-07-31 Thread Justin French

I'm reminded of a reasonable quote:

"It's easy to write HTML, and impossible to parse it."  Because HTML is *SO*
easy to write, and has so many options, and is easy to screw up AND have
browsers still work, it's totally evil to parse.

You end up with massive regular expressions, or a state engine which reads
the string one character at a time, and can keep track of what's going on,
then make changes.
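
To give a rough idea of the state-engine approach, here is a toy sketch in
PHP.  It is not the engine mentioned in this message, and every name in it
(the S_* constants, unquote_url_attrs()) is made up for illustration.  It
walks the string one character at a time, tracks whether it is in text,
inside a tag, or waiting for an attribute value, and drops the quotes only
around href and src values that contain nothing needing quotes:

```php
<?php
// Toy scanner with three states. Hypothetical names; not a full HTML parser.
const S_TEXT  = 0; // outside any tag
const S_TAG   = 1; // inside <...>, reading names
const S_VALUE = 2; // just saw '=', waiting for the attribute value

function unquote_url_attrs(string $html): string
{
    $out   = '';
    $state = S_TEXT;
    $name  = '';            // most recent word seen inside the tag
    $len   = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $c = $html[$i];

        if ($state === S_TEXT) {
            if ($c === '<') {
                $state = S_TAG;
                $name  = '';
            }
            $out .= $c;
        } elseif ($state === S_TAG) {
            if ($c === '>') {
                $state = S_TEXT;
            } elseif ($c === '=') {
                $state = S_VALUE;
            } elseif (ctype_alnum($c)) {
                // A new word starts when the previous character was not alphanumeric.
                if (!ctype_alnum($html[$i - 1])) {
                    $name = '';
                }
                $name .= $c;
            }
            $out .= $c;
        } else { // S_VALUE
            if (ctype_space($c)) {
                $out .= $c;
            } elseif ($c === '"' || $c === "'") {
                $end = strpos($html, $c, $i + 1);
                if ($end === false) {
                    // Unterminated quote: copy the rest untouched.
                    return $out . substr($html, $i);
                }
                $value = substr($html, $i + 1, $end - $i - 1);
                $isUrl = in_array(strtolower($name), ['href', 'src'], true);
                $safe  = $value !== '' && !preg_match('/[\s"\'=<>`]/', $value);
                $out  .= ($isUrl && $safe) ? $value : $c . $value . $c;
                $i     = $end;   // skip past the closing quote
                $state = S_TAG;
            } else {
                // Unquoted value: hand the character back to the S_TAG branch.
                $state = S_TAG;
                $i--;
            }
        }
    }
    return $out;
}

echo unquote_url_attrs('<a href="some/url">"quoted" text</a>'), "\n";
// prints: <a href=some/url>"quoted" text</a>
```

Unlike a single regular expression, this never touches quotes in the text
between tags, because it only looks for attribute values while it is inside
a tag.  It still ignores comments, scripts and other corner cases.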

I've been working on a state engine for a while, with some success.

Basically, a browser is just a big state engine.


I think you should perhaps look at your problem from another angle... there
are MANY valid ways to write HTML tags... whatever your system does, it
should support user preferences.

<A HREF=foo.php> = valid
<A HREF="foo.php"> = valid
<A HREF='foo.php'> = valid

Plus combinations of single and double quotes for different attributes, etc
etc.
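
As a small illustration of coping with all three forms at once, the sketch
below pulls the href value out of anchor tags whichever quoting style was
used.  It assumes PHP's PCRE functions; extract_hrefs() is a hypothetical
name, and the pattern is deliberately simple (for instance, it gives up on
a tag that has a ">" inside an earlier quoted attribute):

```php
<?php
// Hypothetical helper: returns the href values of <a> tags, accepting
// double-quoted, single-quoted and unquoted attribute values.
function extract_hrefs(string $html): array
{
    // (?| ... ) is a PCRE "branch reset" group: whichever alternative
    // matches, the captured value ends up in group 1.
    $pattern = '/<a\b[^>]*?\bhref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))/i';

    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

$html = '<A HREF=foo.php> <A HREF="foo.php"> <A HREF=\'foo.php\'>';
print_r(extract_hrefs($html));
```

Each of the three tags yields the same value, foo.php, so code further down
the line does not need to care how the author chose to quote it.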

Check out http://www.w3.org/TR/1999/REC-html401-19991224/intro/sgmltut.html


Justin French






RE: [PHP] RE: stripping quotes from urls and images

2002-07-31 Thread electroteque

Hi mate, I sort of just need a regular expression, and only for when loading
it.  I use tidy to reformat it back before it goes into the database.  It's
how it handles tags.  Can anyone help?
