Hi Harvey,

Regex is probably not the best tool for fixing HTML. A dedicated
cleaner such as HTML Tidy will probably be a better solution.

Looking at your regex, a few comments:

- Do you really need \s (which matches a space, tab, carriage return
or newline, among others), or will a plain space suffice?
- The pattern inside the capturing parentheses could probably be
simplified to something like: .*?
-- NOTE: you would wrap that pattern in capturing parentheses and put
a trailing space after the closing parenthesis, so the lazy match
stops at the first space

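It's hard to write a regex in an email, but maybe something like this
(untested):

src *= *(.*?) 

NOTE: there is a trailing space in the regex, which is what stops the
lazy .*? at the end of the value. It also means a value that runs
straight into the closing > won't match. The replacement string has to
put back the attribute name and the trailing space, so it would be
something like this (untested again, and again with a trailing space):

src="$1" 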

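If you do stay with regex, here is a rough sketch of one way to handle
every attribute in a single pass, so nothing has to be run twice. It is
written in Python only to keep it short, and I'd expect the same
expressions to carry over to preg_replace_callback in PHP, though I
haven't checked that. The names TAG, ATTR and quote_attributes are made
up for this example. The idea is to match already-quoted values as well
and leave them alone, so that something like id=3 inside a quoted URL
is never touched.

```python
import re

# Hypothetical names for illustration only.
# TAG finds one HTML tag at a time, so text outside tags is never touched.
TAG = re.compile(r"<[^>]+>")

# ATTR matches name=value where the value is double-quoted, single-quoted,
# or bare. Only the bare form is captured in group 2.
ATTR = re.compile(r"""([a-zA-Z-]+)\s*=\s*(?:"[^"]*"|'[^']*'|([^\s"'>]+))""")


def quote_attributes(html):
    """Wrap bare attribute values in double quotes, leaving quoted ones alone."""

    def fix_attr(match):
        if match.group(2) is None:
            # The value was already quoted, so return it unchanged.
            return match.group(0)
        return '{}="{}"'.format(match.group(1), match.group(2))

    def fix_tag(match):
        return ATTR.sub(fix_attr, match.group(0))

    return TAG.sub(fix_tag, html)


print(quote_attributes("<img src=filename.jpg width=100 height=100>"))
# prints: <img src="filename.jpg" width="100" height="100">
```

This assumes reasonably tame markup: a bare value directly followed by
/> would swallow the slash, and a > inside a quoted value, a comment or
a script block would confuse the tag matcher. That is the sort of thing
HTML Tidy handles properly.
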
Hope this helps.

On Jun 28, 5:50 pm, [email protected] wrote:
> Thanks for the replies everyone. My mail is with Webdrive so I lost
> email shortly after posting this request, so I couldn't check replies or
> reply myself any sooner. I managed to find my own solution in the meantime.
>
> In this case, I only really cared about missing src attributes in img
> tags, so this is what I came up with.
>
> src\s*=\s*([/a-zA-z0-9].*?)(>|( [a-z]+)=)
>
> Which needs to be run at least twice to clean all attributes in a tag.
>
> Thanks,
>
> Harvey.
>
> On 28/06/2011 10:24 a.m., Matthew Whyte wrote:
>
> > Hi Harvey,
> > I don't have a regex handy, but from memory the last time I needed to
> > do something similar I used the "clean up HTML" option in Dreamweaver,
> > which did the trick. (I don't use Dreamweaver for anything else, I've
> > only got it because it came part of the Adobe Suite!)
>
> > Cheers,
>
> > Matthew Whyte
>
> > Managing Director | digiCreative
> > T +64 7 959 8230
> > F +64 7 974 9059
> > E [email protected] <mailto:[email protected]>
> > W digicreative.co.nz <http://digicreative.co.nz/>
>
> > digiCreative
> > 5 King St | PO Box 19492, Hamilton, New Zealand
>
> > ------------------------------------------------------------------------
>
> > The content of this email is confidential and may be legally
> > privileged.  If it is not intended for you, please email the sender
> > immediately and destroy the original message.
>
> > On Tue, Jun 28, 2011 at 10:17 AM, <[email protected]
> > <mailto:[email protected]>> wrote:
>
> >     Hi All,
>
> >     I need to fix up some sloppy HTML which is (in some cases) missing
> >     quotes around the HTML attributes.
>
> >     eg <img src=filename.jpg width=100 height=100>
>
> >     Does anyone have a tested regex sitting in their collection for
> >     adding back in those missing quotes?
>
> >     Thanks,
>
> >     Harvey.
>
> >     --
> >     Harvey Kane
>
> >     Phone:
> >     - Auckland: +64 9 950 4133
> >     - Wanaka: +64 3 746 8133
> >     - Mobile: +64 21 811 951
>
> >     Email: [email protected] <mailto:[email protected]>
> >     If you need to contact me urgently, please read my email policy
> >     www.ragepank.com/email/ <http://www.ragepank.com/email/>
>
> >     --
> >     NZ PHP Users Group: http://groups.google.com/group/nzphpug
> >     To post, send email to [email protected]
> >     <mailto:[email protected]>
> >     To unsubscribe, send email to
> >     [email protected]
> >     <mailto:nzphpug%[email protected]>
>
> --
> Harvey Kane
>
> Phone:
> - Auckland: +64 9 950 4133
> - Wanaka: +64 3 746 8133
> - Mobile: +64 21 811 951
>
> Email: [email protected]
> If you need to contact me urgently, please read my email policy
> www.ragepank.com/email/

-- 
NZ PHP Users Group: http://groups.google.com/group/nzphpug
To post, send email to [email protected]
To unsubscribe, send email to
[email protected]