I need to process HTML that is not well-formed and that tools like Tidy *cannot* make well-formed, so I decided to apply some regexps to the task.
My structure is HTML with some extra tags that I need to extract, e.g.:

    ...
    <table border="0" cellspacing="0" cellpadding="0" width="100%">
      <tr>
        <my-contenttype name="foo">
          <td>
            <table border="0" cellspacing="0" cellpadding="0" width="100%">
              <tr>
                <my-attribute name="bar">
                  <td class="headline_01">
                    FooBar
                  </td>
                </my-attribute>
              </tr>
            </table>
          </td>
        </my-contenttype>
      </tr>
    <table>
    ...

I need to extract every opening and closing "my-" tag (my-contenttype, my-attribute) and also all text between the my-attribute tags. The result of the regexp should be:

    <my-contenttype name="foo">
      <my-attribute name="bar">
        FooBar
      </my-attribute>
    </my-contenttype>

Can anybody help?
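
To make the requirement concrete, here is a minimal sketch of the extraction in Python. The language, the function name extract_my_markup and the token pattern are all assumptions for illustration, not part of the original setup. It assumes that every extra tag starts with "my-", that attribute values never contain ">", and that my-attribute is never written as a self-closing tag. A single alternation splits the input into "my-" tags, ordinary tags and text; "my-" tags are always kept, ordinary tags are dropped, and text is kept only while inside a my-attribute element.

```python
import re

# One token per match: a my-* tag, any other tag, or a run of text.
TOKEN = re.compile(
    r"(?P<my></?my-[\w-]+[^>]*>)"   # opening or closing my-* tag (kept)
    r"|(?P<tag><[^>]*>)"            # any other tag (dropped)
    r"|(?P<text>[^<]+)"             # text between tags
)


def extract_my_markup(html):
    """Return the my-* tags, plus the text found inside my-attribute, in order."""
    pieces = []
    inside_attribute = 0  # nesting depth of open my-attribute elements
    for match in TOKEN.finditer(html):
        if match.lastgroup == "my":
            tag = match.group("my")
            pieces.append(tag)
            if tag.startswith("<my-attribute"):
                inside_attribute += 1
            elif tag.startswith("</my-attribute"):
                inside_attribute = max(0, inside_attribute - 1)
        elif match.lastgroup == "text" and inside_attribute:
            text = match.group("text").strip()
            if text:  # skip whitespace-only runs between tags
                pieces.append(text)
    return pieces


if __name__ == "__main__":
    sample = (
        '<tr> <my-contenttype name="foo"> <td> <table> <tr> '
        '<my-attribute name="bar"> <td class="headline_01"> FooBar </td> '
        '</my-attribute> </tr> </table> </td> </my-contenttype> </tr>'
    )
    print("\n".join(extract_my_markup(sample)))
```

Because the pattern never tries to pair opening and closing HTML tags, it does not care that the surrounding markup is not well-formed; only the "my-" tags themselves need to be intact.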