Do you want to keep any of the formatting? I ask because your example trims the empty lines before your current paragraph, etc.
On 13 Oct, 09:30, Flintstone <[EMAIL PROTECTED]> wrote:
> I wish it was that easy. A regex was my first thought but it only
> works for my simple example.
>
> You are correct, I should have provided a more detailed example.
>
> Basically, the method should remove all extraneous HTML from a string.
> The resulting HTML string is to be displayed in a div, so trailing
> newlines, br tags, whitespace etc. need to be stripped. If the
> complete string is inside a p tag then that p tag should be removed.
> This extra stuff is added by pretty much any HTML edit control we can
> get our hands on, free or commercial.
>
> So in a nutshell:
>
> input:
> "<p><br> <br /></p>" & vbTab & vbCrLf & " <p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p> <p><br><br /></p> " & vbTab & vbCrLf
>
> output:
> "<p>this is a test</p>" & vbCrLf & " <p>It really is</p>"
>
> logic:
> The first p block should be removed because it contains no real
> textual information.
> Leaving vbTab & vbCrLf & " <p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p> <p><br><br /></p> " & vbTab & vbCrLf
>
> The tabs and linefeeds should be removed because they have no real
> meaning in HTML.
> Leaving " <p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p> <p><br><br /></p> "
>
> Again the whitespace should be trimmed:
> Leaving "<p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p> <p><br><br /></p>"
>
> The last p block should be removed because it contains no real textual
> information.
> Leaving "<p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p> "
>
> Again the whitespace should be trimmed:
> Leaving "<p><p>this is a test</p>" & vbCrLf & " <p>It really is</p> </p>"
>
> Having the whole HTML block inside of p tags in a div is useless so
> these extra p tags should be removed:
> Leaving "<p>this is a test</p>" & vbCrLf & " <p>It really is</p> "
>
> Finally the whitespace should be trimmed:
> Leaving "<p>this is a test</p>" & vbCrLf & " <p>It really is</p>"
>
> I know this must be possible and there is surely an easy way to do it
> but nothing seems to work. It is a much more complex problem than it
> first appears :o(

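If the only formatting you need to keep is what your expected output shows, the steps in your message could be sketched roughly like this. Treat it as a sketch under assumptions rather than a finished solution: the names HtmlCleanupSketch, CleanHtml and UnwrapOuterP are made up for illustration, an "empty" p block is assumed to be an attribute-free <p> holding only whitespace and br tags, and the p tags are paired by simple depth counting rather than by a real HTML parser.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module HtmlCleanupSketch

    ' Junk allowed at either end: whitespace, <br> tags, and <p> blocks
    ' that contain nothing but whitespace and <br> tags (assumption).
    Private Const Junk As String = "(?:\s|<br\s*/?>|<p>(?:\s|<br\s*/?>)*</p>)+"

    Private ReadOnly LeadingJunk As New Regex("^" & Junk, RegexOptions.IgnoreCase)
    Private ReadOnly TrailingJunk As New Regex(Junk & "\z", RegexOptions.IgnoreCase)
    Private ReadOnly PTag As New Regex("<(/?)p\b[^>]*>", RegexOptions.IgnoreCase)

    ' Repeats "strip junk from both ends, then unwrap an outer <p>"
    ' until a pass leaves the string unchanged.
    Public Function CleanHtml(ByVal html As String) As String
        If html Is Nothing Then Return String.Empty

        Dim current As String = html
        Dim previous As String
        Do
            previous = current
            current = LeadingJunk.Replace(current, "")
            current = TrailingJunk.Replace(current, "")
            current = UnwrapOuterP(current)
        Loop Until String.Equals(current, previous)
        Return current
    End Function

    ' Removes the first <p> and the last </p> only when they pair up,
    ' i.e. the whole string sits inside that single p block.
    Private Function UnwrapOuterP(ByVal html As String) As String
        Dim tags As MatchCollection = PTag.Matches(html)
        If tags.Count < 2 OrElse tags(0).Index <> 0 Then Return html

        Dim depth As Integer = 0
        For i As Integer = 0 To tags.Count - 1
            Dim tag As Match = tags(i)
            If tag.Groups(1).Length = 0 Then
                depth += 1   ' opening <p>
            Else
                depth -= 1   ' closing </p>
            End If

            If depth <= 0 Then
                ' The first <p> closes here (or the tags are unbalanced).
                ' Unwrap only if this closing tag also ends the string.
                Dim closesAtEnd As Boolean = (depth = 0) AndAlso _
                    (tag.Index + tag.Length = html.Length)
                If Not closesAtEnd Then Return html

                Dim innerStart As Integer = tags(0).Length
                Return html.Substring(innerStart, tag.Index - innerStart)
            End If
        Next
        Return html
    End Function

    Sub Main()
        ' Input and expected output copied from the quoted message.
        Dim sample As String = "<p><br> <br /></p>" & vbTab & vbCrLf & _
            " <p><p>this is a test</p>" & vbCrLf & _
            " <p>It really is</p> </p> <p><br><br /></p> " & vbTab & vbCrLf
        Dim expected As String = "<p>this is a test</p>" & vbCrLf & " <p>It really is</p>"

        ' Prints True when the cleaned sample equals the expected output.
        Console.WriteLine(String.Equals(CleanHtml(sample), expected))
    End Sub

End Module
```

The loop is there because each step can expose more junk: unwrapping the outer p block leaves a trailing space that only the next pass trims. The depth count keeps something like "<p>a</p><p>b</p>" intact, since its first p block closes before the end of the string. Note that a single wrapped paragraph such as "<p>hello</p>" would be unwrapped to "hello", which is how I read your rule; if that is not what you want, the unwrap step needs an extra check.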