I might use Regex instead to search for:

1. Any <p> optionally followed by a whitespace character and mandatorily followed by one more <p>.
2. Any </p> optionally preceded by a whitespace character and mandatorily preceded by one more </p>.

---
(\<p\>\s?(?=\<p\>)|(?<=\</p\>)\s?\</p\>)
---
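As a rough sketch of how the pattern above could be applied in VB.NET (not a full HTML cleaner): the module name RegexSketch and the function name StripRedundantParagraphTags are made up for illustration. The sketch also assumes that \s* is wanted in place of \s?, so that a run of several whitespace characters between the tags is covered, and it drops the backslashes before < and >, which .NET does not need.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexSketch
    ' Assumption: \s* instead of \s? so that any amount of whitespace
    ' between the doubled tags is consumed. The lookahead and lookbehind
    ' keep the inner <p> and </p> out of the match, so only the outer
    ' tag (plus the whitespace) is removed.
    Private ReadOnly RedundantP As New Regex(
        "<p>\s*(?=<p>)|(?<=</p>)\s*</p>", RegexOptions.IgnoreCase)

    ' Hypothetical helper: removes doubled <p> / </p> tags, then trims
    ' leading and trailing whitespace (tabs, CR/LF, spaces).
    Public Function StripRedundantParagraphTags(ByVal s As String) As String
        If s Is Nothing Then Return Nothing
        Return RedundantP.Replace(s, String.Empty).Trim()
    End Function

    Sub Main()
        Dim input As String = vbTab & vbCrLf &
            " <p><p>this is a test</p> <p>It really is</p> </p> " &
            vbTab & vbCrLf
        ' Prints: <p>this is a test</p> <p>It really is</p>
        Console.WriteLine(StripRedundantParagraphTags(input))
    End Sub
End Module
```

Note that this only looks at directly adjacent tags. It does not check that the removed <p> and </p> actually belong to each other, and it does not touch trailing <br /> or &nbsp;, so it may need extending for other inputs.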
In the string "<p> <p>this is a test</p> <p>It really is</p> </p>", it matches the first "<p> " and last " </p>". Is that what you require? If not, please provide a more detailed example. (read 'more text')

On Oct 2, 8:28 pm, Flintstone <[EMAIL PROTECTED]> wrote:
> I am trying to strip useless HTML from a HTML string returned from a
> user control. Basically extra <p> tags, spaces etc. need to be
> removed.
>
> e.g.
> Calling the method with
> vbTab & vbCrLf & " <p><p>this is a test</p> <p>It really is</p> </p> " & vbTab & vbCrLf
> should return "<p>this is a test</p> <p>It really is</p>"
>
> Here is the code I have so far but it breaks a few valid HTML strings
> and strips too many tags. Has anybody managed to successfully do this?
>
> Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo2.Click
>     txtHTMLString.Text = vbTab & vbCrLf & " <p><p>this is a test</p> <p>It really is</p> </p> " & vbTab & vbCrLf
>
>     txtHTMLString.Text = StripExtraneousHTML(txtHTMLString.Text)
> End Sub
>
> Public Function StripExtraneousHTML(ByVal s As String) As String
>     Dim i, skip As Integer
>     Dim flag As Boolean = True
>
>     If s Is Nothing Then Return Nothing
>
>     While flag
>         flag = False
>
>         s = s.Trim()
>
>         Select Case Strings.Right(s, 6)
>             Case "&nbsp;"
>                 flag = True
>                 s = Strings.Left(s, s.Length - 6)
>             Case "<br />"
>                 flag = True
>                 s = Strings.Left(s, s.Length - 6)
>         End Select
>
>         Select Case Strings.Right(s, 4)
>             Case "</p>"
>                 flag = True
>                 skip = 0
>                 For i = s.Length - 7 To 0 Step -1
>                     If s.Substring(i, 4) = "</p>" Then
>                         skip += 1
>                     End If
>
>                     If s.Substring(i, 3) = "<p>" Then
>                         If skip = 0 AndAlso i = 0 Then
>                             s = Strings.Left(s, i) & Strings.Mid(s, i + 4)
>                             Exit For
>                         Else
>                             skip -= 1
>                         End If
>                     End If
>                 Next
>
>                 If i = 0 Then
>                     s = Strings.Left(s, s.Length - 4)
>                 End If
>             Case "<br>"
>                 flag = True
>                 s = Strings.Left(s, s.Length - 4)
>             Case "<br />"
>                 flag = True
>                 s = Strings.Left(s, s.Length - 6)
>         End Select
>
>         If Strings.Right(s, 1) = vbCrLf Then
>             flag = True
>             s = Strings.Left(s, s.Length - 1)
>         End If
>     End While
>
>     s = s.Replace(vbTab, "")
>
>     Return s
> End Function

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "DotNetDevelopment, VB.NET, C# .NET, ADO.NET, ASP.NET, XML, XML Web Services,.NET Remoting" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://cm.megasolutions.net/forums/default.aspx
-~----------~----~----~----~------~----~------~--~---
