New topic: Stripping HTML, just can't get it right :P
<http://forums.realsoftware.com/viewtopic.php?t=38310>

Author: silvs
Post subject: Stripping HTML, just can't get it right :P
Posted: Thu Mar 24, 2011 10:20 pm

It's really starting to drive me up the wall. I have tried the following:

- the MBS plugin
- current examples in this forum
- examples from RBU
- regex (as below, plus some alternates; this seems to be the closest, but some HTML still causes me grief)

Code:
rx.searchPattern = "<script.+?<\s?/script>"
rx.searchPattern = "<style.+?<\s?/style>"
rx.searchPattern = "<.+?>"

A great example URL that pretty much breaks my code every time is any product page from Amazon, e.g. http://www.amazon.com/exec/obidos/ASIN/ ... 2/sofa-20/

The tool is for people to check their own URLs and count keyword density, so I need to be able to analyse the plain-text version of their pages. With the current regex and some other trimming I seem to be able to cover most URLs, but the Amazon example shows that it's not good enough.

This is cross-platform btw, so the Mac textutil is out of the question, unfortunately.

_________________
RS 2011r1 Enterprise
MBP 17" i7: 10.7
AMD2 64bit: Ubuntu 11.04
MBP 15" i7: Windows7
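
For illustration only, here is a rough sketch of the three-pass approach described in the post, written in Python rather than REALbasic so the pattern logic can be checked in isolation. One possible cause of the trouble (an assumption, since the post does not show which regex options are set) is that `.` does not match line breaks by default, so a `<script>` block spanning several lines is never removed as a whole and its body leaks into the text. The function name `strip_html` and the sample markup are hypothetical; comment removal and entity decoding are additions not mentioned in the post.

```python
import html
import re

# Pass 1: whole <script>/<style> blocks and HTML comments.
# DOTALL lets "." cross line breaks; IGNORECASE covers <SCRIPT> etc.
_BLOCKS = re.compile(
    r"<(script|style)\b.*?<\s*/\s*\1\s*>|<!--.*?-->",
    re.IGNORECASE | re.DOTALL,
)
# Pass 2: any remaining tag. [^>] also matches newlines inside a tag.
_TAGS = re.compile(r"<[^>]+>")
# Pass 3: collapse runs of whitespace left behind by the removals.
_SPACE = re.compile(r"\s+")


def strip_html(markup: str) -> str:
    """Return an approximate plain-text version of an HTML string."""
    text = _BLOCKS.sub(" ", markup)
    text = _TAGS.sub(" ", text)
    text = html.unescape(text)  # turn &amp; and similar entities into characters
    return _SPACE.sub(" ", text).strip()


sample = (
    "<html><head><style>\nbody { color: red }\n</style>\n"
    "<script type='text/javascript'>\nvar a = '<b>x</b>';\n</script></head>"
    "<body><!-- note --><p>Red &amp; blue <b>sofa</b></p></body></html>"
)
print(strip_html(sample))
```

In REALbasic the equivalent switch would presumably be the RegEx option for letting `.` match newlines, together with replacing all matches; regex-based stripping remains an approximation and can still trip on malformed markup.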
