Which is why regex won't help that much. Using tidy with DOMDocument could work very well:
http://php.net/manual/en/book.tidy.php

Thank You
Chuck Reeves

On Fri, Nov 19, 2010 at 11:27 AM, Yitzchak Schaffer <yitzchak.schaf...@gmx.com> wrote:

> 2010/11/19 Peter Sawczynec <p...@blu-studio.com>
>
> > > I need to be able to grab the first 10 words or so from a HTML laden
> > > text string that is held in a db table field where the typical field
> > > content might be as follows below. After the processing, being left
> > > with only the straight text with punctuation would be fine
>
> On 19-Nov-10 10:06, Chuck Reeves wrote:
>
> > I would recommend using DOMDocument instead of regex.
>
> ... unless of course the HTML is subject to wonkiness.
>
> --
> Yitzchak Schaffer
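
For the wonky-HTML case raised in the thread, here is a rough sketch of the tidy-plus-DOMDocument idea. It is only an illustration under a few assumptions: the function name first_words and the default of 10 words are made up for this example, the input is assumed to be UTF-8, and tidy is used only if the extension happens to be loaded.

```php
<?php
// Hypothetical helper: the name first_words and the default limit are illustrative.
function first_words(string $html, int $limit = 10): string
{
    // Let tidy repair wonky markup first, if the extension is loaded.
    if (function_exists('tidy_repair_string')) {
        $repaired = tidy_repair_string($html, ['show-body-only' => true], 'utf8');
        if (is_string($repaired) && $repaired !== '') {
            $html = $repaired;
        }
    }

    // Collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // Assumption: the fragment is UTF-8; the meta tag tells the HTML parser so.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // textContent drops the tags and keeps only the text nodes.
    $body = $doc->getElementsByTagName('body')->item(0);
    $text = $body === null ? '' : $body->textContent;

    // Split on whitespace and keep the first $limit words, punctuation included.
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return '';
    }
    return implode(' ', array_slice($words, 0, $limit));
}

echo first_words('<p>The <b>quick</b> brown fox, jumps!</p>', 3), "\n"; // The quick brown
```

Note that textContent also returns the contents of script and style elements, so those would need to be removed from the DOM first if the field can contain them.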
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

http://www.nyphp.org/Show-Participation