I've been thinking about this problem for a little while. I can think
of ways of doing it, but they're not very nice, and I don't think
they're going to be fast.
Basically, I have a load of HTML-formatted content in a database that
gets displayed on the site. It's part of a rudimentary CMS.
Currently, the titles for each article are displayed on a page, and each
title links to the full article. However, that leaves me with a page
which is essentially a list of links, and that's not ideal for SEO. What
I want to do to enhance the page is to show a short excerpt of x
words/characters beneath each article title. The idea is that search
engines will see the page as more than a link farm, and visitors won't
have to rely on the title alone to judge the content.
Here's the rub, though. As the content is in HTML form, I can't just grab
the first 100 characters and display them, as that could leave an open
tag without a closing one, potentially breaking the page. I could use
strip_tags on the 100-character excerpt, but what if the excerpt itself
broke a tag in half (e.g. <acronym title="something"> could be cut off
partway through the tag, which strip_tags may not handle cleanly)?
The only solutions I can see are:
* retrieve the entire article, perform a strip_tags and then take
  the first 100 characters of the result
* use a regex inside MySQL to pull out only the text
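
To make the first option concrete, here is a rough sketch of what I have in mind. The function name make_excerpt and its parameters are made up for illustration, and it assumes UTF-8 content and that the mbstring extension is available. It strips tags from the whole article before cutting, so a tag can never be split in half, then backs up to the last space so a word isn't split either.

```php
<?php
// Hypothetical helper (not part of any library): build a plain-text
// excerpt from a full HTML article body. Assumes UTF-8 and mbstring.
function make_excerpt(string $html, int $maxChars = 100): string
{
    // Strip tags from the WHOLE article first, so no tag is cut in half.
    $text = strip_tags($html);

    // Decode entities such as &amp; so each counts as one character.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace left over from the original markup.
    $text = trim(preg_replace('/\s+/u', ' ', $text));

    // Short articles need no truncation; just re-escape for output.
    if (mb_strlen($text, 'UTF-8') <= $maxChars) {
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }

    // Cut at the limit, then back up to the last space (if any)
    // so the excerpt does not end in the middle of a word.
    $cut = mb_substr($text, 0, $maxChars, 'UTF-8');
    $lastSpace = mb_strrpos($cut, ' ', 0, 'UTF-8');
    if ($lastSpace !== false) {
        $cut = mb_substr($cut, 0, $lastSpace, 'UTF-8');
    }

    // Re-escape, since the decoded text may contain < or & characters.
    return htmlspecialchars($cut, ENT_QUOTES, 'UTF-8') . '...';
}

// Example: the acronym tag is removed entirely rather than broken.
echo make_excerpt(
    '<p>Hello <acronym title="something">world</acronym> and more text here</p>',
    15
); // prints "Hello world..."
```

One caveat I can already see: strip_tags doesn't insert spaces where tags were, so adjacent block elements like <p>One</p><p>Two</p> come out as "OneTwo". It also still means pulling the full article for every title on the page, which is the speed worry I mentioned, so caching the excerpt when the article is saved might be the sensible way around that.
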
The thing is, neither of these seems particularly pretty, and I am sure
there's a better way, but it's too early in the week for my brain to be
fully functional I think!
Does anyone have any ideas about what I could do, or do you think I'm
seeing problems where there are none?