Try this. Watch out for line wrapping in the code, and you'll probably have
to change the names to suit your circumstances...

The basic theory is that you run your query, populate a new empty query
with the cleaned-up results from the first one, and then use the new clean
query to index Verity with:

<cfquery name="myRs" datasource="#dsn#">
select myid, mytitle, mytext from mytable
</cfquery>

<cfscript>
// Build an empty query with the same columns, then copy each row across,
// stripping anything that looks like an HTML tag from the text columns.
cleanRs = querynew("myid,mytitle,mytext");
for (i=1; i lte myRs.recordcount; i=i+1) {
        queryaddrow(cleanRs);
        querysetcell(cleanRs, "myid", myRs.myid[i]);
        querysetcell(cleanRs, "mytitle", rereplace(myRs.mytitle[i], "<[^>]*>", "", "ALL"));
        querysetcell(cleanRs, "mytext", rereplace(myRs.mytext[i], "<[^>]*>", "", "ALL"));
}
</cfscript>

<cfindex action="REFRESH"
         collection="#verity#"
         key="myid"
         type="CUSTOM"
         title="mytitle"
         query="cleanRs"
         body="mytext">
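
If you still need to clean a summary that has been cut off part-way through
a tag (the trailing "<IMG" case from earlier in the thread), here is a rough
sketch of a looser strip function. The name stripHTMLLoose is made up for
illustration and is not an existing UDF; the only change from the original
StripHTML is that the pattern also accepts the end of the string in place of
the closing ">". Untested against your data, so treat it as a starting point:

```cfml
<cfscript>
// Hypothetical helper, not part of any library.
// "<[^>]*>" removes a complete tag; "<[^>]*$" removes a tag that was
// opened but never closed before the end of the string.
function stripHTMLLoose(str) {
        return REReplaceNoCase(str, "<[^>]*(>|$)", "", "ALL");
}
</cfscript>

<!--- The dangling "<IMG" at the end is dropped along with any complete tags --->
<cfoutput>#stripHTMLLoose("<b>Religious Education</b> (RE) syllabus.   <IMG")#</cfoutput>
```

One caveat: a literal "<" in the text with no ">" after it (e.g. "a < b")
will cause everything from that "<" to the end of the string to be removed.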

> -----Original Message-----
> From: Dave Phipps [mailto:[EMAIL PROTECTED]
> Sent: 09 June 2003 11:10
> To: [EMAIL PROTECTED]
> Subject: RE: [ cf-dev ] Strip html and img
> 
> 
> Yep, I am using verity and the summary is cutting tags in half.  How do I
> strip the html out at the indexing of the query stage?  By the way the db
> is in MySQL.
> 
> Cheers
> 
> Dave
> 
> At 10:47 6/9/2003 +0100, you wrote:
> >yea, that was what I was getting at.
> >
> >btw, I'm assuming Dave is using verity in this instance as he mentioned
> >summary and tags getting cut off, but its possible he's not using verity at
> >all as he didn't mention it.
> >
> > > -----Original Message-----
> > > From: Tom Smith [mailto:[EMAIL PROTECTED]
> > > Sent: 09 June 2003 10:51
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: [ cf-dev ] Strip html and img
> > >
> > >
> > > Rich,
> > >
> > > > you remember we were discussing using fake queries for verity indexing of
> > > > DBs?  well, when you are writing to the fake query, you might as well also
> > > > strip out any html at the same time...
> > >
> > > HTH
> > >
> > > TOM
> > > ----- Original Message -----
> > > From: "Rich Wild" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Monday, June 09, 2003 10:38 AM
> > > Subject: RE: [ cf-dev ] Strip html and img
> > >
> > >
> > > > yea - I've come across this before.
> > > >
> > > > > Is it possible that you can strip out the HTML *before* it goes into
> > > > > verity?
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Dave Phipps [mailto:[EMAIL PROTECTED]
> > > > > Sent: 09 June 2003 10:44
> > > > > To: [EMAIL PROTECTED]
> > > > > Subject: Re: [ cf-dev ] Strip html and img
> > > > >
> > > > >
> > > > > > The stripHTML function seems to work for the most part but on the odd
> > > > > > occasion I end up with an output like this:
> > > > > >
> > > > > > 20 May 2003 Church leaders propose development of national RE syllabus
> > > > > > The leaders of three Church education bodies - Anglican, Methodist and
> > > > > > Free Church - have together proposed the development of a new national
> > > > > > statutory Religious Education (RE) syllabus.   <IMG
> > > > > >
> > > > > > I think it misses the <IMG because the summary cuts the text at a certain
> > > > > > point and it happens to fall in the middle of the img tag call and so the
> > > > > > strip function is missing it.
> > > > > >
> > > > > > Is there any way to modify the stripHTML function so that it will also
> > > > > > remove anything like, <img or <a href, that doesn't have a closing >?
> > > > >
> > > > > Cheers
> > > > >
> > > > > Dave
> > > > >
> > > > > At 09:15 6/9/2003 +0100, you wrote:
> > > > > > >that should be removing all HTML tags, basically anything string inside
> > > > > > >angle brackets and the angle brackets themselves will be replaced with an
> > > > > > >empty string.
> > > > > >
> > > > > >what happens?
> > > > > >
> > > > > >
> > > > > >>Hi,
> > > > > >>
> > > > > > >>Really enjoyed CF_Europe.  It was good to put some faces to all the
> > > > > > >>names.  Now onto business.
> > > > > > >>
> > > > > > >>I have a verity search which uses an index from a query which grabs some
> > > > > > >>html from a db and then outputs the results.  I am using the following UDF
> > > > > >>
> > > > > >>function StripHTML(str) {
> > > > > >>         return REReplaceNoCase(str,"<[^>]*>","","ALL");
> > > > > >>}
> > > > > >>
> > > > > > >>to strip out the html which seems to be mostly working.  Does anyone know
> > > > > > >>how I could modify the above to also remove any <img src...> tags so that
> > > > > > >>all I get in the results summary is text with no html?
> > > > > >>
> > > > > >>Thanks
> > > > > >>
> > > > > >>Dave
> > > > > >>
> > > > > >>
> > > > > >>--
> > > > > >>** Archive:
> > > > http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
> > > > >>
> > > > >>To unsubscribe, e-mail: 
> [EMAIL PROTECTED]
> > > > >>For additional commands, e-mail: 
> [EMAIL PROTECTED]
> > > > >>For human help, e-mail: [EMAIL PROTECTED]
> > > > >
> > > > >
> > > > >--
> > > > >** Archive:
> >http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
> > > >
> > > >To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > >For additional commands, e-mail: [EMAIL PROTECTED]
> > > >For human help, e-mail: [EMAIL PROTECTED]
> > >
> > >
> > > --
> > > ** Archive: 
http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
> >
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > For human help, e-mail: [EMAIL PROTECTED]
> >
> >
> > --
> > ** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
> >
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > For human help, e-mail: [EMAIL PROTECTED]
>
>
>--
>** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
>
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]
>For human help, e-mail: [EMAIL PROTECTED]
>
>
>--
>** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
>
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]
>For human help, e-mail: [EMAIL PROTECTED]


-- 
** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
For human help, e-mail: [EMAIL PROTECTED]


-- 
** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
For human help, e-mail: [EMAIL PROTECTED]
