Thanks to all for the suggestions.
Dave
At 11:10 6/9/2003 +0100, you wrote:
Try this. Watch out for line wrap, and you'll probably have to change things to suit your circumstances...
The basic idea is that you run a query, populate a new, empty query with the cleaned-up results of the first one, and then use the clean query to index Verity:
<cfquery name="myRs" datasource="#dsn#">
    select myid, mytitle, mytext
    from mytable
</cfquery>
<cfscript>
    // Build an empty query with the same columns as myRs
    cleanRs = querynew("myid,mytitle,mytext");
    for (i = 1; i lte myRs.recordcount; i = i + 1) {
        queryaddrow(cleanRs);
        querysetcell(cleanRs, "myid", myRs.myid[i]);
        // Strip anything inside angle brackets from the text columns
        querysetcell(cleanRs, "mytitle", rereplace(myRs.mytitle[i], "<[^>]*>", "", "ALL"));
        querysetcell(cleanRs, "mytext", rereplace(myRs.mytext[i], "<[^>]*>", "", "ALL"));
    }
</cfscript>
<cfindex action="REFRESH" collection="#verity#" key="myid" type="CUSTOM" title="mytitle" query="cleanRs" body="mytext">
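For the case raised earlier in the thread, where the summary is cut off in the middle of a tag (a trailing <IMG with no closing >), a variant of the StripHTML UDF along these lines might help. This is only a sketch: the name StripHTMLLoose is made up for illustration, and the pattern assumes that a "<" with no matching ">" before the end of the string is always the start of a truncated tag.

```cfml
<cfscript>
// Hypothetical variant of StripHTML: removes complete tags, plus a
// trailing tag fragment that was cut off before its closing ">".
function StripHTMLLoose(str) {
    // "<[^>]*" matches a "<" and everything up to the next ">";
    // "(>|$)" then accepts either the closing ">" or the end of the string.
    return REReplaceNoCase(str, "<[^>]*(>|$)", "", "ALL");
}
</cfscript>
```

Note that a literal "<" in ordinary text (for example "a < b") with no ">" after it would cause everything from that point to the end of the string to be removed, so this is probably only safe on text you know to be HTML.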
> -----Original Message-----
> From: Dave Phipps [mailto:[EMAIL PROTECTED]
> Sent: 09 June 2003 11:10
> To: [EMAIL PROTECTED]
> Subject: RE: [ cf-dev ] Strip html and img
>
> Yep, I am using verity and the summary is cutting tags in half. How do I
> strip the html out at the indexing of the query stage? By the way the db
> is in MySQL.
>
> Cheers
>
> Dave
>
> At 10:47 6/9/2003 +0100, you wrote:
> >yea, that was what I was getting at.
> >
> >btw, I'm assuming Dave is using verity in this instance as he mentioned
> >summary and tags getting cut off, but its possible he's not using verity at
> >all as he didn't mention it.
> >
> > > -----Original Message-----
> > > From: Tom Smith [mailto:[EMAIL PROTECTED]
> > > Sent: 09 June 2003 10:51
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: [ cf-dev ] Strip html and img
> > >
> > > Rich,
> > >
> > > you remember we were discussing using fake queries for verity indexing of
> > > DBs? well, when you are writing to the fake query, you might as well also
> > > strip out any html at the same time...
> > >
> > > HTH
> > >
> > > TOM
> > >
> > > ----- Original Message -----
> > > From: "Rich Wild" <[EMAIL PROTECTED]>
> > > To: <[EMAIL PROTECTED]>
> > > Sent: Monday, June 09, 2003 10:38 AM
> > > Subject: RE: [ cf-dev ] Strip html and img
> > >
> > > > yea - I've come across this before.
> > > >
> > > > Is it possible that you can strip out the HTML *before* it goes into
> > > > verity?
> > > >
> > > > > -----Original Message-----
> > > > > From: Dave Phipps [mailto:[EMAIL PROTECTED]
> > > > > Sent: 09 June 2003 10:44
> > > > > To: [EMAIL PROTECTED]
> > > > > Subject: Re: [ cf-dev ] Strip html and img
> > > > >
> > > > > The stripHTML function seems to work for the most part but on the odd
> > > > > occasion I end up with an output like this:
> > > > >
> > > > > 20 May 2003 Church leaders propose development of national RE
> > > > > syllabus The leaders of three Church education bodies - Anglican,
> > > > > Methodist and Free Church - have together proposed the development
> > > > > of a new national statutory Religious Education (RE) syllabus. <IMG
> > > > >
> > > > > I think it misses the <IMG because the summary cuts the text at a
> > > > > certain point and it happens to fall in the middle of the img tag
> > > > > call and so the strip function is missing it.
> > > > >
> > > > > Is there any way to modify the stripHTML function so that it will also
> > > > > remove anything like, <img or <a href, that doesn't have a closing >?
> > > > >
> > > > > Cheers
> > > > >
> > > > > Dave
> > > > >
> > > > > At 09:15 6/9/2003 +0100, you wrote:
> > > > > >that should be removing all HTML tags, basically anything string inside
> > > > > >angle brackets and the angle brackets themselves will be replaced with an
> > > > > >empty string.
> > > > > >
> > > > > >what happens?
> > > > > >
> > > > > >>Hi,
> > > > > >>
> > > > > >>Really enjoyed CF_Europe. It was good to put some faces to all the
> > > > > >>names. Now onto business.
> > > > > >>
> > > > > >>I have a verity search which uses an index from a query which grabs some
> > > > > >>html from a db and then outputs the results. I am using the following UDF
> > > > > >>
> > > > > >>function StripHTML(str) {
> > > > > >>    return REReplaceNoCase(str,"<[^>]*>","","ALL");
> > > > > >>}
> > > > > >>
> > > > > >>to strip out the html which seems to be mostly working. Does anyone know
> > > > > >>how I could modify the above to also remove any <img src...> tags so that
> > > > > >>all I get in the results summary is text with no html?
> > > > > >>
> > > > > >>Thanks
> > > > > >>
> > > > > >>Dave
-- ** Archive: http://www.mail-archive.com/dev%40lists.cfdeveloper.co.uk/
To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] For human help, e-mail: [EMAIL PROTECTED]
