Guess that kinda got chopped up huh... I'll send you the .cfm offlist
sorry
----- Original Message -----
From: "Ewok" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Saturday, July 19, 2003 11:03 PM
Subject: Re: deleting duplicates in table
> here is a custom tag i wrote a while back to do just what you want i
believe
> if anyone wants to workaround the evalute()'s feel free and let me see : )
>
> someone i know had a huge DB table full of emails with duplicates and this
> is what i came up with to help... then i said set all email fiields to NO
> DUPLICATES when you make the table : )
>
>
> <!--- {REQUIRED) The data source to use in the queries --->
> <CFPARAM NAME="ATTRIBUTES.DataSource" DEFAULT="" TYPE="STRING">
>
> <!--- {REQUIRED} The table name to query for duplicates --->
> <CFPARAM NAME="ATTRIBUTES.TableName" DEFAULT="" TYPE="STRING">
>
> <!--- {REQUIRED} The primary key of the table defined in
> ATTRIBUTES.tbl_name" --->
> <CFPARAM NAME="ATTRIBUTES.PrimaryKey" DEFAULT="" TYPE="STRING">
>
> <!--- {REQUIRED} The name of the column to check duplicates for --->
> <CFPARAM NAME="ATTRIBUTES.Column" DEFAULT="" TYPE="STRING">
>
> <!--- {OPTIONAL} To NOT display output to the screen (OF DUPLICATES ONLY)
> set this to NO --->
> <CFPARAM NAME="ATTRIBUTES.DisplayOutput" DEFAULT="Yes" TYPE="STRING">
>
> <!--- {OPTIONAL} Set Mode to "ViewOnly" to see the duplicates and
> information that will be deleted Without Deleting anything --->
> <CFPARAM NAME="ATTRIBUTES.Mode" DEFAULT="Process" TYPE="STRING">
>
>
>
> <!---***********************************************// ERROR CHECKS
> //*******************************************************--->
>
> <!--- If "ATTRIBUTES.Column is undefined, inform the user that it is
> required --->
> <cfif ATTRIBUTES.Column EQ "">
> <cfthrow message="The attribute ''<b>Column</b>'' is required. It is
the
> column you want to check duplicates for.">
> </cfif>
>
> <!--- If "ATTRIBUTES.DisplayOutput is not equal to Yes or No, inform
> the user that these are the only acceptable values --->
> <cfif ATTRIBUTES.DisplayOutput NEQ "YES" AND ATTRIBUTES.DisplayOutput
NEQ
> "No">
> <cfthrow message="''Yes'' and ''No'' are the only acceptable values for
> ''<b>DisplayOutput</b>''. Default is Yes">
> </cfif>
>
> <!--- If ATTRIBUTES.Mode is not equal to "Process" or "ViewOnly", inform
> the user that these are the only acceptable values --->
> <cfif ATTRIBUTES.Mode NEQ "Process" AND ATTRIBUTES.Mode NEQ "ViewOnly">
> <cfthrow message="''Process'' and ''ViewOnly'' are the only acceptable
> values for ''<b>Mode</b>''. Default is Process">
> </cfif>
>
> <!--- If ATTRIBUTES.DataSource was not defined, inform the user --->
> <cfif ATTRIBUTES.DataSource IS "">
> <cfthrow message="The attribute, ''<b>DataSource</b>'' is required and
> must be a valid datasource.">
> </cfif>
>
> <!--- If ATTRIBUTES.TableName was not defined, inform the user --->
> <cfif ATTRIBUTES.TableName IS "">
> <cfthrow message="The attribute, ''<b>TableName</b>'' is required and
> must be a valid table name in the datasource defined in
> ''<b>DataSource</b>''.">
> </cfif>
>
> <!--- If ATTRIBUTES.PrimaryKey was not defined, inform the user --->
> <cfif ATTRIBUTES.PrimaryKey IS "">
> <cfthrow message="The attribute, ''<b>PrimaryKey</b>'' is required and
> must be a the tables Unique Identifier.">
> </cfif>
>
> <!--- If ATTRIBUTES.Mode is ViewOnly but ATTRIBUTES.DisplayOutput is No,
> Inform the user that this accomplishes nothing --->
> <cfif ATTRIBUTES.Mode IS "ViewOnly" AND ATTRIBUTES.DisplayOutput NEQ
> "YES">
> <cfthrow message="Displaying <b>No</b> output in ''<b>ViewOnly</b>''
mode
> will not accomplish anything.">
> </cfif>
>
> <!---*******************************************// END ERROR CHECKS
> //******************************************************--->
>
> <!--- The Final list of Unique identifiers that will be deleted from the
> database will be stored in "FinalList"--->
> <CFPARAM NAME="FinalList" DEFAULT="">
>
> <!--- A temp list to hold the UID's of the current item's duplicates --->
> <CFPARAM NAME="TempList" DEFAULT="">
>
> <!--- Get ALL specified fields and Primary keys to loop over, checking for
> duplicates --->
> <cfquery name="GetAll" datasource="#ATTRIBUTES.DataSource#"
> blockfactor="100">
> SELECT #ATTRIBUTES.PrimaryKey#, #ATTRIBUTES.Column#
> FROM #ATTRIBUTES.TableName#
> ORDER by #ATTRIBUTES.Column#
> </cfquery>
>
> <!--- Select UNIQUE Entries to compare against the first query to get a
> count of extras (This is used for display only)--->
> <cfquery name="GetDistinct" datasource="#ATTRIBUTES.DataSource#"
> blockfactor="50">
> SELECT DISTINCT #ATTRIBUTES.Column#
> FROM #ATTRIBUTES.TableName#
> ORDER BY #ATTRIBUTES.Column#
> </cfquery>
>
>
> <!--- If displayOutput is YES output a list of duplicates (if any) --->
> <cfif ATTRIBUTES.DisplayOutput IS "YES">
>
> <!--- Extra ID's will be the first queries recordcount minus the second
> queries recordcount --->
> <cfset extras = GetAll.RecordCount - GetDistinct.RecordCount>
>
> <cfoutput>
> Before Purge Of Duplicates:<br>
> ------------------------------------------------------<br>
> <b>#GetAll.RecordCount#</b> entries in the database.
> <b>#GetDistinct.RecordCount#</b> are unique <i>(<b>#extras#</b>
> extras)</i><br>
> ------------------------------------------------------<br><br>
> <cfif GetAll.RecordCount EQ GetDistinct.RecordCount>
> <b>There were no duplicates in the database. No data will be deleted.</b>
> </cfif>
>
> </cfoutput>
>
> </cfif>
>
> <!--- loop through grouping by the field chosen to check duplicates
for --->
> <cfoutput query="GetAll" group="#ATTRIBUTES.Column#">
>
> <!--- get duplicate emails --->
> <cfquery name="GetDups" datasource="#ATTRIBUTES.DataSource#"
> blockfactor="50">
> SELECT #ATTRIBUTES.PrimaryKey#, #ATTRIBUTES.Column#
> FROM #ATTRIBUTES.TableName#
> WHERE #ATTRIBUTES.Column# = <cfif NOT
> IsNumeric(ATTRIBUTES.Column)>'</cfif>#evaluate(ATTRIBUTES.COLUMN)#<cfif
NOT
> IsNumeric(ATTRIBUTES.Column)>'</cfif>
> </cfquery>
>
> <!--- if the Information is entered more than once --->
> <cfif GetDups.RecordCount GT 1>
>
> <!--- Set all of the Uniquie identifiers, that are associated with the
> current return, to a temporary list --->
> <cfset TempList =
Evaluate("ValueList(GetDups.#Attributes.PrimaryKey#)")>
>
> <cfif ATTRIBUTES.DisplayOutput IS "YES">
> <b>#Evaluate(ATTRIBUTES.Column)#</b> is in the database
> <b>#getdups.recordcount#</b> times<br><br>
> </cfif>
>
> <!--- Remove ONE of the current Itmes PrimaryKeys from the list, so we
> keep ONE of their entries in the database when we loop over the list to
> delete --->
> <cfset TempList = ListDeleteAt(TempList, 1, ",")>
> </cfif>
>
> <!--- Append the temp list we made during THIS loop to the larger
> collection of Primary Keys if it was not empty--->
> <cfif templist NEQ "">
> <cfset FinalList = listappend(FinalList, TempList, ",")>
>
> <!--- reset the temp list so we can start over --->
> <cfset templist="">
>
> </cfif>
>
> </cfoutput>
>
> <cfif ATTRIBUTES.DisplayOutput IS "YES" AND len(FinalList)>
> <cfoutput>
> <br>
> <b>The Following list is a list of Primary Keys that will need to be
> deleted to rid the table of duplicate entries</b>
> <br><br>
> #FinalList#
> </cfoutput>
> </cfif>
>
> <!--- If ATTRIBUTES.Mode is Process, then delete all entries with A
primary
> key that is found in our final list --->
> <cfif ATTRIBUTES.Mode IS "Process">
>
> <!--- I did not use "IN(List)" so each delete query could be seen if
debug
> output is on --->
> <cfloop list="#FinalList#" index="i">
> <cfquery name="DeleteEntries" datasource="#ATTRIBUTES.DataSource#">
> DELETE
> FROM #ATTRIBUTES.TableName#
> WHERE #ATTRIBUTES.PrimaryKey# = <cfif NOT
IsNumeric(I)>'</cfif>#i#<cfif
> NOT IsNumeric(I)>'</cfif>
> </cfquery>
> </cfloop>
>
> </cfif>
>
>
>
>
>
>
> "Paul Hastings" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > > Hi list, with the following script (below) I am
> > > checking duplicate eMails and I delete
> > > them after checking them.
> >
> > you don't say what db BUT if its sql server (or another db with similar
> > index functionality), i'd just let the db do the hard work:
> > create another table duplicating your original table but put a unique
> index
> > with ignore duplicate key property on your email column.
> >
> > CREATE UNIQUE INDEX emailIDX
> > ON blabla(email)
> > WITH IGNORE_DUP_KEY
> >
> > then simply INSERT/SELECT from the old table into the new one. sql
server
> > will accept the first email, but throw away (ignore) all the duplicate
> ones.
> >
> >
> >
> >
> >
>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
Archives: http://www.houseoffusion.com/cf_lists/index.cfm?forumid=4
Subscription:
http://www.houseoffusion.com/cf_lists/index.cfm?method=subscribe&forumid=4
FAQ: http://www.thenetprofits.co.uk/coldfusion/faq
Your ad could be here. Monies from ads go to support these lists and provide more
resources for the community.
http://www.fusionauthority.com/ads.cfm
Unsubscribe:
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4